Closed by astanin 8 years ago
I don't know how to find the current locale in Haskell and decode the output of getArgs. Probably it is GHC which should handle this.
Probably related GHC bugs:
http://hackage.haskell.org/trac/ghc/ticket/3307 http://hackage.haskell.org/trac/ghc/ticket/3309
I use this patch as a private workaround: http://gist.github.com/423901
Is there a way to make that work with utf8-string 0.3.4? It's currently being shipped in Debian, for instance, and I'd like to be able to be compatible with it if I can.
0.3.4 doesn't have isUTF8Encoded. We can either skip the UTF-8 check and decode anyway (likely to break for those with other locales), or copy isUTF8Encoded from 0.3.5 into twidge under a different name.
isUTF8Encoded: http://hackage.haskell.org/packages/archive/utf8-string/0.3.6/doc/html/src/Codec-Binary-UTF8-String.html#isUTF8Encoded
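For illustration, here is a minimal sketch of the second option: a local UTF-8 check that needs only base, so it would not depend on which utf8-string version is installed. The names decodeUtf8Bytes and decodeArg are hypothetical. This is not the code from the gist or from utf8-string, and it assumes getArgs hands back one Char per raw byte, as described above.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Maybe (fromMaybe)
import System.Environment (getArgs)

-- | Hypothetical helper: read a String that holds one byte per Char as UTF-8.
-- Returns Nothing when the input is not well-formed UTF-8, which includes
-- any String containing a Char above 0xFF.
decodeUtf8Bytes :: String -> Maybe String
decodeUtf8Bytes [] = Just []
decodeUtf8Bytes (c:cs)
  | b < 0x80               = fmap (c :) (decodeUtf8Bytes cs)
  | b >= 0xC2 && b <= 0xDF = multi 1 (b .&. 0x1F) 0x80
  | b >= 0xE0 && b <= 0xEF = multi 2 (b .&. 0x0F) 0x800
  | b >= 0xF0 && b <= 0xF4 = multi 3 (b .&. 0x07) 0x10000
  | otherwise              = Nothing
  where
    b = ord c
    -- n continuation bytes follow the lead byte; minCode rejects overlong forms.
    multi n lead minCode =
      let (conts, rest) = splitAt n cs
          contBytes     = map ord conts
          code          = foldl (\acc x -> (acc `shiftL` 6) .|. (x .&. 0x3F)) lead contBytes
          valid         = length conts == n
                       && all (\x -> x >= 0x80 && x <= 0xBF) contBytes
                       && code >= minCode && code <= 0x10FFFF
                       && not (code >= 0xD800 && code <= 0xDFFF)
      in if valid then fmap (chr code :) (decodeUtf8Bytes rest) else Nothing

-- | Hypothetical helper: decode an argument only if it is valid UTF-8 bytes,
-- otherwise leave it untouched.
decodeArg :: String -> String
decodeArg s = fromMaybe s (decodeUtf8Bytes s)

main :: IO ()
main = do
  args <- fmap (map decodeArg) getArgs
  mapM_ putStrLn args
```

With this, the byte string "espa\195\177ol" becomes "español", while "espa\241ol" (a lone Latin-1 ñ) fails the check and is passed through unchanged. The trade-off is the same as with isUTF8Encoded: text in another encoding that happens to form valid UTF-8 would still be decoded, so this is a heuristic and not a real locale lookup.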
... Spanish accented vowels ("tildes": á é í ó ú) are not shown correctly on twitter.com when posted with the twidge update command.
... the Spanish letter eñe (ñ) is not supported either. The ISO encoding for all these symbols is ISO-8859-15. Hope it helps!
What version of twidge are you using, murrayf? Are you piping the data to twidge on stdin or giving it on the command line?
Hello again John, I'm using version 1.0.2 from the Ubuntu Maverick deb package. These errors come from the update command.
One potential problem is that your system locale is something other than UTF-8. Twitter and twidge are both designed to operate with UTF-8 only.
Can you check on that?
... this is the result of the locale command on my system:
LANG=es_ES.utf8
LC_CTYPE="es_ES.utf8"
LC_NUMERIC="es_ES.utf8"
LC_TIME="es_ES.utf8"
LC_COLLATE="es_ES.utf8"
LC_MONETARY="es_ES.utf8"
LC_MESSAGES="es_ES.utf8"
LC_PAPER="es_ES.utf8"
LC_NAME="es_ES.utf8"
LC_ADDRESS="es_ES.utf8"
LC_TELEPHONE="es_ES.utf8"
LC_MEASUREMENT="es_ES.utf8"
LC_IDENTIFICATION="es_ES.utf8"
LC_ALL=
OK. And are you providing the update as a command-line parameter or on stdin?
Command-line parameter. I'm not sure, but as you can see in the message above, LC_ALL= is empty by default; I don't know whether it has to be that way.
It doesn't post updates with UTF-8 symbols for me either.
A workaround that works for me is to pipe the text to twidge with echo (my locale is UTF-8):
$ echo "Algo en español" | twidge update
I've hit this bug with version 1.1.2.
The piping workaround works, but I've found another problem: if the string contains newlines, it gets truncated.
For example:
TEXT="one
two
three"
echo "$TEXT" | twidge update
just posts "one"
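One plausible explanation, and this is only an assumption since the thread does not show how twidge reads stdin, is that the update text is taken from a single line of input. A small sketch of the difference between keeping the first line and keeping the whole input (firstLine and wholeUpdate are hypothetical names, not twidge functions):

```haskell
-- | What a line-oriented read would keep: everything before the first newline.
firstLine :: String -> String
firstLine = takeWhile (/= '\n')

-- | Keep the whole input and drop only trailing newlines,
-- such as the one that echo appends.
wholeUpdate :: String -> String
wholeUpdate = reverse . dropWhile (== '\n') . reverse

main :: IO ()
main = do
  input <- getContents
  putStrLn ("first line only: " ++ show (firstLine input))
  putStrLn ("whole input:     " ++ show (wholeUpdate input))
```

Fed the three-line example above, firstLine keeps only "one", which matches the reported behaviour, while wholeUpdate keeps all three lines with their inner newlines.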
Did 09c59c20a35ec5f8ccd71596aad68f52fbd82559 not fix this?
In fact, I believe 09c59c2 should have fixed this.
1.0.2 prints Unicode tweets correctly, but corrupts them when sending an update from the command line.
An example: http://twitter.com/jetxee/status/15322966107 Instead of: "... проверка twidge 1.0.2"
Sending an update from stdin works correctly: