nathell / clj-tagsoup

An HTML parser for Clojure.

parse-string encoding issue #2

Open jpalmucci opened 12 years ago

jpalmucci commented 12 years ago

On OS X, parse-string does not work with the default encoding/decoding settings. I've changed it to use UTF-8.

plexus commented 8 years ago

I ran into this as well (on Linux). The problem is that the JVM doesn't necessarily default to UTF-8. You can check this at the REPL:

(System/getProperty "file.encoding")
;;=> "ANSI_X3.4-1968"
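To see why this matters, here is a small sketch that uses only Java interop, not clj-tagsoup itself. It assumes the bytes are produced with the platform default charset somewhere along the way, which this thread suggests but which I have not confirmed in the library source. The string `s` is just an illustrative value.

```clojure
(import 'java.nio.charset.Charset)

;; The charset the JVM falls back to when none is given explicitly.
(println (str (Charset/defaultCharset)))

;; "café", written with an escape so the source encoding does not matter.
(def s "caf\u00e9")

;; No-arg getBytes uses the default charset, so the result depends on
;; how the JVM was started. Under an ASCII default the "é" is lost.
(println (seq (.getBytes s)))

;; Naming the charset gives the same bytes on every platform.
(println (seq (.getBytes s "UTF-8")))
;; prints (99 97 102 -61 -87)
```

If the first and second results differ from the third on your machine, your default charset is not UTF-8.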

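The solution is to set the file.encoding property when the JVM starts (the default charset is read once at startup, so changing the property later from running code is not reliable):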

java -Dfile.encoding=UTF-8 ....

For Leiningen you can set :jvm-opts in project.clj; see the sample project file:

https://github.com/technomancy/leiningen/blob/master/sample.project.clj#L264-L265
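A minimal project.clj along those lines might look like this. The project name, version, and Clojure dependency are placeholders; only the `:jvm-opts` entry is the relevant part.

```clojure
;; Hypothetical project definition; adjust the name and dependencies to your own.
(defproject example-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.8.0"]]
  ;; Passed to the JVM that Leiningen launches for the project,
  ;; equivalent to `java -Dfile.encoding=UTF-8`.
  :jvm-opts ["-Dfile.encoding=UTF-8"])
```

After restarting the REPL, `(System/getProperty "file.encoding")` should report "UTF-8".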