Closed davidfuhry closed 7 years ago
Would like to add that this PR fixes #50. Probably more straightforward solution will be:
con = file(file.path(out.dir, "lda.json"), encoding = "UTF-8")
on.exit(close.connection(con))
cat(json, file = con)
@dselivanov You're right, also seems to be the more consistent way of doing it. I've updated the code. Also I rolled back the changes that I made to the documentation as I'm an idiot and missed that it was auto generated.
looks good to me, @kshirley?
Yep, looks good to me, too! Thanks @davidfuhry and @dselivanov .
A few things before we merge:
use.utf8
(or evenutf8
) over use.utf-8
utf8=TRUE
?@cpsievert I've updated the Version and the NEWS and renamed the argument to utf8
.
As for making it default I couldn't think of a reason why it would be bad (after all the generated html is set to utf-8 encoding anyways) tough the issue itself is pretty specific to windows so I'm not sure if that would have any negative side effects when running on linux or os x.
Hmm, now I'm not totally sure we need to add an explicit argument for this. Couldn't #50 be fixed (without these changes) by doing:
options(encoding = "UTF-8")
Before calling serVis()
?
Just tested it, that actually fixes the issue.
I guess that makes these changes obsolete, or would there be any advantage (apart from independence from the global configuration, if that's an advantage) in having a seperate argument?
It would be more transparent to have, say, an encoding
argument which is passed to file()
, but I'm leaning towards just educating users about that options()
workaround.
There is. Usually global option is not set to utf8. For example in text2vec I got a lot of bug reports related to this issue.
13 июн. 2017 г. 11:24 ПП пользователь "David Fuhry" < notifications@github.com> написал:
Just tested it, that actually fixes the issue.
I guess that makes these changes obsolete, or would there be any advantage (apart from independence from the global configuration, if that's an advantage) in having a seperate argument?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/cpsievert/LDAvis/pull/73#issuecomment-308221875, or mute the thread https://github.com/notifications/unsubscribe-auth/AE4u3cJViimYwq7AUgCTErtxvof_i3WMks5sDuH0gaJpZM4NpOPu .
@dselivanov You're properbly right again, I guess there is a reason why the file
function has an encoding
argument in the first place and doesn't just allways use the global default.
@cpsievert I've changed the argument to work pretty much like the encoding
argument of the file
function. This is the most streamline version there is, i guess.
Thanks @davidfuhry, looks to good to me, ok with you @kshirley?
Yep, looks good to me!
what is the drawback of just setting it to "utf8" by default?
what is the drawback of just setting it to "utf8" by default?
Backwards incompatibility. It could potentially break for others using a different encoding.
Ok, thanks, make sense.
14 июн. 2017 г. 7:01 ПП пользователь "Carson Sievert" < notifications@github.com> написал:
what is the drawback of just setting it to "utf8" by default?
Backwards incompatibility. It could potentially break for others using a different encoding.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/cpsievert/LDAvis/pull/73#issuecomment-308459437, or mute the thread https://github.com/notifications/unsubscribe-auth/AE4u3fsVQxMGBDKki-V5duKRYTUxI72-ks5sD_XRgaJpZM4NpOPu .
When on windows r does not encode created files as utf-8 by default (see this stackoverflow post). This is rather annoying when working with texts that are not in english e.g. german where there are a lot of umlaute ('ö', 'ä' etc.) To work around this i've added the
use.bytes
argument which will makeserVis
usewriteLines
withuseBytes = TRUE
if set. This works around the problem and encodes the json file in utf-8.I've added documentation for this argument in
man/serVis.R
but as i'm fairly new to r librarys i might have missed something.