Closed by GoogleCodeExporter 8 years ago
Cannot reproduce.
I opened http://drugoi.livejournal.com/3971967.html in Firefox, copied and
pasted all the text into a UTF-8 file, then ran
./compact_lang_det_test_chrome0122_2 should_not_be_unk_chrome_8.utf8
and got
ExtLanguage RUSSIAN(80% 1027p), UKRAINIAN(2% 450p), INDONESIAN(0% 637p), 40/45 KB of non-tag letters, Summary: RUSSIAN
SummaryLanguage RUSSIAN at 0 of 46701 2617us (17 MB/sec), should_not_be_unk_chrome_8.utf8
If you are not getting that result, please rerun in your context, setting
kCLDFlagEcho as the flag value in the call to ExtDetectLanguageSummary and send
me stderr (not post or email, which open the possibility of various
svn/web/mail/browser software changing the exact bytes), or run with flags
kCLDFlagHtml | kCLDFlagCr
and send me stderr, or compare to the attached file of the output that I got.
Is it possible that there is an encoding problem and you are not passing clean
UTF-8 to CLD2?
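
One way to rule out the encoding question above before involving CLD2 at all is to check the captured bytes for strict UTF-8 validity. The following is a minimal sketch using only the Python standard library; the function name find_invalid_utf8 is hypothetical and is not part of CLD2 or Chromium.

```python
from typing import Optional, Tuple


def find_invalid_utf8(data: bytes) -> Optional[Tuple[int, str]]:
    """Return None if data is valid UTF-8.

    Otherwise return (offset, hex) for the first byte sequence that the
    strict UTF-8 decoder rejects.
    """
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as err:
        # err.start / err.end delimit the offending bytes in the buffer.
        return err.start, data[err.start:err.end].hex()
    return None


if __name__ == "__main__":
    import sys

    # Read in binary mode so no layer re-encodes the text on the way in.
    with open(sys.argv[1], "rb") as handle:
        problem = find_invalid_utf8(handle.read())
    print("clean UTF-8" if problem is None else f"invalid bytes at {problem}")
```

If this reports invalid bytes for the buffer handed to the detector, for example Russian text that is actually in a legacy Cyrillic code page, an Unknown result would not be surprising.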
Original comment by dsi...@google.com
on 5 Mar 2014 at 6:26
Seems like we are still using R84. Would this explain the difference?
Original comment by kenjibaheux@chromium.org
on 6 Mar 2014 at 4:19
No, R84 does not explain the difference. Please capture the actual bytes sent to
CLD2. Thanks, /dick
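
To confirm that a captured buffer arrives unchanged after being passed between people, both sides can compare a length, a hash, and the leading bytes. This is a sketch under stated assumptions: the helper name fingerprint is hypothetical, and how the bytes are captured from the caller is left open here.

```python
import hashlib


def fingerprint(data: bytes, preview: int = 16) -> str:
    """Summarize a byte buffer so two copies can be compared exactly."""
    digest = hashlib.sha256(data).hexdigest()
    # Space-separated hex of the first few bytes (bytes.hex(sep) needs Python 3.8+).
    head = data[:preview].hex(" ")
    return f"{len(data)} bytes, sha256={digest}, head={head}"


if __name__ == "__main__":
    import sys

    # Binary mode keeps the exact bytes that were written to the file.
    with open(sys.argv[1], "rb") as handle:
        print(fingerprint(handle.read()))
```

If the two fingerprints differ, something between capture and receipt altered the bytes, and the detection results are not comparable.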
Original comment by dsi...@google.com
on 6 Mar 2014 at 9:54
FWIW, I am planning to roll Chromium to the latest CLD2 in the Very Near(TM)
future.
Original comment by andrewha...@chromium.org
on 11 Mar 2014 at 12:44
Re #4: please try the subject URL http://drugoi.livejournal.com/3971967.html
and send the requested debugging output from #1 if the detected language is
Unknown. /dick
Original comment by dsi...@google.com
on 11 Mar 2014 at 6:33
The current version of Chrome, 38.0.2125.104 (64-bit), detects Russian and
translates correctly. Closing as Fixed.
Original comment by dsi...@google.com
on 23 Oct 2014 at 8:18
Original issue reported on code.google.com by
kenjibaheux@chromium.org
on 5 Mar 2014 at 6:59