Closed GoogleCodeExporter closed 9 years ago
TLD test fixed.
The 1-byte overshoots all came from a missing end test, which is now added.
Thanks for the help. Please recheck with valgrind if that's easy. /dick
Original comment by dsi...@google.com
on 24 Aug 2013 at 7:06
Hi, thanks for working on this.
I'm now seeing a valgrind error at a different spot in getonescriptspan.cc:
==8417== Invalid read of size 1
==8417== at 0x430FEA: CLD2::ScriptScanner::GetOneScriptSpan(CLD2::LangSpan*) (getonescriptspan.cc:1013)
==8417== by 0x43123E: CLD2::ScriptScanner::GetOneScriptSpanLower(CLD2::LangSpan*) (getonescriptspan.cc:1075)
==8417== by 0x429439: CLD2::DetectLanguageSummaryV2(char const*, int, bool, CLD2::CLDHints const*, bool, int, CLD2::Language, CLD2::Language*, int*, double*, std::__1::vector<CLD2::ResultChunk, std::__1::allocator<CLD2::ResultChunk> >*, int*, bool*) (compact_lang_det_impl.cc:1707)
==8417== by 0x42506E: CLD2::DetectLanguageSummary(char const*, int, bool, char const*, int, CLD2::Language, CLD2::Language*, int*, int*, bool*) (compact_lang_det.cc:133)
==8417== by 0x41B188: codulus::GetLanguage(codulus::Slice const&, codulus::Slice const&, CLD2::Language*, int*, int*, bool*) (language_detection.cc:29)
==8417== by 0x4056DC: codulus::main(int, char**) (test_language_detection.cc:41)
==8417== by 0x405EE1: main (test_language_detection.cc:63)
==8417== Address 0x6ac1afd is 0 bytes after a block of size 7,965 alloc'd
==8417== at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8417== by 0x5F9ED9: operator new(unsigned long) (new.cpp:50)
==8417== by 0x40A842: codulus::Arena::AllocateNewBlock(unsigned long) (arena.cc:70)
==8417== by 0x40A7A2: codulus::Arena::AllocateFallback(unsigned long) (arena.cc:36)
==8417== by 0x41584F: codulus::Arena::Allocate(unsigned long) (arena.h:65)
==8417== by 0x41F775: codulus::InterchangeValidUTF8(codulus::Slice const&, codulus::Arena*, unsigned long*) (print.cc:204)
==8417== by 0x40567F: codulus::main(int, char**) (test_language_detection.cc:33)
==8417== by 0x405EE1: main (test_language_detection.cc:63)
==8417==
Maybe the while condition on line 1013 should be:
'while ((0 < take) && (take < byte_length_) && ((next_byte_[take] & 0xc0) == 0x80))'
?
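
To illustrate why the extra `take < byte_length_` test matters, here is a minimal standalone sketch. It assumes the loop is stepping `take` backwards over UTF-8 continuation bytes (`10xxxxxx`); the real CLD2 loop may differ. `BackUpToCharBoundary`, `buf`, and `len` are hypothetical names for illustration and are not part of CLD2.

```cpp
#include <cstddef>

// Hypothetical helper, not CLD2 code.
// Moves `take` back until it no longer points at a UTF-8 continuation
// byte, without ever reading buf[len] (one past the end of the block).
std::size_t BackUpToCharBoundary(const unsigned char* buf,
                                 std::size_t len,
                                 std::size_t take) {
  if (take > len) take = len;  // clamp so every index read below is valid
  // The `take < len` test is the guard proposed in this comment: when
  // take == len there is no byte to inspect, so the loop must not run.
  while (take > 0 && take < len && (buf[take] & 0xC0) == 0x80) {
    --take;  // step back over one continuation byte
  }
  return take;
}
```

Without the `take < len` test, a call with `take == len` would read the byte immediately after the allocation, which would match the "0 bytes after a block" report in the trace above.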
Also, for the TLD hint fix in compact_lang_det_hint_code.cc, might it have an
issue with two-letter TLDs? Our internal fix for this was to change the
strncpy's length argument from 'len' to 'len + 1'.
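
For context on the strncpy concern: `strncpy(dst, src, n)` writes no terminating `'\0'` when `src` has `n` or more characters, so a copy of exactly `len` bytes can leave the destination unterminated. The sketch below shows one defensive pattern. `CopyPrefix` and `kBufSize` are hypothetical, chosen only for illustration, and do not reflect the actual buffer or code in compact_lang_det_hint_code.cc.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Assumed buffer size, for illustration only.
constexpr std::size_t kBufSize = 16;

// Hypothetical helper, not CLD2 code.
// Copies at most `len` bytes of `src` into a fixed-size buffer and
// always terminates the result.
std::string CopyPrefix(const char* src, std::size_t len) {
  char buf[kBufSize];
  if (len > kBufSize - 1) len = kBufSize - 1;  // leave room for '\0'
  std::strncpy(buf, src, len);  // adds no '\0' if strlen(src) >= len
  buf[len] = '\0';              // explicit terminator
  return std::string(buf);
}
```

For example, `CopyPrefix("uk/path", 2)` yields `"uk"`; dropping the explicit terminator would leave `buf` unterminated in that case.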
Original comment by cha...@gmail.com
on 24 Aug 2013 at 9:19
I wasn't fully awake yesterday. Your first suggestion is added; the second is
changed to the actual buffer size, as it should have been all along. /dick
Original comment by dsi...@google.com
on 25 Aug 2013 at 5:17
This looks great and fixes the valgrind errors we are seeing. Thanks!
P.S.
I think it'd be safe to remove the manual null termination at
compact_lang_det_hint_code.cc:1451 now, if you want.
Original comment by cha...@gmail.com
on 26 Aug 2013 at 7:48
Original issue reported on code.google.com by
cha...@gmail.com
on 21 Aug 2013 at 6:04