Closed Proyag closed 4 years ago
https://github.com/bitextor/pdf-extract/blob/9a3258b1b517ad4fd185f1720a37eb0700d67e28/src/pdfextract/PDFExtract.java#L1179-L1190 checks `_hashSentenceJoin` for an entry for every language in `doc.langList`, and complains about the ones it can't find.
The problem is that https://github.com/bitextor/pdf-extract/blob/9a3258b1b517ad4fd185f1720a37eb0700d67e28/src/pdfextract/PDFExtract.java#L1564-L1574, which actually populates `_hashSentenceJoin`, only adds an entry for one language per document.
As a result, for every document, we get "No model for language" warnings for all languages except the most common one, even though the models exist.
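To make the mismatch concrete, here is a minimal, self-contained sketch. It is not the code from `PDFExtract.java`: the class, the method names, the `AVAILABLE_MODELS` set and the string "models" are all hypothetical stand-ins, and it assumes the most common language comes first in the language list. It only illustrates how a check over all languages combined with a loader that handles one language produces spurious warnings, and one possible way to avoid them by loading an entry for every detected language.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SentenceJoinSketch {
    // Hypothetical: languages for which a sentence-join model exists on disk.
    static final Set<String> AVAILABLE_MODELS = Set.of("en", "de", "fr");

    // Mirrors the reported behaviour: only one language per document gets an entry.
    // Assumption: the most common language is the first element of langList.
    static Map<String, String> loadMostCommonOnly(List<String> langList) {
        Map<String, String> models = new HashMap<>();
        if (!langList.isEmpty() && AVAILABLE_MODELS.contains(langList.get(0))) {
            String lang = langList.get(0);
            models.put(lang, "model-" + lang);
        }
        return models;
    }

    // One possible fix: add an entry for every detected language that has a model.
    static Map<String, String> loadAll(List<String> langList) {
        Map<String, String> models = new HashMap<>();
        for (String lang : langList) {
            if (AVAILABLE_MODELS.contains(lang)) {
                models.put(lang, "model-" + lang);
            }
        }
        return models;
    }

    // Mirrors the check: every language in the list without an entry is reported.
    static List<String> missingModels(List<String> langList, Map<String, String> models) {
        List<String> missing = new ArrayList<>();
        for (String lang : langList) {
            if (!models.containsKey(lang)) {
                missing.add(lang);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> langList = List.of("en", "de", "fr");
        // Prints [de, fr]: warnings even though models for both exist.
        System.out.println(missingModels(langList, loadMostCommonOnly(langList)));
        // Prints []: no spurious warnings once every language is loaded.
        System.out.println(missingModels(langList, loadAll(langList)));
    }
}
```

With the second loader, a warning would only fire for a language that genuinely has no model, which is presumably what the check was meant to report.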