kermitt2 / grobid

A machine learning software for extracting information from scholarly documents
https://grobid.readthedocs.io
Apache License 2.0

Wrong extraction of the ï #115

Open AlainMonteil opened 8 years ago

AlainMonteil commented 8 years ago

From a PDF, "Loïg" is transformed into:

Lo¨ıglo¨ıg

and is not displayed correctly. Alain

kermitt2 commented 8 years ago

Thanks Alain. If I remember correctly, this is a particular case where the diacritic does not follow the character it modifies but comes before it (in the PDF tree). I'll try to fix it quickly.
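
For readers who hit the same symptom in already-extracted text, here is a minimal post-processing sketch in Python. It is not GROBID's fix (GROBID is a Java project and the actual correction is not shown in this thread). It assumes the extractor emitted a spacing diacritic such as `¨` (U+00A8) immediately before its base letter, with `ı` (U+0131, dotless i) as the base, as in the output reported above. The function name `repair_leading_diacritics` and the mapping tables are hypothetical and deliberately partial.

```python
import re
import unicodedata

# Assumption: a partial, illustrative mapping from spacing diacritics
# to their combining counterparts. Extend as needed.
SPACING_TO_COMBINING = {
    "\u00a8": "\u0308",  # diaeresis
    "\u00b4": "\u0301",  # acute accent
    "\u02c6": "\u0302",  # circumflex
    "\u02dc": "\u0303",  # tilde
}

# Dotless letters that PDF fonts often use under an accent.
DOTLESS_TO_PLAIN = {"\u0131": "i", "\u0237": "j"}

# A spacing diacritic immediately followed by a letter.
_LEADING_MARK = re.compile(
    "([" + "".join(SPACING_TO_COMBINING) + r"])([^\W\d_])"
)


def repair_leading_diacritics(text: str) -> str:
    """Move a diacritic that precedes its base letter onto that letter."""

    def fix(match: re.Match) -> str:
        mark = SPACING_TO_COMBINING[match.group(1)]
        base = DOTLESS_TO_PLAIN.get(match.group(2), match.group(2))
        # Compose base + combining mark into one precomposed character.
        return unicodedata.normalize("NFC", base + mark)

    return _LEADING_MARK.sub(fix, text)


print(repair_leading_diacritics("Lo\u00a8\u0131g"))
```

This only reorders and composes the accent. It does not address the duplicated token in the reported output, and it could misfire on text where a spacing accent legitimately precedes a letter, so treat it as a workaround rather than a substitute for fixing the extraction order.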