wenwenyu / PICK-pytorch

Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)
https://arxiv.org/abs/2004.07464
MIT License

How can I use utf-8 encoded data? #64

Closed jorgerodriguezsj closed 3 years ago

jorgerodriguezsj commented 3 years ago

I am using Spanish text, and when I test on my dataset the output does not contain the Ñ character correctly. I have searched through all the code and cannot find the line I would need to change to fix this. Any help?

The output:

Address EGAOA Address ESPAOA

It should be:

Address EGAÑA Address ESPAÑA

Thanks

cuongngm commented 3 years ago

Change the vocabulary in utils/keys.txt so that it covers the characters of your language (here, Ñ), then retrain the model.
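Before retraining, it may help to check which characters in your data are not in the vocabulary. The following is only a rough sketch, not part of the repository: `missing_chars` is a hypothetical helper, and it assumes utils/keys.txt is a UTF-8 text file whose characters make up the vocabulary. It also normalizes text to NFC, since Ñ can be stored either as one code point or as N plus a combining tilde.

```python
import unicodedata
from typing import Iterable, List


def missing_chars(vocab_text: str, samples: Iterable[str]) -> List[str]:
    """Return the non-whitespace characters in `samples` that are absent from `vocab_text`.

    Hypothetical helper: `vocab_text` is assumed to be the raw contents of
    utils/keys.txt, read as UTF-8.
    """
    # Normalize to NFC so "N" + combining tilde is compared as a single "Ñ".
    vocab = set(unicodedata.normalize("NFC", vocab_text))
    seen = set()
    for text in samples:
        for ch in unicodedata.normalize("NFC", text):
            if not ch.isspace():
                seen.add(ch)
    return sorted(seen - vocab)


# Toy vocabulary without Ñ, checked against the values from this issue.
print(missing_chars("ABCDEGOPS", ["EGAÑA", "ESPAÑA"]))  # ['Ñ']
```

In practice you would read the vocabulary with `open("utils/keys.txt", encoding="utf-8").read()`, pass in the transcripts from your own annotation files, add any characters reported as missing to keys.txt, and then retrain.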