wenwenyu / PICK-pytorch

Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)

https://arxiv.org/abs/2004.07464

MIT License

560 stars, 193 forks

Test.py output UTF-8 #80

Closed (jorgerodriguezsj closed this issue 3 years ago)

jorgerodriguezsj commented 3 years ago

I am working with Spanish text, and when I run test.py on my dataset the output does not contain the Ñ character correctly. I have searched through all the code and cannot find the line I would need to change to fix this. Any help?

The output:

Address EGAOA Address ESPAOA

It should be:

Address EGAÑA Address ESPAÑA

Thanks

AtulKumar4 commented 3 years ago

@jorgerodriguezsj how did you solve it?

jorgerodriguezsj commented 3 years ago

I solved it by changing the vocabulary in utils/keys.txt (add Ñ, or whatever characters your language needs) and then retraining the model.
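In case it helps anyone else, here is a minimal sketch for finding which characters in your text are missing from the vocabulary before you retrain. It is not part of the PICK code base: the helper names `load_vocab` and `missing_chars` are made up for illustration, and it assumes the vocabulary file is UTF-8 text in which every character counts as one vocabulary entry. Check that assumption against your copy of utils/keys.txt.

```python
import unicodedata
from pathlib import Path


def load_vocab(path):
    """Read a vocabulary file as UTF-8 and return its characters as a set.

    Assumption: every character in the file is one vocabulary entry.
    """
    text = Path(path).read_text(encoding="utf-8")
    text = unicodedata.normalize("NFC", text)
    return set(text) - {"\n", "\r"}


def missing_chars(samples, vocab):
    """Return the sorted non-whitespace characters in samples that are not in vocab."""
    # NFC normalization turns "N" + combining tilde into the single character "Ñ",
    # so both spellings are compared the same way.
    joined = unicodedata.normalize("NFC", "".join(samples))
    return sorted(ch for ch in set(joined) if not ch.isspace() and ch not in vocab)


if __name__ == "__main__":
    # Stand-in vocabulary for the demo; with the real file you would call
    # load_vocab("utils/keys.txt") instead.
    vocab = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
    print(missing_chars(["EGAÑA", "ESPAÑA"], vocab))  # ['Ñ']
```

Any character this reports would need to be added to the vocabulary file, followed by retraining as described above.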

AtulKumar4 commented 3 years ago

Thanks