Closed nicno90 closed 3 years ago
Thanks for reporting this, @nicno90. I will look into it and find out why it is happening.
@R1j1t, it would also be nice if top_n were configurable.
I am also getting similar errors: ##net and ##ER are returned as corrections.
@saheel1115 can you send me the following:
Input sentence:
Output:
Expected:
I will try to fix it by this weekend.
@nicno90 I have fixed this issue in PR #36. For your input, the result is now:
Input: Everyone has to help to fix the problems of society. There has to be more training, more opportunity to bridge the gap between the haves and the have nots.
Output: Everyone has to help to fix the problems of society. There has to be more training, more opportunity to bridge the gap between the have and the havets.
This is the output after fixing the detokenization in my code. Regarding your question about tokens such as ##x, please have a look here: https://github.com/google/sentencepiece
I will release the latest package to PyPI over the weekend. Thanks for pointing out this issue, and please feel free to contribute!
Latest package released on PyPI! Release: https://github.com/R1j1t/contextualSpellCheck/releases/tag/v0.3.3 PyPI link: https://pypi.org/project/contextualSpellCheck/
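For context, ## is the continuation marker used in BERT-style WordPiece vocabularies: a piece that starts with ## attaches to the previous piece without a space. The detokenization step mentioned above can be sketched roughly as follows. `merge_wordpieces` is a hypothetical helper written for illustration, not the code from the PR.

```python
def merge_wordpieces(tokens):
    """Join WordPiece tokens into whole words.

    A token starting with "##" continues the previous token, so it is
    appended without a space. Hypothetical helper, for illustration only.
    """
    words = []
    for token in tokens:
        if token.startswith("##") and words:
            # Continuation piece: glue it onto the word being built.
            words[-1] += token[2:]
        else:
            # Start of a new word (or a stray "##" piece with nothing before it).
            words.append(token)
    return words


pieces = ["the", "have", "not", "##s"]
print(" ".join(merge_wordpieces(pieces)))
```

A stray continuation piece with no preceding token is left unchanged here, so it stays visible instead of being silently rewritten.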
Describe the bug
Words tagged as incorrect are replaced with a word containing hash signs (##).
To Reproduce
Expected behavior
'Everyone has to help to fix the problems of society. There has to be more training, more opportunity to bridge the gap between the have and the have nots.' or 'Everyone has to help to fix the problems of society. There has to be more training, more opportunity to bridge the gap between the have and the have not.'
Version:
Additional information
I checked vocab.txt and found entries that contain ##. What are these entries needed for?
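On the ## entries: in a WordPiece vocabulary they represent word continuations, so a word missing from the vocabulary can be split into a leading piece plus one or more ## pieces. A suggestion taken directly from the vocabulary may therefore look like ##net. The sketch below shows the usual greedy longest-match-first split. The vocabulary is a made-up toy set, not the real vocab.txt, and `wordpiece_tokenize` is a hypothetical name.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one word into WordPiece tokens."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Pieces that are not word-initial carry the "##" prefix.
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits: the whole word maps to the unknown token.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


# Toy vocabulary (an assumption for illustration only).
toy_vocab = {"have", "not", "##s", "##net", "inter"}
print(wordpiece_tokenize("nots", toy_vocab))      # ['not', '##s']
print(wordpiece_tokenize("internet", toy_vocab))  # ['inter', '##net']
```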