tomasbm01 closed 7 months ago
Thanks for this contribution.
The first character of the wordlist.tsv file is U+FEFF (a zero-width no-break space, used as a byte order mark). Unfortunately this character gets included with the first word in the list. You'll need to remove that character. The file should have UTF-8 encoding, but with no BOM. Perhaps you only need to open the file and resave it with that format.
It would be helpful to have a description in the .model_info file. Right now it just has the default "generated from template" phrase. The aim is to include a sentence or two of information that will enable a user to find this lexical model when searching.
Let me know if you have any questions.
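In case resaving from an editor does not drop the BOM, here is a minimal sketch of one way to strip it with a script. The function name `strip_utf8_bom` is only illustrative and not part of any repository tooling; the sketch assumes the file is small enough to read into memory and that it is already UTF-8 encoded.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM (the bytes EF BB BF) from the file, if present.

    Returns True if a BOM was found and removed, False otherwise.
    Hypothetical helper for illustration only.
    """
    file_path = Path(path)
    data = file_path.read_bytes()
    if data.startswith(codecs.BOM_UTF8):
        # Write back everything after the three BOM bytes, unchanged.
        file_path.write_bytes(data[len(codecs.BOM_UTF8):])
        return True
    return False
```

Calling `strip_utf8_bom("wordlist.tsv")` from the model's source folder should leave the rest of the file byte-for-byte identical, so tabs and line endings are not touched.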
Thank you. I just fixed the wordlist.tsv and added a description.
The description change looks good as does the removal of the U+FEFF character. Unfortunately the tab character in the first line of wordlist.tsv has now been replaced by two spaces. I think that will prevent that word from being included. Can you replace those spaces with a tab?
Thanks!
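If it helps, the spaces-to-tab repair could be sketched as below. This assumes each entry has the form word, tab, count, and that the damaged line has a run of two or more spaces right before a trailing numeric count; `restore_tab` is a hypothetical name, not an existing tool.

```python
import re

# Two or more spaces immediately before a trailing integer count.
_SPACES_BEFORE_COUNT = re.compile(r" {2,}(\d+)$")


def restore_tab(line):
    """Return the line with a space run before the count replaced by one tab.

    Lines that already contain a tab are returned unchanged.
    Hypothetical helper; assumes a word<TAB>count layout.
    """
    if "\t" in line:
        return line
    return _SPACES_BEFORE_COUNT.sub(r"\t\1", line)
```

Single spaces inside a multi-word entry are left alone, since only a run of two or more spaces before the count is rewritten.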
This pull request is from an external repo and will not automatically be built. The build must still be passed before it can be merged. Ask one of the team members to make a manual build of this PR.