Closed — haseeb33 closed this issue 2 years ago
Hi @haseeb33,
Sorry for the late reply. I'm still maintaining the project, but I may not always respond promptly. I updated the code, mainly to integrate the new pre-processing pipeline from OpenNMT v2, so you can now load JSON data directly from disk without first preprocessing it into tensor files. You can find some config examples here using word tokenization (vocab download) and here using RoBERTa subtokens, for both RNN and Transformer. Hope this helps.
Thanks, Rui
The shared vocab is in .pt format, whereas the new implementation requires the vocab in JSON format.
Could you please guide me on how to convert it? Thanks!
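Until the maintainer replies, here is a minimal sketch of one way to convert a loaded vocab into a JSON file. It assumes the `.pt` file deserializes (e.g. via `torch.load`) to something exposing a token-to-index mapping, and that the new pipeline accepts a flat `{token: index}` JSON object — both are assumptions, since the exact vocab object structure and the expected JSON schema depend on the repo; inspect the loaded object and the example configs first.

```python
import json

# Hypothetical starting point: the .pt vocab has already been loaded, e.g.
#   import torch
#   vocab = torch.load("vocab.pt")   # path and attribute names are assumptions
# and exposes a token -> index mapping (often called `stoi` in torchtext-style
# vocabs; inspect your object to find the right attribute).
# Here we simulate that mapping with a small example dict:
vocab_stoi = {"<unk>": 0, "<pad>": 1, "the": 2, "graph": 3}

# Sort by index so the JSON file lists tokens in a stable, index order.
entries = sorted(vocab_stoi.items(), key=lambda kv: kv[1])

# Write the mapping out as a flat {token: index} JSON object.
with open("vocab.json", "w", encoding="utf-8") as f:
    json.dump(dict(entries), f, ensure_ascii=False, indent=2)
```

If the loaded object is not a plain dict, printing `type(vocab)` and `dir(vocab)` usually reveals where the mapping lives before adapting the script.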
I am running into several issues with this.
Any comments from the developers would be highly appreciated.