Closed (lintool closed this 2 years ago)
Let's take this: https://github.com/castorini/pyserini/blob/master/pyserini/tokenize_json_collection.py
And make sure it works for non-English languages - e.g., XLMR, mBERT, etc.
cc/ @keleog
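
For reference, a minimal sketch of what language-agnostic tokenization of a JSON-lines collection could look like. This is not the code in `tokenize_json_collection.py`; the function name `tokenize_jsonl`, the `contents` field name, and the `docs.jsonl` path are assumptions for illustration. The tokenizer is passed in as a plain callable so that any multilingual tokenizer (mBERT, XLM-R, etc.) can be swapped in.

```python
import json
from typing import Callable, Iterable, Iterator, List


def tokenize_jsonl(lines: Iterable[str],
                   tokenize: Callable[[str], List[str]],
                   field: str = "contents") -> Iterator[str]:
    """Yield JSON lines whose `field` is replaced by space-joined tokens.

    `field` defaults to "contents" (an assumption about the collection layout).
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        doc = json.loads(line)
        doc[field] = " ".join(tokenize(doc[field]))
        # ensure_ascii=False keeps non-English text readable instead of \u escapes
        yield json.dumps(doc, ensure_ascii=False)


# Hypothetical usage with a multilingual tokenizer (requires `transformers`
# and a model download, so it is left commented out here):
#   from transformers import AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
#   with open("docs.jsonl", encoding="utf-8") as f:
#       for out in tokenize_jsonl(f, tok.tokenize):
#           print(out)
```

Swapping `"bert-base-multilingual-cased"` for `"xlm-roberta-base"` should be enough to try XLM-R, since both load through `AutoTokenizer`. The main thing to watch for non-English text is reading and writing with UTF-8 and not escaping non-ASCII characters on output.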
@crystina-z can we close this issue now? seems to have been done?
yea agreed