Open zphang opened 4 years ago
Hi @zphang Thanks a lot for pointing out the issues w/ detailed cases!
The .gitignore file made me miss the lib folder. I've just uploaded my modified conll.py file. For the particular error in your case1, there are a very small number of words in some files that have non-integer indexes. So I filtered them out by: https://github.com/JunjieHu/xtreme/blob/develop/third_party/ud-conversion-tools/lib/conll.py#L28
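For anyone who wants to see what this kind of filtering looks like, here is a minimal sketch. It is not the code at the linked line of conll.py. It assumes that "non-integer indexes" means CoNLL-U ID values such as `1-2` (multiword token ranges) or `1.1` (empty nodes), and the function name `keep_integer_id_tokens` is made up for illustration.

```python
import re

# Matches IDs that are plain integers, e.g. "1" or "27".
INTEGER_ID = re.compile(r"[0-9]+")


def keep_integer_id_tokens(conllu_lines):
    """Drop token rows whose first (ID) column is not a plain integer.

    Hypothetical helper: comment lines and blank sentence separators
    are passed through unchanged.
    """
    kept = []
    for line in conllu_lines:
        stripped = line.rstrip("\n")
        if not stripped or stripped.startswith("#"):
            kept.append(stripped)
            continue
        token_id = stripped.split("\t")[0]
        # Rows with IDs like "1-2" or "1.1" are skipped here.
        if INTEGER_ID.fullmatch(token_id):
            kept.append(stripped)
    return kept


sample = [
    "# text = vámonos",
    "1-2\tvámonos\t_\t_\t_\t_\t_\t_\t_\t_",
    "1\tvamos\tir\tVERB\t_\t_\t0\troot\t_\t_",
    "2\tnos\tnosotros\tPRON\t_\t_\t1\tobj\t_\t_",
    "",
]
filtered = keep_integer_id_tokens(sample)
```

In this sample only the `1-2` row is dropped; the comment line, the two integer-ID tokens, and the blank separator are kept.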
That warning appears because the heuristic conversion breaks up the single tree structure of the sentence. Since we are mostly working on the POS tagging task, that should be fine. I also commented out that warning. https://github.com/JunjieHu/xtreme/blob/develop/third_party/ud-conversion-tools/lib/conll.py#L229
If you use my uploaded file, there should not be such errors. I just tested the download script one more time on a fresh machine.
Hi,
I'm currently running the download script for XTREME. I'm running into some issues with the downloading and preprocessing of the UD data, and wanted to check if some of these are an issue with my setup or an issue with the provided code.
ud-conversion-tools
The UD preprocessing uses the file `$REPO/third_party/ud-conversion-tools/conllu_to_conll.py`. However, the script contains the line `from lib.conll import CoNLLReader`, whereas the `lib` folder from `ud-conversion-tools` has not been included in the `$REPO/third_party/ud-conversion-tools` folder. I was able to get around this by separately git cloning from https://github.com/coastalcph/ud-conversion-tools and adding that to my PYTHONPATH.
Case 1.
Case 2.
Case 3.
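For reference, the PYTHONPATH workaround above can also be done from inside Python. This is only a sketch: `add_ud_tools_to_path` is a hypothetical helper, and the checkout location is whatever directory the separate clone of ud-conversion-tools lives in.

```python
import sys
from pathlib import Path


def add_ud_tools_to_path(checkout_dir):
    """Put a ud-conversion-tools checkout on sys.path.

    Hypothetical helper: it only checks that lib/conll.py exists under
    the given directory, then prepends that directory to sys.path so
    that `from lib.conll import CoNLLReader` can be resolved.
    """
    root = Path(checkout_dir).resolve()
    if not (root / "lib" / "conll.py").is_file():
        raise FileNotFoundError(f"lib/conll.py not found under {root}")
    if str(root) not in sys.path:
        sys.path.insert(0, str(root))
    return root
```

After calling it with the path to the clone, the import in `conllu_to_conll.py` should be able to find the `lib` package, in the same way as when the directory is exported via PYTHONPATH.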
Thanks!