Open cleong110 opened 9 months ago
Looked up how to diff two files in Windows, got this:
(sign_language_datasets_source) C:\Users\Colin\projects\sign-language\datasets>FC C:\Users\Colin\projects\sign-language\datasets\sign_language_datasets\datasets\foo_dataset\TAGS.txt C:\Users\Colin\projects\sign-language\datasets\sign_language_datasets\datasets\new_dataset\TAGS.txt
Comparing files C:\USERS\COLIN\PROJECTS\SIGN-LANGUAGE\DATASETS\SIGN_LANGUAGE_DATASETS\DATASETS\FOO_DATASET\TAGS.txt and C:\USERS\COLIN\PROJECTS\SIGN-LANGUAGE\DATASETS\SIGN_LANGUAGE_DATASETS\DATASETS\NEW_DATASET\TAGS.TXT
***** C:\USERS\COLIN\PROJECTS\SIGN-LANGUAGE\DATASETS\SIGN_LANGUAGE_DATASETS\DATASETS\FOO_DATASET\TAGS.txt
content.language.gl # Contains text in language Galician / gl.
content.language.gn # Contains text in language Guaranφ.
content.language.got # Contains text in language Gothic.
***** C:\USERS\COLIN\PROJECTS\SIGN-LANGUAGE\DATASETS\SIGN_LANGUAGE_DATASETS\DATASETS\NEW_DATASET\TAGS.TXT
content.language.gl # Contains text in language Galician / gl.
content.language.gn # Contains text in language Guaran�.
content.language.got # Contains text in language Gothic.
*****
***** C:\USERS\COLIN\PROJECTS\SIGN-LANGUAGE\DATASETS\SIGN_LANGUAGE_DATASETS\DATASETS\FOO_DATASET\TAGS.txt
content.language.gub # Contains text in language Guajajara.
content.language.gun # Contains text in language Mbyß Guaranφ (Tupian).
content.language.ha # Contains text in language Hausa / ha.
***** C:\USERS\COLIN\PROJECTS\SIGN-LANGUAGE\DATASETS\SIGN_LANGUAGE_DATASETS\DATASETS\NEW_DATASET\TAGS.TXT
content.language.gub # Contains text in language Guajajara.
content.language.gun # Contains text in language Mby� Guaran� (Tupian).
content.language.ha # Contains text in language Hausa / ha.
*****
***** C:\USERS\COLIN\PROJECTS\SIGN-LANGUAGE\DATASETS\SIGN_LANGUAGE_DATASETS\DATASETS\FOO_DATASET\TAGS.txt
content.language.my # Contains text in language Burmese / my.
content.language.myu # Contains text in language Munduruk·.
content.language.myv # Contains text in language Erzya.
content.language.nb # Contains text in language Bokmσl, Norwegian.
content.language.ne # Contains text in language Nepali (macrolanguage) / ne.
***** C:\USERS\COLIN\PROJECTS\SIGN-LANGUAGE\DATASETS\SIGN_LANGUAGE_DATASETS\DATASETS\NEW_DATASET\TAGS.TXT
content.language.my # Contains text in language Burmese / my.
content.language.myu # Contains text in language Munduruk�.
content.language.myv # Contains text in language Erzya.
content.language.nb # Contains text in language Bokm�l, Norwegian.
content.language.ne # Contains text in language Nepali (macrolanguage) / ne.
*****
***** C:\USERS\COLIN\PROJECTS\SIGN-LANGUAGE\DATASETS\SIGN_LANGUAGE_DATASETS\DATASETS\FOO_DATASET\TAGS.txt
content.language.sm # Contains text in language Samoan.
content.language.sme # Contains text in language North Sßmi.
content.language.sms # Contains text in language Skolt Sami.
***** C:\USERS\COLIN\PROJECTS\SIGN-LANGUAGE\DATASETS\SIGN_LANGUAGE_DATASETS\DATASETS\NEW_DATASET\TAGS.TXT
content.language.sm # Contains text in language Samoan.
content.language.sme # Contains text in language North S�mi.
content.language.sms # Contains text in language Skolt Sami.
*****
***** C:\USERS\COLIN\PROJECTS\SIGN-LANGUAGE\DATASETS\SIGN_LANGUAGE_DATASETS\DATASETS\FOO_DATASET\TAGS.txt
content.language.tl # Contains text in language Tagalog / tl.
content.language.tpn # Contains text in language Tupi(nambß).
content.language.tr # Contains text in language Turkish / tr.
***** C:\USERS\COLIN\PROJECTS\SIGN-LANGUAGE\DATASETS\SIGN_LANGUAGE_DATASETS\DATASETS\NEW_DATASET\TAGS.TXT
content.language.tl # Contains text in language Tagalog / tl.
content.language.tpn # Contains text in language Tupi(namb�).
content.language.tr # Contains text in language Turkish / tr.
*****
foo is the one that causes crashes; new is the one that does not. It looks like my Windows setup or my version of epath can't handle some of the symbols? VS Code (or whichever editor opens the file) apparently just changes them when resaving.
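To see which lines are affected without relying on how the console renders them, something like the following could help. It is a minimal sketch, assuming the crash comes from bytes in TAGS.txt that are not valid UTF-8, which this thread does not confirm. The helper name find_non_utf8_lines is made up for illustration.

```python
from pathlib import Path


def find_non_utf8_lines(path):
    """Return (line_number, raw_bytes) for each line that is not valid UTF-8.

    Hypothetical helper: it only reports the raw bytes and does not
    try to guess what encoding they were written in.
    """
    bad = []
    data = Path(path).read_bytes()
    # bytes.splitlines() handles both \n and \r\n line endings.
    for lineno, raw in enumerate(data.splitlines(), start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            bad.append((lineno, raw))
    return bad


# Example use (the path is whatever TAGS.txt you want to inspect):
# for lineno, raw in find_non_utf8_lines("TAGS.txt"):
#     print(lineno, raw)
```

If the list comes back empty for new_dataset but not for foo_dataset, that would support the idea that the difference is the byte encoding rather than the tags themselves.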
Tried on an Ubuntu machine, no issue at all. Runs fine.
Followed steps from #56 to get pytest running, and then used instructions from https://tensorflow.google.cn/datasets/add_dataset?hl=en#test_your_dataset to create a new dataset.
Then I get errors like this:
with tracebacks going to
abstract_path.py
I am on Windows, potentially this is an issue only with that, because when I do the same steps on Colab it does not occur. https://colab.research.google.com/drive/1X9sem_qFHNHgpRl-IqkHN0Mft8CBCp_O?usp=sharing
I went to abstract_path.py and manually edited it to dump to a .txt file
original:
edited:
output:
This led me to finally realize that what it actually wanted me to do, I think, was remove invalid tags?
So I opened up TAGS.txt to have a look, closed it, and then ran pytest again... and got a new error:
Apparently opening and closing TAGS.txt made the error go away? My theory is that it has something to do with the formatting of the .txt file on Windows.
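One way to test that theory is to check which encoding the file actually decodes under, and to rewrite it as UTF-8 if it is something else. This is a rough sketch, assuming the file is either UTF-8 or Windows cp1252; that assumption is mine and is not established here. The function names guess_encoding and resave_as_utf8 are hypothetical. Note that cp1252 accepts almost any byte sequence, so a "cp1252" result is a guess, not proof.

```python
from pathlib import Path


def guess_encoding(raw, candidates=("utf-8", "cp1252")):
    """Return the first candidate encoding that decodes raw without error."""
    for enc in candidates:
        try:
            raw.decode(enc)
            return enc
        except UnicodeDecodeError:
            continue
    return None


def resave_as_utf8(path):
    """Rewrite the file as UTF-8 if it is not already; return the guessed encoding."""
    raw = Path(path).read_bytes()
    enc = guess_encoding(raw)
    if enc is None:
        raise ValueError(f"could not decode {path} with any candidate encoding")
    if enc != "utf-8":
        # Work on bytes so line endings are left exactly as they were.
        Path(path).write_bytes(raw.decode(enc).encode("utf-8"))
    return enc
```

Unlike opening and resaving in an editor, this keeps the accented characters instead of replacing them, provided the guessed source encoding is right.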