cleong110 opened 8 months ago
OK, installed lxml, and now I'm getting "Failed to get url https://nlp.biu.ac.il/~amit/datasets/dgs.json. HTTP code: 404.", which seems new but unrelated to this issue. Never mind: Python 3.11 seems to work fine; I had been installing into my conda base env.
So it really does seem that Python 3.12 is the issue, as noted in https://github.com/tensorflow/datasets/issues/4666.
Never mind the nevermind: even with Python 3.11 you need to manually install lxml, or dgs_corpus downloading crashes when using the default config. But that's a DGS-corpus-specific issue, I suppose, so it may not belong in this issue anyway.
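For anyone hitting the same lxml crash: a quick preflight check like this (a sketch of my own, not part of the package) can confirm the optional dependency is importable before attempting the DGS download:

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if the top-level module can be imported in this environment."""
    return importlib.util.find_spec(name) is not None


# lxml is needed by the dgs_corpus default config but, per the above,
# is not pulled in automatically on a Python 3.11 env.
if not has_module("lxml"):
    print("lxml missing: run `pip install lxml` before loading dgs_corpus")
```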
Thanks for this.
According to https://github.com/tensorflow/datasets/issues/4666#issuecomment-2149200103, this is now fixed in the latest version of tfds.
If we can confirm that, we can close this issue.
Gave it a go: new conda env, Python 3.12, pip install sign_language_datasets. That ended up with tfds-nightly-4.9.5.dev202406050044, not 4.9.6, the version of tfds that supposedly solves this.
Did some shenanigans: uninstalled tfds-nightly, then pip install tensorflow-datasets, after which the import still failed, so pip install -U --force-reinstall tensorflow-datasets, and now it seems to work.
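A likely explanation for the broken import: tfds-nightly and tensorflow-datasets both provide the same tensorflow_datasets import package, so a half-removed nightly can shadow the stable release. This diagnostic sketch (my own, not part of either package) shows which distributions pip actually left installed:

```python
from importlib import metadata


def installed_tfds_dists() -> list:
    """Return 'name==version' for each tfds distribution found in this env."""
    found = []
    for dist in ("tensorflow-datasets", "tfds-nightly"):
        try:
            found.append(f"{dist}=={metadata.version(dist)}")
        except metadata.PackageNotFoundError:
            # Distribution not installed; skip it.
            pass
    return found


print(installed_tfds_dists())  # both entries present => likely import conflict
```

If this prints both distributions, `pip uninstall -y tfds-nightly` followed by the force-reinstall above is the cleanup that worked for me.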
Steps to reproduce on my own machine:
It works in colab (Python 3.10), but not on my machine in an env with python 3.12. When I create a conda env with 3.10 it works without issue.
When I create an env with 3.11, I get "no module named lxml", but that's a different issue. (Edit: I was installing in my base environment; never mind this part.)
This is apparently the upstream issue: https://github.com/tensorflow/datasets/issues/4666.
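Since the failure tracks the interpreter version (3.10 and 3.11 work once lxml is present, 3.12 does not until tfds 4.9.6), a guard like this could surface the incompatibility up front. The version bound is my reading of this thread, not an official constraint:

```python
import sys
from typing import Optional


def check_python_for_tfds() -> Optional[str]:
    """Return a warning string if this interpreter is known-problematic for tfds."""
    if sys.version_info >= (3, 12):
        # Per tensorflow/datasets#4666, 3.12 needs the fixed tfds release.
        return ("Python 3.12 requires tensorflow-datasets >= 4.9.6 "
                "(see tensorflow/datasets#4666)")
    return None


warning = check_python_for_tfds()
if warning:
    print(warning)
```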