Closed RobinRojowiec closed 11 months ago
Hi Robin,
I can't reproduce this issue for Python 3.10 on Linux. Here's a pip freeze of a fresh install I tried just now:
aiohttp==3.9.1
aiosignal==1.3.1
antlr4-python3-runtime==4.9.3
async-timeout==4.0.3
attrs==23.1.0
certifi==2023.11.17
charset-normalizer==3.3.2
cloudpickle==2.2.1
cramming @ file:///home/jonas/Dropbox/Documents_Hyperion/Python/cramming
crcmod==1.7
datasets==2.15.0
dill==0.3.7
dnspython==2.4.2
docopt==0.6.2
einops==0.7.0
evaluate==0.4.1
fastavro==1.9.0
fasteners==0.19
filelock==3.13.1
frozenlist==1.4.0
fsspec==2023.10.0
grpcio==1.59.3
hdfs==2.7.3
httplib2==0.22.0
huggingface-hub==0.19.4
hydra-core==1.3.2
idna==3.6
Jinja2==3.1.2
joblib==1.3.2
Js2Py==0.74
jsonschema==4.20.0
jsonschema-specifications==2023.11.1
MarkupSafe==2.1.3
mpmath==1.3.0
multidict==6.0.4
multiprocess==0.70.15
networkx==3.2.1
numpy==1.26.2
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.18.1
nvidia-nvjitlink-cu12==12.3.101
nvidia-nvtx-cu12==12.1.105
objsize==0.6.1
omegaconf==2.3.0
orjson==3.9.10
packaging==23.2
pandas==2.1.3
proto-plus==1.22.3
protobuf==4.25.1
psutil==5.9.6
pyarrow==14.0.1
pyarrow-hotfix==0.6
pydot==1.4.2
pyjsparser==2.7.1
pymongo==4.6.1
pynvml==11.5.0
pyparsing==3.1.1
python-dateutil==2.8.2
pytz==2023.3.post1
PyYAML==6.0.1
referencing==0.31.1
regex==2023.10.3
requests==2.31.0
responses==0.18.0
rpds-py==0.13.2
safetensors==0.4.1
scikit-learn==1.3.2
scipy==1.11.4
six==1.16.0
sympy==1.12
threadpoolctl==3.2.0
tokenizers==0.15.0
torch==2.1.1
tqdm==4.66.1
transformers==4.35.2
triton==2.1.0
typing_extensions==4.8.0
tzdata==2023.3
tzlocal==5.2
urllib3==2.1.0
xxhash==3.4.1
yarl==1.9.3
zstandard==0.22.0
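As a cross-check, the module whose import fails in the report loads cleanly in this environment. Below is a minimal sketch of that check (not part of the original reply; `split_dataset_by_node` is the public entry point exported from the same `datasets.distributed` module that raised the error):

```python
# If this import succeeds, the failing line from the reported traceback
# (datasets/distributed.py importing _split_by_node_map_style_dataset)
# has already executed without error.
import datasets
from datasets.distributed import split_dataset_by_node

print("datasets", datasets.__version__)  # 2.15.0 in the freeze above
print("split_dataset_by_node imported:", split_dataset_by_node is not None)
```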
Thanks, that worked in a clean environment. It seems the Kaggle env had incompatible libs installed.
Hey, I'm trying to run your code and install cramming, but I got the following error:
File /opt/conda/lib/python3.10/site-packages/datasets/distributed.py:3
      1 from typing import TypeVar
----> 3 from .arrow_dataset import Dataset, _split_by_node_map_style_dataset
      4 from .iterable_dataset import IterableDataset, _split_by_node_iterable_dataset
      7 DatasetType = TypeVar("DatasetType", Dataset, IterableDataset)
ImportError: cannot import name '_split_by_node_map_style_dataset' from 'datasets.arrow_dataset' (/opt/conda/lib/python3.10/site-packages/datasets/arrow_dataset.py)
Can you publish the pip freeze output of your environment, and also the Python version you are using? I suspect an incompatibility is the reason.
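For anyone hitting the same ImportError, a quick way to see which datasets build the interpreter actually loads is sketched below (an assumption-laden check, not from the thread; it only probes for the private helper the traceback mentions and prints where the package resolves from):

```python
# Inspect the datasets install that Python actually picks up. On managed
# images (e.g. Kaggle/conda) this can be an older copy than the one pip
# reports, which would explain the missing private helper.
import datasets
from datasets import arrow_dataset

print("version:", datasets.__version__)   # the working env above pins 2.15.0
print("location:", datasets.__file__)
print("has _split_by_node_map_style_dataset:",
      hasattr(arrow_dataset, "_split_by_node_map_style_dataset"))
```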