rodrigallardo closed this issue 8 months ago
Hi Rodrigo,
First, note that we also have a newer model at https://github.com/ufal/crac2023-corpipe (our entry in the following year's competition, with improved performance), and for that one we also released a trained checkpoint.
If you would like to make this older model work, please start by sending me the pip list of your environment. To reproduce the problem, I will need the same package versions that you use.
Cheers!
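A minimal sketch of one way to capture an environment for a report like this. The `pip list` output does not include the interpreter version, so this records both. The helper name `environment_snapshot` is hypothetical and is not part of corpipe; it uses only the standard library.

```python
import json
import platform
from importlib import metadata


def environment_snapshot():
    """Return the Python version and installed package versions as a dict."""
    packages = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name:  # skip broken installs that carry no package name
            packages[name] = dist.version
    return {
        "python": platform.python_version(),
        # Sort case-insensitively, similar to how `pip list` orders packages.
        "packages": dict(sorted(packages.items(), key=lambda kv: kv[0].lower())),
    }


if __name__ == "__main__":
    print(json.dumps(environment_snapshot(), indent=2))
```

Running it with the interpreter of the affected environment (for example the one inside the virtual environment) gives a JSON report that can be pasted into the issue.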
Hi @foxik ! Thank you very much for the quick response.
> First, note that we also have a newer model at https://github.com/ufal/crac2023-corpipe (our entry in the following year's competition, with improved performance), and for that one we also released a trained checkpoint.

I'm trying to run 2022's version since I believe I lack the resources to train the 2023 version. However, I'm also going to give it a try :)
> If you would like to make this older model work, please start by sending me the pip list of your environment. To reproduce the problem, I will need the same package versions that you use.

Sure. Here it is:
Package Version
---------------------------- -------------------
absl-py 2.1.0
asttokens 2.4.1
astunparse 1.6.3
cachetools 5.3.3
certifi 2024.2.2
charset-normalizer 3.3.2
click 8.1.7
colorama 0.4.6
decorator 5.1.1
exceptiongroup 1.2.0
executing 2.0.1
filelock 3.13.1
flatbuffers 24.3.6
fsspec 2024.2.0
gast 0.5.4
google-auth 2.28.1
google-auth-oauthlib 0.4.6
google-pasta 0.2.0
grpcio 1.62.0
h5py 3.10.0
huggingface-hub 0.21.4
idna 3.6
importlib-metadata 7.0.1
iniconfig 2.0.0
ipdb 0.13.13
ipython 8.18.1
jedi 0.19.1
joblib 1.3.2
keras 2.8.0
Keras-Preprocessing 1.1.2
libclang 16.0.6
Markdown 3.5.2
MarkupSafe 2.1.5
matplotlib-inline 0.1.6
numpy 1.26.4
oauthlib 3.2.2
opt-einsum 3.3.0
packaging 23.2
parso 0.8.3
pexpect 4.9.0
pip 21.1.1
pluggy 1.4.0
prompt-toolkit 3.0.43
protobuf 3.20.3
ptyprocess 0.7.0
pure-eval 0.2.2
pyasn1 0.5.1
pyasn1-modules 0.3.0
pygments 2.17.2
pytest 8.0.2
PyYAML 6.0.1
regex 2023.12.25
requests 2.31.0
requests-oauthlib 1.3.1
rsa 4.9
sacremoses 0.1.1
scipy 1.12.0
sentencepiece 0.2.0
setuptools 56.0.0
six 1.16.0
stack-data 0.6.3
tensorboard 2.8.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
tensorflow 2.8.0
tensorflow-addons 0.16.1
tensorflow-io-gcs-filesystem 0.36.0
termcolor 2.4.0
tf-estimator-nightly 2.8.0.dev2021122109
tokenizers 0.12.1
tomli 2.0.1
tqdm 4.66.2
traitlets 5.14.1
transformers 4.18.0
typeguard 4.1.5
typing-extensions 4.10.0
udapi 0.3.0
urllib3 2.2.1
wcwidth 0.2.13
werkzeug 3.0.1
wheel 0.42.0
wrapt 1.16.0
zipp 3.17.0
Many thanks!
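When two people compare environments, a listing like the one above can be diffed mechanically. The following is only a sketch: `parse_pip_list` and `diff_environments` are hypothetical helpers, and the second environment in the example is invented purely for illustration.

```python
def parse_pip_list(text):
    """Parse two-column `pip list` output (name, version) into a dict."""
    versions = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the "Package Version" header, the dashed separator,
        # and rows with extra columns (e.g. editable installs).
        if len(parts) != 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        versions[parts[0].lower()] = parts[1]
    return versions


def diff_environments(mine, theirs):
    """Return {package: (my_version, their_version)} for every mismatch."""
    names = sorted(set(mine) | set(theirs))
    return {
        n: (mine.get(n), theirs.get(n))
        for n in names
        if mine.get(n) != theirs.get(n)
    }


# A few rows taken from the listing above.
reported = parse_pip_list("""
Package Version
------- -------
keras 2.8.0
tensorflow 2.8.0
transformers 4.18.0
""")
# Hypothetical second environment, invented only for this example.
other = parse_pip_list("""
Package Version
------- -------
keras 2.8.0
tensorflow 2.9.1
""")
print(diff_environments(reported, other))
```

A package missing from one side shows up with `None` in that position, so absent packages are reported alongside version mismatches.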
Unfortunately, I cannot replicate your problem. What I did:
- ran `get.sh` in the `data` subdirectory
- ran `venv/bin/python corpipe.py data/es_ancora/es_ancora --epochs=20 --lazy_adam --learning_rate_decay --crf --batch_size=8 --bert=google/rembert --learning_rate=1e-5 --segment=512 --right=50 --exp=es_ancora_test`

and the training started successfully.
I also tried the resampling variant, i.e.
venv/bin/python corpipe.py data/es_ancora/es_ancora --resample 6000 1 --epochs=20 --lazy_adam --learning_rate_decay --crf --batch_size=8 --bert=google/rembert --learning_rate=1e-5 --segment=512 --right=50 --exp=es_ancora_test
and that also worked.
Could you try performing the above steps and report any problems encountered? Cheers!
With that setup, the error no longer happens! The only difference I can see is that I previously used Python 3.9.5 instead of 3.9.7, which is the version you used. After changing the Python version, it worked.
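Since the interpreter version was the only difference noticed here, a small guard at the top of a launch script can flag a mismatch early. This is a sketch under an assumption: 3.9.7 is simply the version reported to work in this thread, not a documented requirement of the project, and the root cause of the original error was not established. The names `KNOWN_GOOD` and `matches_known_good` are hypothetical.

```python
import sys

# Assumption: 3.9.7 is the version reported to work in this thread;
# it is not a documented requirement of the project.
KNOWN_GOOD = (3, 9, 7)


def matches_known_good(version_info=sys.version_info, known_good=KNOWN_GOOD):
    """True if (major, minor, micro) of the interpreter equals the known-good one."""
    return tuple(version_info[:3]) == known_good


if not matches_known_good():
    running = ".".join(map(str, sys.version_info[:3]))
    wanted = ".".join(map(str, KNOWN_GOOD))
    # Warn instead of aborting: other versions may work perfectly well.
    print(
        f"warning: running Python {running}, setup was verified with {wanted}",
        file=sys.stderr,
    )
```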
Thank you so much for the help! And again, congratulations on the fabulous work!
Glad that it is working now :+1:
Hi!
First, thank you for making the code for your awesome model open-source.
I'm trying to reproduce your results by rerunning the `corpipe.py` script for the Spanish data (`es_ancora`). I have not made any changes to the code, yet I'm facing an error that prevents me from running the whole training script.

The error occurs when running the line:

in the method `pipeline()` of the class `Dataset`. Apparently, the error arises from running `sum(1 for _ in pipeline)`. The traceback is the following:

Could anyone provide any help for fixing this error? Any suggestion is more than welcome.
Thank you! Rodrigo
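One possible reason a traceback would point at `sum(1 for _ in pipeline)` rather than at the code that builds the pipeline: if the pipeline is evaluated lazily, nothing runs until it is iterated, so any failure inside it surfaces only while counting. The sketch below illustrates that behaviour with a plain Python generator. It does not reproduce the actual `Dataset.pipeline()` code; `build_pipeline` and `transform` are hypothetical names.

```python
def build_pipeline(items):
    """Return a lazy generator; nothing is processed until it is iterated."""
    def transform(item):
        if item is None:
            raise ValueError("bad element")
        return item * 2
    return (transform(item) for item in items)


# No error here: the generator has been created but has not run yet.
pipeline = build_pipeline([1, 2, None, 4])

try:
    # Counting iterates the pipeline, which pushes every element through transform.
    count = sum(1 for _ in pipeline)
except ValueError as exc:
    count = None
    print(f"failure surfaced during counting: {exc}")
```

In a situation like this, the line doing the counting is where the exception appears, while the cause sits in an earlier processing step or in its inputs.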