Open ZornWang opened 9 months ago
BTW, I can't run the example examples/train_comet_bart.py directly; it reports the following error:
ValueError: Usecols do not match columns, columns expected but not found: ['relation', 'tails', 'head']
Here is the code that goes wrong:
train_graph = KnowledgeGraph.from_csv(
    "data/atomic2020/train.tsv", header=None, sep="\t"
)
It seems like header=None doesn't work.
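For reference, this matches how pandas' read_csv behaves: string usecols only resolve against actual column labels, and with header=None the columns are just numbered, so the names have to be supplied via names=. A minimal sketch of the distinction (the column names come from the error message; the sample row and the assumption that from_csv wraps read_csv are mine):

```python
import io
import pandas as pd

# One ATOMIC-style row; the content is made up for illustration.
tsv = "PersonX eats\txWant\tto be full\n"

# With header=None the columns are numbered 0, 1, 2, so selecting
# them by name raises the ValueError quoted above:
try:
    pd.read_csv(io.StringIO(tsv), sep="\t", header=None,
                usecols=["head", "relation", "tails"])
    raised = False
except ValueError:
    raised = True

# Supplying names= labels the numbered columns first, so name-based
# access works:
df = pd.read_csv(io.StringIO(tsv), sep="\t", header=None,
                 names=["head", "relation", "tails"])
```
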
It seems like kogito, or the COMET source code it builds on, uses an old version of PyTorch that does not support Ampere-architecture GPUs; the code does run on a V100, which is Volta architecture. But there is still a problem with numpy==1.24.1, which produces the following error:
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2026,) + inhomogeneous part.
Replacing numpy==1.24.1 with an older version like numpy==1.21.2 solves this problem, but warnings like this remain:
<__array_function__ internals>:5: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
Still, it works.
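The version dependence comes from numpy tightening ragged-array creation: building an array from sub-sequences of unequal lengths only warned (VisibleDeprecationWarning) up to 1.23 and raises ValueError from 1.24. A minimal reproduction, with illustrative data:

```python
import numpy as np

# Two sub-arrays of different lengths: a "ragged" sequence, like the
# index buckets being shuffled in the failing code path.
ragged = [np.arange(3), np.arange(5)]

try:
    # numpy >= 1.24 raises ValueError here (the error quoted above);
    # numpy <= 1.23 only emits VisibleDeprecationWarning and returns
    # a shuffled object array.
    np.random.permutation(ragged)
    outcome = "warned"
except ValueError:
    outcome = "raised"
```
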
Hi Zorn,
Thanks for pointing this out. It indeed seems to be a bug and I pushed a fix for it.
Regarding the other errors: unfortunately, since we do not actively maintain this repo, it might not support dependency updates. Please try to stick to the original dependencies as much as possible; if you can't, and you find a solution, feel free to open a pull request.
Very grateful for your reply. BTW, may I ask for a requirements list with specific Python and package versions? pip install kogito doesn't work directly; it automatically installs numpy==1.24.1 when python==3.10.
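In the meantime, a pin file matching the workaround reported earlier in this thread might look like the following (an assumption based on this thread, not an official requirements list):

```
# numpy pinned below 1.24; 1.21.2 is the version reported to work
# above, albeit with deprecation warnings. Other versions are left
# to pip's resolver.
kogito
numpy==1.21.2
```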
I see; in that case it seems like we need to resolve the issue so that the new numpy version is supported on this architecture. Could you maybe share the full stack trace of the error you get with numpy==1.24.1?
I tried two versions of Python on my device, and unfortunately the subject of this issue occurred in both. Here is the problem when running the script examples/train_comet_bart.py:
Here is my hardware environment:
Ubuntu 20.04
Nvidia Tesla V100
Driver Version: 535.54.03
CUDA 12.2
First, I used conda create -n env python=3.10 and pip install kogito to configure the environment.
The CUDA version of the torch that was automatically installed:
Python 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.version.cuda
'11.7'
Here is the output when running pip list:
absl-py 2.0.0
aiohttp 3.9.1
aiosignal 1.3.1
annotated-types 0.6.0
async-timeout 4.0.3
attrs 23.1.0
bert-score 0.3.13
blis 0.7.11
cachetools 5.3.2
catalogue 2.0.10
certifi 2023.11.17
charset-normalizer 3.3.2
click 8.1.7
cloudpathlib 0.16.0
colorama 0.4.6
confection 0.1.4
contourpy 1.2.0
cycler 0.12.1
cymem 2.0.8
docker-pycreds 0.4.0
et-xmlfile 1.1.0
filelock 3.13.1
fonttools 4.46.0
frozenlist 1.4.0
fsspec 2023.12.2
future 0.18.3
gitdb 4.0.11
GitPython 3.1.40
google-auth 2.25.2
google-auth-oauthlib 1.2.0
grpcio 1.51.3
huggingface-hub 0.19.4
idna 3.6
inflect 5.6.2
Jinja2 3.1.2
joblib 1.3.2
kiwisolver 1.4.5
kogito 0.6.2
langcodes 3.3.0
lightning-utilities 0.10.0
lxml 4.9.3
Markdown 3.5.1
MarkupSafe 2.1.3
matplotlib 3.8.2
multidict 6.0.4
murmurhash 1.0.10
nltk 3.8.1
numpy 1.26.2
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
oauthlib 3.2.2
openai 0.18.1
openpyxl 3.1.2
packaging 23.2
pandas 1.5.3
pandas-stubs 2.1.1.230928
pathtools 0.1.2
Pillow 10.1.0
pip 23.3.1
portalocker 2.8.2
preshed 3.0.9
promise 2.3
protobuf 3.20.3
psutil 5.9.6
pyasn1 0.5.1
pyasn1-modules 0.3.0
pydantic 2.5.2
pydantic_core 2.14.5
pyDeprecate 0.3.1
pyparsing 3.1.1
python-dateutil 2.8.2
pytorch-lightning 1.5.10
pytz 2023.3.post1
PyYAML 6.0.1
regex 2023.10.3
requests 2.31.0
requests-oauthlib 1.3.1
rouge-score 0.0.4
rsa 4.9
sacrebleu 2.4.0
safetensors 0.4.1
sentencepiece 0.1.99
sentry-sdk 1.39.1
setproctitle 1.3.3
setuptools 59.5.0
shortuuid 1.0.11
six 1.16.0
smart-open 6.4.0
smmap 5.0.1
spacy 3.7.2
spacy-legacy 3.0.12
spacy-loggers 1.0.5
srsly 2.4.8
tabulate 0.9.0
tensorboard 2.15.1
tensorboard-data-server 0.7.2
thinc 8.2.1
tokenizers 0.15.0
torch 1.13.1
torchmetrics 1.2.1
tqdm 4.66.1
transformers 4.36.1
typer 0.9.0
types-pytz 2023.3.1.1
typing_extensions 4.9.0
urllib3 2.1.0
wandb 0.12.21
wasabi 1.1.2
weasel 0.3.4
Werkzeug 3.0.1
wheel 0.41.2
yarl 1.9.4
And here is the full stack trace:
Global seed set to 42
/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/callback_connector.py:147: LightningDeprecationWarning: Setting `Trainer(checkpoint_callback=<pytorch_lightning.callbacks.model_checkpoint.ModelCheckpoint object at 0x7f1634d0ff40>)` is deprecated in v1.5 and will be removed in v1.7. Please consider using `Trainer(enable_checkpointing=<pytorch_lightning.callbacks.model_checkpoint.ModelCheckpoint object at 0x7f1634d0ff40>)`.
rank_zero_deprecation(
/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/callback_connector.py:167: LightningDeprecationWarning: Setting `Trainer(weights_summary=None)` is deprecated in v1.5 and will be removed in v1.7. Please set `Trainer(enable_model_summary=False)` instead.
rank_zero_deprecation(
GPU available: True, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py:1584: UserWarning: GPU available but not used. Set the gpus flag in your trainer `Trainer(gpus=1)` or script `--gpus=1`.
rank_zero_warn(
/root/miniconda3/envs/peak/lib/python3.10/site-packages/transformers/optimization.py:429: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
warnings.warn(
Validation sanity check: 0it [00:00, ?it/s]/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/trainer/data_loading.py:132: UserWarning: The dataloader, val_dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 52 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
rank_zero_warn(
Validation sanity check: 0%| | 0/2 [00:00<?, ?it/s]/root/miniconda3/envs/peak/lib/python3.10/site-packages/transformers/generation/utils.py:1355: UserWarning: Using the model-agnostic default `max_length` (=20) to control the generation length. We recommend setting `max_new_tokens` to control the maximum length of the generation.
warnings.warn(
Global seed set to 42
/root/miniconda3/envs/peak/lib/python3.10/site-packages/kogito/models/bart/utils.py:490: UserWarning: All learning rates are 0
warnings.warn("All learning rates are 0")
/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/trainer/data_loading.py:132: UserWarning: The dataloader, train_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 52 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
rank_zero_warn(
Epoch 0: 0%| | 0/3126 [00:00<?, ?it/s]Traceback (most recent call last):
File "/root/autodl-tmp/kogito-main/examples/train_comet_bart.py", line 39, in <module>
model.train(
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/kogito/models/bart/comet.py", line 109, in train
trainer: pl.Trainer = generic_train(
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/kogito/models/bart/utils.py", line 262, in generic_train
trainer.fit(model)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 740, in fit
self._call_and_handle_interrupt(
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 685, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 777, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1199, in _run
self._dispatch()
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1279, in _dispatch
self.training_type_plugin.start_training(self)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 202, in start_training
self._results = trainer.run_stage()
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1289, in run_stage
return self._run_train()
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1319, in _run_train
self.fit_loop.run()
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/loops/base.py", line 145, in run
self.advance(*args, **kwargs)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 234, in advance
self.epoch_loop.run(data_fetcher)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/loops/base.py", line 145, in run
self.advance(*args, **kwargs)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 193, in advance
batch_output = self.batch_loop.run(batch, batch_idx)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/loops/base.py", line 145, in run
self.advance(*args, **kwargs)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 88, in advance
outputs = self.optimizer_loop.run(split_batch, optimizers, batch_idx)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/loops/base.py", line 145, in run
self.advance(*args, **kwargs)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 215, in advance
result = self._run_optimization(
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 266, in _run_optimization
self._optimizer_step(optimizer, opt_idx, batch_idx, closure)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 378, in _optimizer_step
lightning_module.optimizer_step(
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/core/lightning.py", line 1652, in optimizer_step
optimizer.step(closure=optimizer_closure)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/core/optimizer.py", line 164, in step
trainer.accelerator.optimizer_step(self._optimizer, self._optimizer_idx, closure, **kwargs)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/accelerators/accelerator.py", line 339, in optimizer_step
self.precision_plugin.optimizer_step(model, optimizer, opt_idx, closure, **kwargs)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 163, in optimizer_step
optimizer.step(closure=closure, **kwargs)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/torch/optim/lr_scheduler.py", line 68, in wrapper
return wrapped(*args, **kwargs)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/torch/optim/optimizer.py", line 140, in wrapper
out = func(*args, **kwargs)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/transformers/optimization.py", line 457, in step
loss = closure()
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 148, in _wrap_closure
closure_result = closure()
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 160, in __call__
self._result = self.closure(*args, **kwargs)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 155, in closure
self._backward_fn(step_output.closure_loss)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 327, in backward_fn
self.trainer.accelerator.backward(loss, optimizer, opt_idx)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/accelerators/accelerator.py", line 314, in backward
self.precision_plugin.backward(self.lightning_module, closure_loss, *args, **kwargs)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 91, in backward
model.backward(closure_loss, optimizer, *args, **kwargs)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/pytorch_lightning/core/lightning.py", line 1434, in backward
loss.backward(*args, **kwargs)
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/torch/_tensor.py", line 488, in backward
torch.autograd.backward(
File "/root/miniconda3/envs/peak/lib/python3.10/site-packages/torch/autograd/__init__.py", line 197, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
Second, I used conda create -n env python=3.8 and pip install kogito to configure the environment.
The CUDA version of the automatically installed torch is also 11.7.
Here is the output when running pip list:
absl-py 2.0.0
aiohttp 3.9.1
aiosignal 1.3.1
annotated-types 0.6.0
async-timeout 4.0.3
attrs 23.1.0
bert-score 0.3.13
blis 0.7.11
cachetools 5.3.2
catalogue 2.0.10
certifi 2023.11.17
charset-normalizer 3.3.2
click 8.1.7
cloudpathlib 0.16.0
colorama 0.4.6
confection 0.1.4
contourpy 1.1.1
cycler 0.12.1
cymem 2.0.8
docker-pycreds 0.4.0
et-xmlfile 1.1.0
filelock 3.13.1
fonttools 4.46.0
frozenlist 1.4.0
fsspec 2023.12.2
future 0.18.3
gitdb 4.0.11
GitPython 3.1.40
google-auth 2.25.2
google-auth-oauthlib 1.0.0
grpcio 1.51.3
huggingface-hub 0.19.4
idna 3.6
importlib-metadata 7.0.0
importlib-resources 6.1.1
inflect 5.6.2
Jinja2 3.1.2
joblib 1.3.2
kiwisolver 1.4.5
kogito 0.6.2
langcodes 3.3.0
lightning-utilities 0.10.0
lxml 4.9.3
Markdown 3.5.1
MarkupSafe 2.1.3
matplotlib 3.7.4
multidict 6.0.4
murmurhash 1.0.10
nltk 3.8.1
numpy 1.24.4
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
oauthlib 3.2.2
openai 0.18.1
openpyxl 3.1.2
packaging 23.2
pandas 1.5.3
pandas-stubs 2.0.3.230814
pathtools 0.1.2
Pillow 10.1.0
pip 23.3.1
portalocker 2.8.2
preshed 3.0.9
promise 2.3
protobuf 3.20.3
psutil 5.9.6
pyasn1 0.5.1
pyasn1-modules 0.3.0
pydantic 2.5.2
pydantic_core 2.14.5
pyDeprecate 0.3.1
pyparsing 3.1.1
python-dateutil 2.8.2
pytorch-lightning 1.5.10
pytz 2023.3.post1
PyYAML 6.0.1
regex 2023.10.3
requests 2.31.0
requests-oauthlib 1.3.1
rouge-score 0.0.4
rsa 4.9
sacrebleu 2.4.0
safetensors 0.4.1
sentencepiece 0.1.99
sentry-sdk 1.39.1
setproctitle 1.3.3
setuptools 59.5.0
shortuuid 1.0.11
six 1.16.0
smart-open 6.4.0
smmap 5.0.1
spacy 3.7.2
spacy-legacy 3.0.12
spacy-loggers 1.0.5
srsly 2.4.8
tabulate 0.9.0
tensorboard 2.14.0
tensorboard-data-server 0.7.2
thinc 8.2.2
tokenizers 0.15.0
torch 1.13.1
torchmetrics 1.2.1
tqdm 4.66.1
transformers 4.36.1
typer 0.9.0
types-pytz 2023.3.1.1
typing_extensions 4.9.0
urllib3 2.1.0
wandb 0.12.21
wasabi 1.1.2
weasel 0.3.4
Werkzeug 3.0.1
wheel 0.41.2
yarl 1.9.4
zipp 3.17.0
And the problem is the same as in the first situation.
Actually, I'm trying to reproduce PeaCoK. Following its guide, I recreated an environment with python=3.10 and installed the requirements. This time, when I use this environment to run examples/train_comet_bart.py, the problem is as follows:
Global seed set to 42
Traceback (most recent call last):
File "/root/autodl-tmp/kogito-main/examples/train_comet_bart.py", line 39, in <module>
model.train(
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/kogito/models/bart/comet.py", line 108, in train
trainer: pl.Trainer = generic_train(
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/kogito/models/bart/utils.py", line 249, in generic_train
trainer = pl.Trainer(
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/utilities/argparse.py", line 348, in insert_env_defaults
return fn(self, **kwargs)
TypeError: Trainer.__init__() got an unexpected keyword argument 'weights_summary'
Then I replaced pytorch-lightning==1.9.5 with pytorch-lightning==1.6.5, and this error, caused by numpy==1.24.1, occurred:
Global seed set to 42
/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/callback_connector.py:151: LightningDeprecationWarning: Setting `Trainer(checkpoint_callback=<pytorch_lightning.callbacks.model_checkpoint.ModelCheckpoint object at 0x7f13358ebee0>)` is deprecated in v1.5 and will be removed in v1.7. Please consider using `Trainer(enable_checkpointing=<pytorch_lightning.callbacks.model_checkpoint.ModelCheckpoint object at 0x7f13358ebee0>)`.
rank_zero_deprecation(
/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/callback_connector.py:171: LightningDeprecationWarning: Setting `Trainer(weights_summary=None)` is deprecated in v1.5 and will be removed in v1.7. Please set `Trainer(enable_model_summary=False)` instead.
rank_zero_deprecation(
GPU available: True, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py:1814: PossibleUserWarning: GPU available but not used. Set `accelerator` and `devices` using `Trainer(accelerator='gpu', devices=1)`.
rank_zero_warn(
/root/miniconda3/envs/kgto/lib/python3.10/site-packages/transformers/optimization.py:306: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
warnings.warn(
Sanity Checking: 0it [00:00, ?it/s]/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:240: PossibleUserWarning: The dataloader, val_dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 52 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
rank_zero_warn(
Sanity Checking DataLoader 0: 0%| | 0/2 [00:00<?, ?it/s]/root/miniconda3/envs/kgto/lib/python3.10/site-packages/transformers/generation/utils.py:1387: UserWarning: Neither `max_length` nor `max_new_tokens` has been set, `max_length` will default to 20 (`self.config.max_length`). Controlling `max_length` via the config is deprecated and `max_length` will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.
warnings.warn(
/root/miniconda3/envs/kgto/lib/python3.10/site-packages/kogito/models/bart/utils.py:487: UserWarning: All learning rates are 0
warnings.warn("All learning rates are 0")
/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:240: PossibleUserWarning: The dataloader, train_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 52 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
rank_zero_warn(
Epoch 0: 0%| | 0/3126 [00:00<?, ?it/s]Traceback (most recent call last):
File "/root/autodl-tmp/kogito-main/examples/train_comet_bart.py", line 39, in <module>
model.train(
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/kogito/models/bart/comet.py", line 108, in train
trainer: pl.Trainer = generic_train(
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/kogito/models/bart/utils.py", line 262, in generic_train
trainer.fit(model)
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 770, in fit
self._call_and_handle_interrupt(
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 723, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 811, in _fit_impl
results = self._run(model, ckpt_path=self.ckpt_path)
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1236, in _run
results = self._run_stage()
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1323, in _run_stage
return self._run_train()
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1353, in _run_train
self.fit_loop.run()
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/loops/base.py", line 204, in run
self.advance(*args, **kwargs)
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 266, in advance
self._outputs = self.epoch_loop.run(self._data_fetcher)
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/loops/base.py", line 199, in run
self.on_run_start(*args, **kwargs)
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 153, in on_run_start
_ = iter(data_fetcher) # creates the iterator inside the fetcher
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/utilities/fetching.py", line 179, in __iter__
self._apply_patch()
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/utilities/fetching.py", line 115, in _apply_patch
apply_to_collections(self.loaders, self.loader_iters, (Iterator, DataLoader), _apply_patch_fn)
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/utilities/fetching.py", line 154, in loader_iters
loader_iters = self.dataloader_iter.loader_iters
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/trainer/supporters.py", line 545, in loader_iters
self._loader_iters = self.create_loader_iters(self.loaders)
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/trainer/supporters.py", line 585, in create_loader_iters
return apply_to_collection(loaders, Iterable, iter, wrong_dtype=(Sequence, Mapping))
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/utilities/apply_func.py", line 99, in apply_to_collection
return function(data, *args, **kwargs)
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 435, in __iter__
return self._get_iterator()
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 381, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1072, in __init__
self._reset(loader, first_iter=True)
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1105, in _reset
self._try_put_index()
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1339, in _try_put_index
index = self._next_index()
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 618, in _next_index
return next(self._sampler_iter) # may raise StopIteration
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/torch/utils/data/sampler.py", line 254, in __iter__
for idx in self.sampler:
File "/root/miniconda3/envs/kgto/lib/python3.10/site-packages/kogito/core/utils.py", line 125, in __iter__
np.concatenate(np.random.permutation(ck_idx[1:]))
File "mtrand.pyx", line 4703, in numpy.random.mtrand.RandomState.permutation
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (3121,) + inhomogeneous part.
Epoch 0: 0%| | 0/3126 [00:01<?, ?it/s]
Replacing numpy==1.24.1 with an older version like numpy==1.21.2 solves this error, but other warnings appear:
Global seed set to 42
/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/callback_connector.py:151: LightningDeprecationWarning: Setting `Trainer(checkpoint_callback=<pytorch_lightning.callbacks.model_checkpoint.ModelCheckpoint object at 0x7f5b436913c0>)` is deprecated in v1.5 and will be removed in v1.7. Please consider using `Trainer(enable_checkpointing=<pytorch_lightning.callbacks.model_checkpoint.ModelCheckpoint object at 0x7f5b436913c0>)`.
rank_zero_deprecation(
/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/callback_connector.py:171: LightningDeprecationWarning: Setting `Trainer(weights_summary=None)` is deprecated in v1.5 and will be removed in v1.7. Please set `Trainer(enable_model_summary=False)` instead.
rank_zero_deprecation(
GPU available: True, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py:1814: PossibleUserWarning: GPU available but not used. Set `accelerator` and `devices` using `Trainer(accelerator='gpu', devices=1)`.
rank_zero_warn(
/root/miniconda3/envs/kgto/lib/python3.10/site-packages/transformers/optimization.py:306: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
warnings.warn(
Sanity Checking: 0it [00:00, ?it/s]/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:240: PossibleUserWarning: The dataloader, val_dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 52 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
rank_zero_warn(
Sanity Checking DataLoader 0: 0%| | 0/2 [00:00<?, ?it/s]/root/miniconda3/envs/kgto/lib/python3.10/site-packages/transformers/generation/utils.py:1387: UserWarning: Neither `max_length` nor `max_new_tokens` has been set, `max_length` will default to 20 (`self.config.max_length`). Controlling `max_length` via the config is deprecated and `max_length` will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.
warnings.warn(
/root/miniconda3/envs/kgto/lib/python3.10/site-packages/kogito/models/bart/utils.py:487: UserWarning: All learning rates are 0
warnings.warn("All learning rates are 0")
/root/miniconda3/envs/kgto/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:240: PossibleUserWarning: The dataloader, train_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 52 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
rank_zero_warn(
Epoch 0: 0%| | 0/3126 [00:00<?, ?it/s]/root/miniconda3/envs/kgto/lib/python3.10/site-packages/kogito/core/utils.py:125: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
np.concatenate(np.random.permutation(ck_idx[1:]))
<__array_function__ internals>:5: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
It seems like the problem is a conflict between the numpy version that pytorch-lightning expects and the numpy version the repo uses.
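Given that diagnosis, one possible patch for the call at kogito/core/utils.py:125 is to permute the chunk order instead of letting numpy build a ragged ndarray. This is only a sketch: ck_idx here is a stand-in for the variable in the traceback, and the patch is untested against the repo.

```python
import numpy as np

# Stand-in for the ragged index buckets from the traceback; the first
# chunk is excluded from shuffling, as in the original call.
ck_idx = [np.array([9]), np.array([0, 1, 2]),
          np.array([3, 4]), np.array([5, 6, 7, 8])]

# Original call (warns on numpy < 1.24, raises ValueError on >= 1.24):
#     np.concatenate(np.random.permutation(ck_idx[1:]))

# Equivalent shuffle that never builds a ragged ndarray: permute the
# chunk indices, then concatenate the chunks in that order.
rest = ck_idx[1:]
order = np.random.permutation(len(rest))
shuffled = np.concatenate([rest[i] for i in order])
```
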
Hi, thanks for the fantastic work. I have a problem here. I want to train COMET on my own dataset, and I followed the user guide from the docs. My code is as follows:
However, I got the following errors when I ran the code:
Here is my environment:
The CUDA version of the torch that is automatically installed is 11.7.
I do not know why; any reply would be appreciated. Thanks.