swarada96 closed this issue 1 year ago.
Hello @swarada96, thanks for your feedback.
The problem is that the config property `val_batch_size` is not present in your config file. You can add it there to fix this, but I've also pushed some changes that address it. Update the repo and it should work fine.
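For reference, this is roughly what the `train` section of `roberta-pretraining.yaml` would need to look like with the missing key added. The surrounding keys and the values shown here are illustrative assumptions; check them against your own config file:

```yaml
train:
  batch_size: 16        # training batch size (example value)
  val_batch_size: 16    # the key the trainer reads; adding it avoids the "Missing key" error
```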
I updated the repo as per your suggestion and it made a difference, thank you for that. But after using the updated file, my CUDA device runs out of memory. What changes would you suggest? Thanks in advance.
```
PS F:\data2vec-pytorch-main> python train.py --config text/configs/roberta-pretraining.yaml
Found cached dataset wikitext (C:/Users/User/.cache/huggingface/datasets/wikitext/wikitext-103-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126)
100%|██████████| 3/3 [00:02<00:00,  1.05it/s]
Found cached dataset wikitext (C:/Users/User/.cache/huggingface/datasets/wikitext/wikitext-103-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126)
100%|██████████| 3/3 [00:00<00:00, 50.40it/s]
Epoch: 1/20   0%| | 0/56293 [00:00<?, ?batch/s]
You're using a RobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the __call__ method is faster than using a method to encode the text followed by a call to the pad method to get a padded encoding.
Epoch: 1/20   0%| | 0/56293 [00:09<?, ?batch/s]
Traceback (most recent call last):
  File "F:\data2vec-pytorch-main\train.py", line 25, in
```
@swarada96 You've got 4 GB in total; it'd be better to set a lower `batch_size`.
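As a back-of-the-envelope way to pick a smaller batch size, you can halve it until the estimated per-batch memory fits within the GPU's capacity. The helper below is purely illustrative (it is not part of data2vec-pytorch), and the per-sample memory figure is an assumption you'd have to measure on your own setup:

```python
def suggest_batch_size(total_mem_gb, per_sample_gb, current=32, headroom=0.9):
    """Halve `current` until the rough memory estimate fits.

    `per_sample_gb` is a made-up per-sample activation cost; measure it
    for your model before trusting the result.
    """
    bs = current
    while bs > 1 and bs * per_sample_gb > total_mem_gb * headroom:
        bs //= 2
    return bs

# With a 4 GB card and an assumed 0.5 GB per sample, start from 32:
print(suggest_batch_size(4.0, 0.5, current=32))  # -> 4
```

This is only a starting point; if even `batch_size: 1` runs out of memory, you'd need a shorter sequence length or gradient accumulation instead.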
Hello Aryan,
Do we have to create a folder named `dummy_data` in order to save the split data? I am getting the following error for the vision config.
```
python train.py --config vision/configs/beit-pretraining.yaml
Traceback (most recent call last):
  File "/home1/08351/sak3951/Work/data2vec-pytorch/train.py", line 24, in
```
Hello again @swarada96, sorry for the delay.
`dummy_data` is just an arbitrary folder name for a directory containing all the image files. You can define or create your own directory of images and point the config at it.
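To sanity-check that your image directory is set up the way the loader expects, you can list the image files it contains. `collect_image_paths` is a hypothetical helper for illustration, not a function from data2vec-pytorch:

```python
import os
import tempfile

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}

def collect_image_paths(root):
    """Recursively gather image file paths under `root` (illustrative helper)."""
    paths = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                paths.append(os.path.join(dirpath, name))
    return sorted(paths)

# Example: build a stand-in "dummy_data" directory with two images and a stray file
root = tempfile.mkdtemp()
for fname in ("a.png", "b.jpg", "notes.txt"):
    open(os.path.join(root, fname), "w").close()

print(len(collect_image_paths(root)))  # -> 2
```

If this prints 0 for your real directory, the dataset loader will have nothing to train on, which usually shows up as an error at startup.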
```
Traceback (most recent call last):
  File "F:\study\UTA_PhD\Papers\data2vec-pytorch-main\train.py", line 24, in
    trainer = trainers_dict[modality]
  File "F:\study\UTA_PhD\Papers\data2vec-pytorch-main\text\trainer.py", line 55, in __init__
    self.test_loader = DataLoader(self.test_dataset, batch_size=cfg.train.val_batch_size,
  File "C:\Users\User\anaconda3\lib\site-packages\omegaconf\dictconfig.py", line 355, in __getattr__
    self._format_and_raise(
  File "C:\Users\User\anaconda3\lib\site-packages\omegaconf\base.py", line 231, in _format_and_raise
    format_and_raise(
  File "C:\Users\User\anaconda3\lib\site-packages\omegaconf\_utils.py", line 899, in format_and_raise
    _raise(ex, cause)
  File "C:\Users\User\anaconda3\lib\site-packages\omegaconf\_utils.py", line 797, in _raise
    raise ex.with_traceback(sys.exc_info()[2])  # set env var OC_CAUSE=1 for full trace
  File "C:\Users\User\anaconda3\lib\site-packages\omegaconf\dictconfig.py", line 351, in __getattr__
    return self._get_impl(
  File "C:\Users\User\anaconda3\lib\site-packages\omegaconf\dictconfig.py", line 442, in _get_impl
    node = self._get_child(
  File "C:\Users\User\anaconda3\lib\site-packages\omegaconf\basecontainer.py", line 73, in _get_child
    child = self._get_node(
  File "C:\Users\User\anaconda3\lib\site-packages\omegaconf\dictconfig.py", line 480, in _get_node
    raise ConfigKeyError(f"Missing key {key!s}")
omegaconf.errors.ConfigAttributeError: Missing key val_batch_size
    full_key: train.val_batch_size
    object_type=dict
```
Can you please help me with omegaconf? Which package versions were used while training on these datasets?