seujung / KoBART-summarization

Summarization module based on KoBART
MIT License
196 stars 89 forks

Need help with an ImportError #31

Open Shaerit opened 2 years ago

Shaerit commented 2 years ago

Hello. I'm trying to run KoBART-summarization, but it keeps failing with an ImportError and I can't get it to start. I'd appreciate some help with this error.

Solved it :)

seujung commented 2 years ago

@Shaerit I'd need to see the log from when the error occurred before I can comment.

Shaerit commented 2 years ago

@seujung Hello, thank you for the reply. The import error is resolved, but now I'm getting AttributeError: 'Trainer' object has no attribute '_data_connector'.

(bart) C:\Users\whtnq\Documents\KoBART-summarization>python train.py --gradient_clip_val 1.0 --max_epochs 50 --default_root_dir logs --gpus 1 --batch_size 4 --num_workers 4
INFO:root:Namespace(accelerator=None, accumulate_grad_batches=1, amp_backend='native', amp_level='O2', auto_lr_find=False, auto_scale_batch_size=False, auto_select_gpus=False, batch_size=4, benchmark=False, check_val_every_n_epoch=1, checkpoint_callback=True, checkpoint_path=None, default_root_dir='logs', deterministic=False, distributed_backend=None, fast_dev_run=False, flush_logs_every_n_steps=100, gpus=1, gradient_clip_algorithm='norm', gradient_clip_val=1.0, ipus=None, limit_predict_batches=1.0, limit_test_batches=1.0, limit_train_batches=1.0, limit_val_batches=1.0, log_every_n_steps=50, log_gpu_memory=None, logger=True, lr=3e-05, max_epochs=50, max_len=512, max_steps=None, max_time=None, min_epochs=None, min_steps=None, model_path=None, move_metrics_to_cpu=False, multiple_trainloader_mode='max_size_cycle', num_nodes=1, num_processes=1, num_sanity_val_steps=2, num_workers=4, overfit_batches=0.0, plugins=None, precision=32, prepare_data_per_node=True, process_position=0, profiler=None, progress_bar_refresh_rate=None, reload_dataloaders_every_epoch=False, reload_dataloaders_every_n_epochs=0, replace_sampler_ddp=True, resume_from_checkpoint=None, stochastic_weight_avg=False, sync_batchnorm=False, terminate_on_nan=False, test_file='data/test.tsv', tpu_cores=None, track_grad_norm=-1, train_file='data/train.tsv', truncated_bptt_steps=None, val_check_interval=1.0, warmup_ratio=0.1, weights_save_path=None, weights_summary='top')
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Traceback (most recent call last):
  File "train.py", line 186, in <module>
    trainer.fit(model, dm)
  File "C:\Users\whtnq\anaconda3\envs\bart\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 522, in fit
    self._run(model)
  File "C:\Users\whtnq\anaconda3\envs\bart\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 839, in _run
    self.accelerator.setup(self, model)  # note: this sets up self.lightning_module
  File "C:\Users\whtnq\anaconda3\envs\bart\lib\site-packages\pytorch_lightning\accelerators\gpu.py", line 42, in setup
    return super().setup(trainer, model)
  File "C:\Users\whtnq\anaconda3\envs\bart\lib\site-packages\pytorch_lightning\accelerators\accelerator.py", line 93, in setup
    self.setup_optimizers(trainer)
  File "C:\Users\whtnq\anaconda3\envs\bart\lib\site-packages\pytorch_lightning\accelerators\accelerator.py", line 347, in setup_optimizers
    optimizers, lr_schedulers, optimizer_frequencies = self.training_type_plugin.init_optimizers(
  File "C:\Users\whtnq\anaconda3\envs\bart\lib\site-packages\pytorch_lightning\plugins\training_type\training_type_plugin.py", line 223, in init_optimizers
    return trainer.init_optimizers(model)
  File "C:\Users\whtnq\anaconda3\envs\bart\lib\site-packages\pytorch_lightning\trainer\optimizers.py", line 34, in init_optimizers
    optim_conf = model.configure_optimizers()
  File "train.py", line 101, in configure_optimizers
    data_len = self.setup_steps(self)
  File "train.py", line 84, in setup_steps
    train_loader = self.trainer._data_connector._train_dataloader_source.dataloader()
AttributeError: 'Trainer' object has no attribute '_data_connector'
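The failing line in `setup_steps` reaches into `Trainer._data_connector`, a private attribute whose layout has changed across pytorch-lightning releases. A minimal sketch (the function name and fallback order are illustrative, not the repo's actual code) of a more version-tolerant way to get the training batch count:

```python
def count_train_batches(trainer):
    """Best-effort lookup of len(train dataloader) across PL versions.

    Tries the public `datamodule` attribute on the trainer first, then
    falls back to the private _data_connector layout that older
    pytorch-lightning versions exposed.
    """
    datamodule = getattr(trainer, "datamodule", None)
    if datamodule is not None and hasattr(datamodule, "train_dataloader"):
        return len(datamodule.train_dataloader())
    connector = getattr(trainer, "_data_connector", None)
    if connector is not None:  # pre-2.0 private API
        return len(connector._train_dataloader_source.dataloader())
    raise RuntimeError("could not locate the training dataloader")
```

Using the public attribute first means the private fallback only runs on old installs, so the AttributeError above never triggers on newer versions.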

deliciouscat commented 1 year ago

Apparently _data_connector was removed with the move to pytorch-lightning 2.0. I downgraded to 1.9.5 and am working with that, but now I'm hitting RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn :..(
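That RuntimeError is what PyTorch raises when `backward()` is called on a loss that is detached from the autograd graph, commonly because every parameter is frozen or the loss was computed under `torch.no_grad()`. A minimal reproduction (illustrative, not this repo's code):

```python
import torch

# If no parameter tracks gradients, the forward output has no grad_fn
# and backward() raises the RuntimeError seen above.
model = torch.nn.Linear(4, 1)
for p in model.parameters():
    p.requires_grad_(False)  # freezing everything detaches the loss

loss = model(torch.randn(2, 4)).mean()
print(loss.requires_grad)  # False: nothing upstream requires grad

try:
    loss.backward()
except RuntimeError as err:
    print(err)  # "element 0 of tensors does not require grad ..."
```

So when this appears after a version downgrade, it is worth checking that the model's parameters still have `requires_grad=True` and that the training step's loss is not computed inside a `no_grad`/`inference_mode` context.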

Shaerit commented 1 year ago

I see... thank you for letting me know :)

deliciouscat commented 1 year ago

With the following setup it runs fine for me. I don't remember whether I also made any code changes :)

python 3.9.2 CUDA 11.2

pandas
torch==1.10.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
transformers==4.8.2
pytorch-lightning==1.9.4
torchmetrics==1.1.1
streamlit==1.1.0
fastparquet
fastapi
numpy==1.20.3
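Since this stack is so version-sensitive, a small stdlib-only sketch (function name illustrative) can confirm an installed environment actually matches pins like the ones above before launching training:

```python
from importlib.metadata import PackageNotFoundError, version

def check_pins(pins):
    """Compare installed distribution versions against exact pins.

    `pins` maps distribution names to version strings; returns a dict
    of {name: problem} for anything missing or mismatched.
    """
    problems = {}
    for name, wanted in pins.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if found != wanted:
            problems[name] = f"installed {found}, pinned {wanted}"
    return problems

if __name__ == "__main__":
    # Example: the pins reported to work in this thread.
    print(check_pins({
        "transformers": "4.8.2",
        "pytorch-lightning": "1.9.4",
        "torchmetrics": "1.1.1",
    }))
```

An empty dict means the environment matches; anything else tells you exactly which package to reinstall.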