clinicalml / TabLLM

AttributeError: Can't pickle local object 'get_linear_schedule_with_warmup.<locals>.lr_lambda' #22

Closed by abdd68 2 months ago

abdd68 commented 5 months ago

Traceback (most recent call last):
  File "/opt/conda/envs/tfew/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/envs/tfew/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/codespace/TABLLM/t-few/src/pl_train.py", line 86, in <module>
    main(config)
  File "/codespace/TABLLM/t-few/src/pl_train.py", line 57, in main
    trainer.fit(model, datamodule)
  File "/opt/conda/envs/tfew/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 741, in fit
    self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
  File "/opt/conda/envs/tfew/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 685, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/opt/conda/envs/tfew/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 777, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/opt/conda/envs/tfew/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1199, in _run
    self._dispatch()
  File "/opt/conda/envs/tfew/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1279, in _dispatch
    self.training_type_plugin.start_training(self)
  File "/opt/conda/envs/tfew/lib/python3.7/site-packages/pytorch_lightning/plugins/training_type/ddp_spawn.py", line 173, in start_training
    self.spawn(self.new_process, trainer, self.mp_queue, return_result=False)
  File "/opt/conda/envs/tfew/lib/python3.7/site-packages/pytorch_lightning/plugins/training_type/ddp_spawn.py", line 201, in spawn
    mp.spawn(self._wrapped_function, args=(function, args, kwargs, return_queue), nprocs=self.num_processes)
  File "/opt/conda/envs/tfew/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/opt/conda/envs/tfew/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 179, in start_processes
    process.start()
  File "/opt/conda/envs/tfew/lib/python3.7/multiprocessing/process.py", line 112, in start
    self._popen = self._Popen(self)
  File "/opt/conda/envs/tfew/lib/python3.7/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/opt/conda/envs/tfew/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/opt/conda/envs/tfew/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/opt/conda/envs/tfew/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/opt/conda/envs/tfew/lib/python3.7/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'get_linear_schedule_with_warmup.<locals>.lr_lambda'

abdd68 commented 5 months ago

Hi @stefanhgm, I was running t-few and encountered this problem. It looks like a pickle issue, but replacing pickle with dill did not fix it. Is there a way to solve it? Thank you!
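
For context, the traceback goes through ddp_spawn, which starts worker processes with the 'spawn' method and therefore pickles the objects it hands to them. The standard pickle module stores functions by qualified name, so a function defined inside another function (such as lr_lambda inside get_linear_schedule_with_warmup) cannot be pickled. The following is a minimal sketch of that behaviour using only the standard library. The names make_closure_schedule, LinearWarmupSchedule, and can_pickle are hypothetical and are not part of t-few, TabLLM, or transformers; the schedule formula is a simplified illustration, not a copy of the library code.

```python
import pickle


def make_closure_schedule(num_warmup_steps, num_training_steps):
    """Mimics the pattern named in the error: lr_lambda is a local function."""
    def lr_lambda(current_step):
        if current_step < num_warmup_steps:
            return current_step / max(1, num_warmup_steps)
        remaining = num_training_steps - current_step
        return max(0.0, remaining / max(1, num_training_steps - num_warmup_steps))
    return lr_lambda


class LinearWarmupSchedule:
    """Hypothetical module-level callable with the same behaviour.

    Instances are picklable because the class can be found by name
    at module level, and the state lives in plain attributes.
    """

    def __init__(self, num_warmup_steps, num_training_steps):
        self.num_warmup_steps = num_warmup_steps
        self.num_training_steps = num_training_steps

    def __call__(self, current_step):
        if current_step < self.num_warmup_steps:
            return current_step / max(1, self.num_warmup_steps)
        remaining = self.num_training_steps - current_step
        span = max(1, self.num_training_steps - self.num_warmup_steps)
        return max(0.0, remaining / span)


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False


closure = make_closure_schedule(10, 100)
schedule = LinearWarmupSchedule(10, 100)
print("closure picklable:", can_pickle(closure))
print("class instance picklable:", can_pickle(schedule))
```

A callable object like this could in principle be passed wherever a plain lr_lambda function is accepted, for example to torch.optim.lr_scheduler.LambdaLR. Whether that change, or avoiding the spawn-based strategy altogether, resolves the error in t-few has not been verified in this thread.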

stefanhgm commented 4 months ago

Hello @abdd68,

Thanks for using TabLLM!

Did you follow all the steps in the README to set up both TabLLM and t-few? Which command did you run when this error occurred?

Unfortunately, I have not encountered this error myself, so it is hard to provide useful feedback.

yitongshang2021 commented 2 months ago

Hi @abdd68! Did you solve the problem? I have an error whose traceback starts the same way as yours.

Traceback (most recent call last):
  File "C:\Users\78166\Anaconda3\envs\tfew\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\78166\Anaconda3\envs\tfew\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\78166\t-few\src\pl_train.py", line 89, in <module>
    main(config)
  File "C:\Users\78166\t-few\src\pl_train.py", line 60, in main
    trainer.fit(model, datamodule)
  File "C:\Users\78166\Anaconda3\envs\tfew\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 741, in fit
    self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
  File "C:\Users\78166\Anaconda3\envs\tfew\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 685, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "C:\Users\78166\Anaconda3\envs\tfew\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 777, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "C:\Users\78166\Anaconda3\envs\tfew\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1199, in _run
    self._dispatch()
  File "C:\Users\78166\Anaconda3\envs\tfew\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1279, in _dispatch
    self.training_type_plugin.start_training(self)
  File "C:\Users\78166\Anaconda3\envs\tfew\lib\site-packages\pytorch_lightning\plugins\training_type\training_type_plugin.py", line 202, in start_training
    self._results = trainer.run_stage()
  File "C:\Users\78166\Anaconda3\envs\tfew\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1289, in run_stage
    return self._run_train()
  File "C:\Users\78166\Anaconda3\envs\tfew\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1306, in _run_train
    self._pre_training_routine()
  File "C:\Users\78166\Anaconda3\envs\tfew\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1301, in _pre_training_routine
    self.call_hook("on_pretrain_routine_start")
  File "C:\Users\78166\Anaconda3\envs\tfew\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1495, in call_hook
    callback_fx(*args, **kwargs)
  File "C:\Users\78166\Anaconda3\envs\tfew\lib\site-packages\pytorch_lightning\trainer\callback_hook.py", line 148, in on_pretrain_routine_start
    callback.on_pretrain_routine_start(self, self.lightning_module)
  File "C:\Users\78166\Anaconda3\envs\tfew\lib\site-packages\pytorch_lightning\callbacks\model_summary.py", line 57, in on_pretrain_routine_start
    summary_data = model_summary._get_summary_data()
  File "C:\Users\78166\Anaconda3\envs\tfew\lib\site-packages\pytorch_lightning\utilities\model_summary.py", line 316, in _get_summary_data
    ("Params", list(map(get_human_readable_count, self.param_nums))),
  File "C:\Users\78166\Anaconda3\envs\tfew\lib\site-packages\pytorch_lightning\utilities\model_summary.py", line 419, in get_human_readable_count
    assert number >= 0

abdd68 commented 2 months ago

The code depends on a web environment that I am not familiar with either. As far as I remember, I used Python code (generated by ChatGPT) to directly generate the tokens of the tabular data described in the TabLLM paper. You can follow the instructions in the paper to generate them directly.

yitongshang2021 commented 2 months ago

Thanks for your message, @abdd68. Have a nice day~