nyuolab / NYUTron

public code repository for paper "Health system scale language models are general purpose clinical prediction engines"

Run "python 4_finetune.py" and "config.json" cannot be found #2

Open LiveJerry opened 7 months ago

LiveJerry commented 7 months ago

This model makes a lot of sense, so I want to see it in action. But when I run "python 4_finetune.py", I get the error "OSError: Can't load config for 'data/pretrain_ckpt/toy_example/checkpoint-1'...". I looked at that directory and it does not contain a config.json file. How can I solve this?

The full output is as follows:

{'data': {'tokenized_data_path': 'data/finetune/toy_readmission/tokenized', 'num_label': 2, 'truncation': True, 'is_split_into_words': False, 'max_length': 512, 'num_train_samples': 10, 'num_eval_samples': 10, 'tokenizer': {'path': 'data/pretrain/tokenizer_small_synthetic_clinical'}}, 'model': {'pretrained': 'synthetic_toy', 'path': 'data/pretrain_ckpt/toy_example/checkpoint-1'}, 'trainer': {'p_eval': 0.5, 'lr': 2e-05, 'num_train_epochs': 2, 'weight_decay': 0, 'save_strategy': 'steps', 'logging_strategy': 'steps', 'logging_steps': 5, 'eval_steps': 50, 'evaluation_strategy': 'steps', 'per_device_train_batch_size': 4, 'per_device_eval_batch_size': 4, 'save_steps': 50, 'save_total_limit': 5, 'metric': ['roc_auc'], 'early_stop': True, 'gradient_accumulation_steps': 4}, 'logger': {'report_to': 'wandb', 'output_dir': 'data/finetune/logs/toy_readmission', 'project': 'toy_readmission', 'run_name': None, 'run_id': None, 'save_dir': None}, 'slurm': {}, 'run': {'seed': 0, 'debug': False}}
Error executing job with overrides: []
Traceback (most recent call last):
  File "/root/anaconda3/envs/nyutron/lib/python3.8/site-packages/transformers/configuration_utils.py", line 601, in _get_config_dict
    resolved_config_file = cached_path(
  File "/root/anaconda3/envs/nyutron/lib/python3.8/site-packages/transformers/utils/hub.py", line 297, in cached_path
    raise EnvironmentError(f"file {url_or_filename} not found")
OSError: file data/pretrain_ckpt/toy_example/checkpoint-1/config.json not found

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "4_finetune.py", line 169, in finetune
    model = AutoModelForSequenceClassification.from_pretrained(
  File "/root/anaconda3/envs/nyutron/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 423, in from_pretrained
    config, kwargs = AutoConfig.from_pretrained(
  File "/root/anaconda3/envs/nyutron/lib/python3.8/site-packages/transformers/models/auto/configuration_auto.py", line 680, in from_pretrained
    config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/root/anaconda3/envs/nyutron/lib/python3.8/site-packages/transformers/configuration_utils.py", line 553, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/root/anaconda3/envs/nyutron/lib/python3.8/site-packages/transformers/configuration_utils.py", line 641, in _get_config_dict
    raise EnvironmentError(
OSError: Can't load config for 'data/pretrain_ckpt/toy_example/checkpoint-1'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'data/pretrain_ckpt/toy_example/checkpoint-1' is the correct path to a directory containing a config.json file

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

quoccuong75 commented 1 month ago

Hi, I also encountered the error "OSError: Can't load config for 'data/pretrain_ckpt/toy_example/checkpoint-1'" when running 4_finetune.py. I read your post, but I'm still not sure how to fix it. Where can I find and download the config.json file? And do I have to change any code to fix the path to the config.json file? I would love to hear from you.
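
For anyone hitting the same error, here is a small diagnostic sketch, not an official fix from the maintainers. In Hugging Face Transformers, config.json is normally written next to the model weights when a checkpoint is saved, so it is usually not a file you download separately. The path in the traceback suggests the checkpoint is expected to come from an earlier pretraining run in this repo, but that is an assumption. The helper name `find_loadable_checkpoints` is hypothetical; the paths are the ones shown in the output above.

```python
from pathlib import Path
from typing import List


def find_loadable_checkpoints(root: str) -> List[Path]:
    """Return every directory under `root` that contains a config.json file."""
    root_path = Path(root)
    if not root_path.is_dir():
        # Nothing has been written under this root yet.
        return []
    return sorted(p.parent for p in root_path.rglob("config.json"))


if __name__ == "__main__":
    # Path taken from 'model.path' in the printed config above.
    expected = Path("data/pretrain_ckpt/toy_example/checkpoint-1")
    print(f"{expected} has config.json: {(expected / 'config.json').is_file()}")

    # List checkpoints that from_pretrained could plausibly load instead.
    for ckpt in find_loadable_checkpoints("data/pretrain_ckpt"):
        print("found checkpoint:", ckpt)
```

If this lists a different directory (for example a checkpoint with another step number), pointing 'model.path' in the finetuning config at that directory may be enough. If it lists nothing, the checkpoint has probably not been created yet on this machine.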