junrong1 / sentiment


what specific files were used in arguments #1

Open jk2809 opened 4 years ago

jk2809 commented 4 years ago

Hey Junrong,

I was trying to run some of your files recently and I am struggling quite a lot to suppress the errors I am getting.

I have downloaded albert_base_v2 from https://github.com/lonePatient/albert_pytorch and I'm running something along these lines:

export BERT_BASE_DIR=/mnt/c/Users/user/Documents/NLP/bert_models/albert_base_v2
export GLUE_DIR=/mnt/c/Users/user/Documents/NLP/sentiment

python3 run_classifier.py \
  --task_name=travel_experience \
  --vocab_file=$BERT_BASE_DIR/30k-clean.vocab \
  --bert_config_file=$BERT_BASE_DIR/config.json \
  --init_checkpoint=$BERT_BASE_DIR/pytorch_model.bin \
  --output_dir=$GLUE_DIR/results_new \
  --data_dir=$GLUE_DIR/data/travel_experience_quick

The first issue is that the vocab_file throws an error saying the dictionary keys are mismatched. If I hack around this by using a different vocab_file, I start getting the following errors:

  File "run_classifier.py", line 369, in main
    model.bert.load_state_dict(torch.load(args.init_checkpoint, map_location='cpu'))
  File "/home/jk2809/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 839, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for BertModel:
    Missing key(s) in state_dict

Followed by a large list of dictionary keys.
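A "Missing key(s)" error of this kind means the parameter names the model expects are not all present in the checkpoint file. As a minimal sketch for narrowing down the mismatch, the two sets of names can be compared directly. The helper `diff_state_dict_keys` and the sample key names below are hypothetical and only illustrate the comparison; they are not taken from this repository or from the ALBERT checkpoint.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists.

    missing:    names the model expects but the checkpoint lacks
    unexpected: names the checkpoint has but the model does not use
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Hypothetical key names, for illustration only.
model_keys = [
    "embeddings.word_embeddings.weight",
    "encoder.layer.0.attention.self.query.weight",
]
checkpoint_keys = [
    "embeddings.word_embeddings.weight",
    "encoder.albert_layer_groups.0.query.weight",
]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)

# With PyTorch installed, and assuming the checkpoint file stores a plain
# state dict, the real inputs would be along the lines of:
#   model_keys = model.bert.state_dict().keys()
#   checkpoint_keys = torch.load(args.init_checkpoint, map_location="cpu").keys()
```

If most names differ in a systematic way (for example, a different prefix or layer naming scheme), that usually suggests the checkpoint was saved from a different model class than the one being loaded.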

Can you point me in the direction of the right files needed to get run_classifier.py working?

Best, Jakub

junrong1 commented 4 years ago

Sorry for the late update.

Here is my configuration.

--model_type albert
--model_name_or_path albert_base_v2
--task_name cola
--data_dir dataset/cola
--max_seq_length 128
--per_gpu_train_batch_size 1
--learning_rate 5e-5
--num_train_epochs 1.0
--logging_steps 134
--save_steps 134
--seed 42
--spm_model_file ./prev_trained_model/albert_base_v2/30k-clean.model

jk2809 commented 4 years ago

Hey Junrong,

Thanks for this. Are those arguments for the run_classifier.py file? Looking through that file, it seems that several of them are unrecognised arguments (for example model_type, model_name_or_path, logging_steps, etc.). In addition, run_classifier.py has a few required arguments that were not included in the list (such as vocab_file, init_checkpoint and bert_config_file). Could you double-check whether run_classifier.py is the correct file?
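A quick way to see which flags a script's parser rejects is argparse's `parse_known_args`, which returns the leftover tokens instead of exiting. The sketch below uses a hypothetical stand-in parser built only from flag names mentioned in this thread; it is not the actual parser in run_classifier.py, and the flags are deliberately left optional so the example runs without exiting.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the parser in run_classifier.py,
    # containing only flags named in this thread.
    parser = argparse.ArgumentParser(allow_abbrev=False)
    for flag in (
        "--task_name",
        "--vocab_file",
        "--bert_config_file",
        "--init_checkpoint",
        "--data_dir",
        "--output_dir",
    ):
        parser.add_argument(flag)
    return parser


def split_flags(parser, argv):
    """Parse argv and return (namespace, list of flags the parser did not recognise)."""
    known, unknown = parser.parse_known_args(argv)
    rejected = [token for token in unknown if token.startswith("--")]
    return known, rejected


# A subset of the configuration posted above.
argv = [
    "--task_name", "cola",
    "--data_dir", "dataset/cola",
    "--model_type", "albert",
    "--model_name_or_path", "albert_base_v2",
    "--logging_steps", "134",
]

known, rejected = split_flags(build_parser(), argv)
print("recognised task_name:", known.task_name)
print("unrecognised flags:", rejected)
```

Running the same check against the real parser would show whether the posted configuration belongs to a different script.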

Best, Jakub

junrong1 commented 4 years ago

I found some differences between my code and the GitHub code. Could you give me your email so I can send it to you directly? Because we cannot set the configuration on the Google Platform, we assign parameters such as model_name_or_path directly. You can also check the running code we uploaded to the Google Platform.

jk2809 commented 4 years ago

Thanks a lot. Feel free to send it to jakub.krol2809@gmail.com.

In terms of assigning parameters directly, that's fine. I just was not able to find the same parameters in your GitHub code.

junrong1 commented 4 years ago

Please check your email.

jk2809 commented 4 years ago

Thanks, that seemed to work, in that I managed to run a test. However, as far as I understand, the new set of files does not include the travel_experience test case; for example, glue.py does not have a preprocessing class for it?

Do you have a complete set of files (i.e. one that combines the travel_experience test case with the version you sent me)? I can spend some time combining the files you sent with this GitHub repo, but it would save me time if you already have something complete prepared. Thanks.

saqbach commented 7 months ago

This is a spam phishing attack. Reported to GitHub Abuse team.