theashworld closed 3 years ago
I also tried with roberta-base and hit the same issue:
Evaluating: 53%|███████████████████████████████████▊ | 25819/49088 [25:25<22:55, 16.92it/s]
Traceback (most recent call last):
  File "cli.py", line 282, in <module>
    main()
  File "cli.py", line 263, in main
    no_distillation=args.no_distillation, seed=args.seed)
  File "/home/qblocks/shan/pet/pet/modeling.py", line 249, in train_pet
    save_unlabeled_logits=not no_distillation, seed=seed)
  File "/home/qblocks/shan/pet/pet/modeling.py", line 355, in train_pet_ensemble
    unlabeled_data=unlabeled_data))
  File "/home/qblocks/shan/pet/pet/modeling.py", line 434, in train_single_model
    results_dict['train_set_before_training'] = evaluate(model, train_data, eval_config)['scores']['acc']
  File "/home/qblocks/shan/pet/pet/modeling.py", line 490, in evaluate
    n_gpu=config.n_gpu, decoding_strategy=config.decoding_strategy, priming=config.priming)
  File "/home/qblocks/shan/pet/pet/wrapper.py", line 376, in eval
    logits = EVALUATION_STEP_FUNCTIONS[self.config.wrapper_type](self)(batch)
  File "/home/qblocks/shan/pet/pet/wrapper.py", line 525, in mlm_eval_step
    return self.preprocessor.pvp.convert_mlm_logits_to_cls_logits(batch['mlm_labels'], outputs[0])
  File "/home/qblocks/shan/pet/pet/pvp.py", line 207, in convert_mlm_logits_to_cls_logits
    masked_logits = logits[mlm_labels >= 0]
RuntimeError: copy_if failed to synchronize: cudaErrorLaunchFailure: unspecified launch failure
Segmentation fault (core dumped)
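CUDA kernels are launched asynchronously, so the Python line named in a traceback like the one above is not necessarily the operation that actually failed. One common way to narrow this down is to re-run the same command with the environment variable `CUDA_LAUNCH_BLOCKING=1`, which makes kernel launches synchronous so the error surfaces at the offending call. The following is only a minimal sketch of that idea: the helper names `blocking_cuda_env` and `rerun_with_blocking` and the script name in the usage comment are hypothetical, and the command to re-run is whatever was originally invoked.

```python
import os
import subprocess
import sys


def blocking_cuda_env(base_env=None):
    """Return a copy of the environment with synchronous CUDA launches enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def rerun_with_blocking(argv):
    """Run `argv` (for example the original cli.py invocation) and return its exit code."""
    completed = subprocess.run(argv, env=blocking_cuda_env())
    return completed.returncode


if __name__ == "__main__":
    # Hypothetical usage: python rerun_blocking.py python3 cli.py <original arguments>
    if len(sys.argv) > 1:
        sys.exit(rerun_with_blocking(sys.argv[1:]))
```

Synchronous launches slow the run down considerably, so this is meant for reproducing a failure rather than for normal training.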
Hi @theashworld, I'm on vacation this week but I'll take a look at this issue early next week.
Hi @theashworld, I have been unable to reproduce this issue so far. Could you check whether it works with a smaller training set, e.g. as follows:
python3 cli.py --method pet --pattern_ids 0 1 2 3 --data_dir MNLI/ --model_type roberta --model_name_or_path roberta-large --task_name mnli --output_dir out2 --do_train --do_eval --train_examples 100 --unlabeled_examples 30000 --split_examples_evenly
Welcome back from vacation. It looks like the machine had issues; I tried on another machine and it seems to be progressing fine. Closing for now, will reopen if needed. Thanks!
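When a failure turns out to be machine-specific, a quick check is to run the operation from the failing line (`logits[mlm_labels >= 0]`) on both CPU and GPU and compare the results. The sketch below is illustrative only and is not code from the repository. It assumes PyTorch is installed, that `logits` has shape (batch, sequence, vocabulary), and that non-mask positions in `mlm_labels` carry a negative value such as -1. The function names and tensor sizes are made up for the example.

```python
import torch


def select_masked_logits(logits, mlm_labels):
    """Keep only the vocabulary logits at positions whose label is >= 0."""
    return logits[mlm_labels >= 0]


def check_device_agreement(batch=4, seq_len=16, vocab=50, seed=0):
    """Compare the CPU result with the GPU result; returns None if no GPU is visible."""
    gen = torch.Generator().manual_seed(seed)
    logits = torch.randn(batch, seq_len, vocab, generator=gen)
    # Assumption: -1 marks ordinary tokens and 1 marks the single mask token.
    mlm_labels = torch.full((batch, seq_len), -1, dtype=torch.long)
    mlm_labels[:, 3] = 1
    cpu_out = select_masked_logits(logits, mlm_labels)
    if not torch.cuda.is_available():
        return None
    gpu_out = select_masked_logits(logits.cuda(), mlm_labels.cuda()).cpu()
    return torch.allclose(cpu_out, gpu_out)
```

If such a small check crashes or disagrees on one machine but passes on another, that points toward the driver or hardware rather than the training code.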
Fresh directory, synced the repo, and ran the pip install from requirements.txt.
command line
Error: