princeton-nlp / SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
MIT License

Problem when running run_unsup_example.sh #178

Closed Linda230 closed 2 years ago

Linda230 commented 2 years ago

Hi, I encountered this issue when I ran "run_unsup_example.sh" to train the model. It has been stuck in this state for about 20 hours, so I would like to ask: what is the total training time for the unsupervised SimCSE model? Or is there something wrong with my code or device?

```
bash run_unsup_example.sh
05/31/2022 20:10:03 - INFO - main - PyTorch: setting up devices
05/31/2022 20:10:03 - WARNING - main - Process rank: -1, device: cuda:0, n_gpu: 1 distributed training: False, 16-bits training: True
05/31/2022 20:10:03 - INFO - main - Training/evaluation parameters OurTrainingArguments(output_dir='result/ConSERT', overwrite_output_dir=True, do_train=True, do_eval=True, do_predict=False, evaluation_strategy=<EvaluationStrategy.STEPS: 'steps'>, prediction_loss_only=False, per_device_train_batch_size=64, per_device_eval_batch_size=8, per_gpu_train_batch_size=None, per_gpu_eval_batch_size=None, gradient_accumulation_steps=1, eval_accumulation_steps=None, learning_rate=3e-05, weight_decay=0.0, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, max_grad_norm=1.0, num_train_epochs=1.0, max_steps=-1, lr_scheduler_type=<SchedulerType.LINEAR: 'linear'>, warmup_steps=0, logging_dir='runs\May31_20-10-03_DESKTOP-5IA40JT', logging_first_step=False, logging_steps=500, save_steps=500, save_total_limit=None, no_cuda=False, seed=42, fp16=True, fp16_opt_level='O1', fp16_backend='auto', local_rank=-1, tpu_num_cores=None, tpu_metrics_debug=False, debug=False, dataloader_drop_last=False, eval_steps=200, dataloader_num_workers=0, past_index=-1, run_name='result/ConSERT', disable_tqdm=False, remove_unused_columns=True, label_names=None, load_best_model_at_end=True, metric_for_best_model='stsb_spearman', greater_is_better=True, ignore_data_skip=False, sharded_ddp=False, deepspeed=None, label_smoothing_factor=0.0, adafactor=False, eval_transfer=False)
Using custom data configuration default
```

Linda230 commented 2 years ago

I have solved this problem.

zhaoqinggang commented 1 year ago

How did you solve it? I have also encountered a similar problem.

Linda230 commented 1 year ago

Hi, I just changed the line `datasets = load_dataset(extension, data_files=data_files, cache_dir="./data/")` at line 311 of train.py to `datasets = load_dataset(extension, data_files=data_files)`, and it works now. Hope it works for you too!

sadimanna commented 12 months ago

It works fine on Ubuntu for me, but on Windows, removing the '.' from `cache_dir="./data/"` does the job for me.
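A more portable variant of the workarounds above, sketched under the assumption that the call site in train.py looks like the one quoted earlier: resolve the relative cache path to an absolute, OS-native one before passing it to `load_dataset`, so differences in how Linux and Windows handle relative paths no longer matter. The helper name `resolve_cache_dir` is hypothetical, not part of the SimCSE code.

```python
from pathlib import Path

def resolve_cache_dir(raw: str = "./data/") -> str:
    """Turn a relative cache path into an absolute, OS-native one.

    Relative cache_dir values like "./data/" can hang or fail on
    Windows while working on Linux; an absolute path works on both.
    """
    return str(Path(raw).expanduser().resolve())

# Hypothetical use at the train.py call site (datasets library assumed):
# datasets = load_dataset(extension, data_files=data_files,
#                         cache_dir=resolve_cache_dir("./data/"))
```

This keeps the on-disk cache location the author intended instead of silently falling back to the default HuggingFace cache, which is what dropping `cache_dir` entirely does.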