megagonlabs / ditto

Code for the paper "Deep Entity Matching with Pre-trained Language Models"
Apache License 2.0
259 stars 89 forks

training #9

Open cvsekhar opened 4 years ago

cvsekhar commented 4 years ago

I was trying to execute the training code on a CPU with the following hyperparameters:

python train_ditto.py \
  --task Structured/Beer \
  --batch_size 64 \
  --max_len 64 \
  --lr 3e-5 \
  --n_epochs 5 \
  --finetuning \
  --lm distilbert \
  --da del \
  --dk product \
  --save_model \
  --summarize

Somehow the dev_f1 score is zero, the accuracy is stuck at 0.84, and the epoch counter is not increased, so training keeps looping because of the `while epoch <= hp.n_epochs` condition in mixda.

Is there something I am missing, or is it going into an infinite loop?
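For reference, the symptom described above matches a training loop whose epoch counter is never advanced. A minimal standalone sketch of the failure mode (the names `hp` and `n_epochs` follow the issue description; this is not Ditto's actual code):

```python
# Sketch of the suspected bug: with `while epoch <= hp.n_epochs`,
# forgetting to increment `epoch` inside the loop makes it run forever.

class HP:
    """Hypothetical hyperparameter container, as in the issue description."""
    n_epochs = 5

def train(hp):
    epoch = 1
    completed = []
    while epoch <= hp.n_epochs:
        completed.append(epoch)  # one training pass per epoch
        epoch += 1               # without this line, the loop never terminates
    return completed

print(train(HP()))  # -> [1, 2, 3, 4, 5]
```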

cvsekhar commented 4 years ago
(Screenshot attached: 2020-10-20 184020)
oi02lyl commented 4 years ago

I see. I will fix the infinite-loop bug. The hyperparameters are also not ideal for this dataset (I will update the README with a better set). Meanwhile, you can try this one:

CUDA_VISIBLE_DEVICES=0 python train_ditto.py \
  --task Structured/Beer \
  --batch_size 32 \
  --max_len 128 \
  --lr 3e-5 \
  --n_epochs 40 \
  --finetuning \
  --lm roberta \
  --fp16 \
  --da drop_col

which should work better.

cvsekhar commented 4 years ago

I had this same error before as well, but forgot to capture it:

/data/home/vijaya.chennupati/.conda/envs/txtclass/lib/python3.8/site-packages/sklearn/metrics/_classification.py:1221: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
accuracy=0.846 precision=0.000 recall=0.000 f1=0.000
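That warning comes from scikit-learn: when the model predicts no positive samples, precision is undefined and is reported as 0.0. A small standalone reproduction (not Ditto code) showing the `zero_division` parameter that controls this behavior:

```python
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [0, 1, 1, 0]
y_pred = [0, 0, 0, 0]  # model predicts no positives -> precision is undefined

# By default sklearn emits UndefinedMetricWarning and reports 0.0;
# passing zero_division=0 sets the value explicitly and silences the warning.
p = precision_score(y_true, y_pred, zero_division=0)
r = recall_score(y_true, y_pred, zero_division=0)
f = f1_score(y_true, y_pred, zero_division=0)
print(p, r, f)  # -> 0.0 0.0 0.0
```

This explains the `precision=0.000 recall=0.000 f1=0.000` line: the model never predicts the positive class, so all three metrics collapse to zero even though accuracy looks high.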

cvsekhar commented 4 years ago

I am using a CPU, so I removed the --fp16 parameter. No luck; it still runs into the infinite loop.

(Screenshot attached: 2020-10-20 190055)
oi02lyl commented 4 years ago

I see. I tried the CPU version and got the same error. It seems to be related to data augmentation, but I am not sure. I tried the baseline version (no DA) and it works fine on CPU:

CUDA_VISIBLE_DEVICES= python train_ditto.py \
  --task Structured/Beer \
  --batch_size 32 \
  --max_len 128 \
  --lr 3e-5 \
  --n_epochs 40 \
  --finetuning \
  --lm distilbert
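Setting `CUDA_VISIBLE_DEVICES=` to an empty value hides all GPUs from the process, so PyTorch code that picks its device with the usual pattern falls back to CPU (and `--fp16` should be left off, since mixed-precision training targets GPUs). A generic sketch of the pattern, not Ditto's actual code:

```python
import os

# Must be set before CUDA is initialized; an empty value hides all GPUs.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

import torch

# Standard device-selection pattern: with no visible GPUs,
# torch.cuda.is_available() returns False and we fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)  # -> cpu
```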

You might also use this Colab notebook to run it on a GPU.

cvsekhar commented 4 years ago

Will give it a try.

Freedomeri commented 2 years ago

Hello, does the latest version fix this bug? I also have this problem when I use a CPU to train my model...