Georgetown-IR-Lab / cedr

Code for CEDR: Contextualized Embeddings for Document Ranking, accepted at SIGIR 2019.
MIT License

Train cedrpacrr using 5 folds BERT checkpoint, train_pairs and valid_run #41

Open Pourbahman opened 2 years ago

Pourbahman commented 2 years ago

Hi Sean,

I want to train cedr_pacrr from a BERT checkpoint with the following command:

# --model may also be cedr_knrm or cedr_drmm
python train.py \
  --model cedr_pacrr \
  --datafiles data/queries.tsv data/documents.tsv \
  --qrels data/qrels \
  --train_pairs data/train_pairs \
  --valid_run data/valid_run \
  --initial_bert_weights models/vbert/weights.p \
  --model_out_dir models/cedrpacrr

As you know, the training and validation data in your data directory are split into 5 folds, and the BERT checkpoint weights also come in 5 folds.

Could you please advise which train_pairs, valid_run, and weights files I should use?

Thanks in advance, Kind Regards, Zahra

yiyaxiaozhi commented 2 years ago

I used the following command to train a CEDR-KNRM model with the fold 1 data on the Robust04 dataset:

python train.py \
  --model cedr_knrm \
  --datafiles ../data/robust/queries.tsv ../data/robust/documents.tsv \
  --qrels ../data/robust/qrels \
  --train_pairs ../data/robust/f1.train.pairs \
  --valid_run ../data/robust/f1.valid.run \
  --model_out_dir models/vbert1/cedr_knrm \
  --initial_bert_weights /models/cedr-models/vbert-robust-f1.p

Here, train_pairs points to the fold 1 training data, and initial_bert_weights should be the Vanilla BERT model trained on the same fold (fold 1).
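The fold 1 command above could be repeated for the other folds. A minimal sketch follows, assuming folds 2 to 5 follow the same file naming as fold 1 (f{N}.train.pairs, f{N}.valid.run, vbert-robust-f{N}.p); that naming, the fold_command helper, and its default directories are assumptions for illustration, not something confirmed in this thread. The script only prints the commands so they can be checked before running anything.

```python
import shlex


def fold_command(fold, data_dir="../data/robust", weights_dir="/models/cedr-models"):
    """Build the train.py argument list for one fold.

    The per-fold file names are assumed to mirror the fold 1 names
    used in the command above; adjust them to match your own files.
    """
    return [
        "python", "train.py",
        "--model", "cedr_knrm",
        "--datafiles", f"{data_dir}/queries.tsv", f"{data_dir}/documents.tsv",
        "--qrels", f"{data_dir}/qrels",
        "--train_pairs", f"{data_dir}/f{fold}.train.pairs",
        "--valid_run", f"{data_dir}/f{fold}.valid.run",
        "--model_out_dir", f"models/vbert{fold}/cedr_knrm",
        "--initial_bert_weights", f"{weights_dir}/vbert-robust-f{fold}.p",
    ]


# One command per fold; each fold pairs its own train/valid files
# with the Vanilla BERT weights trained on that same fold.
commands = [fold_command(fold) for fold in range(1, 6)]

for cmd in commands:
    print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or passed to subprocess.run(cmd, check=True) once the paths have been verified.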

Pourbahman commented 2 years ago

Thank you!

  1. So you averaged each metric over the 5 folds manually, is that right?

  2. Could you please tell me whether you froze the BERT layers when training the model? In other words, did you set the BERT layers to be non-trainable? If so, how did you do it?

  3. Also, were you able to reproduce the results of the paper? If so, did you change any part of the implementation or any parameter settings in the repository?

  4. Finally, what hardware configuration did you train the model on?
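For reference on question 1, averaging a metric over folds by hand could look like the sketch below. The average_over_folds helper and the metric names are hypothetical, and the scores are dummy placeholders rather than results from any run.

```python
from statistics import mean


def average_over_folds(per_fold):
    """Average each metric across folds.

    per_fold is a list with one {metric_name: value} dict per fold;
    every dict is expected to contain the same metric names.
    """
    metrics = per_fold[0].keys()
    return {m: mean(fold[m] for fold in per_fold) for m in metrics}


# Dummy placeholder scores (NOT real results), one dict per fold.
per_fold_scores = [
    {"P@20": 0.1, "nDCG@20": 0.5},
    {"P@20": 0.2, "nDCG@20": 0.5},
    {"P@20": 0.3, "nDCG@20": 0.5},
    {"P@20": 0.4, "nDCG@20": 0.5},
    {"P@20": 0.5, "nDCG@20": 0.5},
]

averaged = average_over_folds(per_fold_scores)
for name, value in averaged.items():
    print(f"{name}: {value:.4f}")
```

Another common option is to concatenate the per-fold test runs and evaluate once over all queries; which of the two was used for the paper is not stated in this thread.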

Thanks in advance, Kind Regards

seanmacavaney commented 2 years ago

Hi @Pourbahman,

I recommend using a package like OpenNIR or Capreolus. This repository was meant as a simplified demonstration of the main idea rather than a comprehensive system for running these kinds of experiments, tuning, etc. The original experiments in the paper were conducted on a precursor to OpenNIR.

To answer your questions:

Pourbahman commented 2 years ago

Hi Sean,

Thanks for your complete answer :)

Kind Regards, Zahra