Hello,

First, thank you for your great work on this task; I gained many insights from this project. I'm just wondering: does the genre-kilt model on Hugging Face differ from the model in this repository? If so, how are they different?

I have a custom document retrieval dataset in KILT format. How can I fine-tune the Hugging Face model on it? I would like to use the Trainer API from Hugging Face. Could you give me some guidance?
I also tried fine-tuning with this repository's script; my version is below.
However, I found that the training loss keeps decreasing while the evaluation loss increases. I used the Natural Questions KILT train and dev sets. Is this because of overfitting?

Thank you again for your effort on this project.
```bash
#!/bin/bash
# Copyright (c) Facebook, Inc. and its affiliates.
# All rights reserved.
#
# This source code is licensed under the license found in the
# LICENSE file in the root directory of this source tree.

DATASET=$1
NAME=$2

DATASET=/userhomes/sangryul/project/contrastive-retrieval/GENRE/data_fair
BASED_MODEL=/userhomes/sangryul/project/contrastive-retrieval/GENRE/models/fairseq_wikipage_retrieval
NAME=nq_100_finetune
STEP=10000

fairseq-train $DATASET/bin/ \
    --wandb-project multiperspective \
    --no-epoch-checkpoints \
    --keep-best-checkpoints 1 \
    --save-dir /userhomes/sangryul/project/contrastive-retrieval/GENRE/models/$NAME \
    --restore-file $BASED_MODEL/model.pt \
    --arch bart_large \
    --task translation \
    --criterion label_smoothed_cross_entropy \
    --source-lang source \
    --target-lang target \
    --truncate-source \
    --label-smoothing 0.1 \
    --max-tokens 1024 \
    --update-freq 1 \
    --max-update $STEP \
    --required-batch-size-multiple 1 \
    --dropout 0.1 \
    --attention-dropout 0.1 \
    --relu-dropout 0.0 \
    --weight-decay 0.01 \
    --optimizer adam \
    --adam-betas "(0.9, 0.999)" \
    --adam-eps 1e-08 \
    --clip-norm 0.1 \
    --lr-scheduler polynomial_decay \
    --lr 3e-05 \
    --total-num-update $STEP \
    --warmup-updates 500 \
    --num-workers 20 \
    --share-all-embeddings \
    --layernorm-embedding \
    --share-decoder-input-output-embed \
    --skip-invalid-size-inputs-valid-test \
    --log-format json \
    --log-interval 10 \
    --patience 200
```
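In case it matters, the binarized data in `$DATASET/bin/` was produced roughly like this (a sketch of my setup; the `train.bpe` / `dev.bpe` file prefixes and the dictionary paths are placeholders, and the inputs are assumed to already be BPE-encoded with the BART vocabulary):

```shell
# Binarize parallel source/target files for fairseq-train,
# reusing the dictionaries shipped with the pretrained model.
fairseq-preprocess \
    --source-lang source --target-lang target \
    --trainpref $DATASET/train.bpe \
    --validpref $DATASET/dev.bpe \
    --destdir $DATASET/bin/ \
    --srcdict $BASED_MODEL/dict.source.txt \
    --tgtdict $BASED_MODEL/dict.target.txt \
    --workers 20
```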