AI4Bharat / IndicXlit

Transliteration models for 21 Indic languages
https://ai4bharat.iitm.ac.in/transliteration
MIT License

Unable to run the fine-tuning code for Indic-to-English (it works for English-to-Indic) #18

Open Gautam-Rajeev opened 1 year ago

Gautam-Rajeev commented 1 year ago

We were running into the following error while trying to run the fine-tuning code for Indic-to-English:

    size mismatch for decoder.embed_tokens.weight: copying a param with shape torch.Size([54, 256]) from checkpoint, the shape in current model is torch.Size([34, 256]).
    size mismatch for decoder.output_projection.weight: copying a param with shape torch.Size([54, 256]) from checkpoint, the shape in current model is torch.Size([34, 256]).

I have linked the modified notebook we used to run this.

The same notebook works for English-to-Indic fine-tuning.
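
For reference, the shapes in the error mean the decoder (target-side) vocabulary built from the fine-tuning data has 34 entries while the indic-en checkpoint was trained with 54, so the embedding matrices cannot be copied. A minimal diagnostic sketch, assuming the paths below are placeholders for your binarized data and the dictionaries shipped with the released model:

# Sketch only -- paths and dictionary file names are placeholders, adjust to your setup.
# Compare the dictionary fairseq-preprocess produced for the fine-tuning corpus...
wc -l corpus-bin/dict.en.txt
# ...against the dictionary distributed with the pretrained indic-en model.
wc -l /path/to/indicxlit-indic-en-v1.0/corpus-bin/dict.en.txt
# fairseq adds special symbols (<s>, <pad>, </s>, <unk>) on top of these counts,
# and the multilingual task may also append one token per language in --lang-dict,
# so both the dictionaries and the language list must match the ones used to
# train the checkpoint for it to load without a size mismatch.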

GokulNC-Sarvam commented 7 months ago

It seems to be working for me (I was fine-tuning it for a new language).

Config:

# Limit CPU math-library threading and pin training to a single GPU
export OPENBLAS_NUM_THREADS=1
export NUMEXPR_MAX_THREADS=1
export CUDA_VISIBLE_DEVICES=0

# Fine-tune the released indic-en checkpoint (restored via --restore-file below) on the new bho-en pair
fairseq-train /home/gokul/IndicXlit/data/romanization/corpus-bin \
    --save-dir checkpoints/bho-rom \
    --arch transformer --layernorm-embedding \
    --task translation_multi_simple_epoch \
    --sampling-method "temperature" \
    --sampling-temperature 1.5 \
    --encoder-langtok "src" \
    --lang-dict /home/gokul/IndicXlit/app/ai4bharat/transliteration/transformer/models/indic2en/lang_list_new.txt \
    --lang-pairs bho-en \
    --decoder-normalize-before --encoder-normalize-before \
    --activation-fn gelu --adam-betas "(0.9, 0.98)"  \
    --batch-size 512 \
    --decoder-attention-heads 4 --decoder-embed-dim 256 --decoder-ffn-embed-dim 1024 --decoder-layers 6 \
    --dropout 0.5 \
    --encoder-attention-heads 4 --encoder-embed-dim 256 --encoder-ffn-embed-dim 1024 --encoder-layers 6 \
    --lr 0.00003 --lr-scheduler inverse_sqrt \
    --max-epoch 40 \
    --optimizer adam  \
    --num-workers 0 \
    --warmup-init-lr 0 --warmup-updates 200 \
    --skip-invalid-size-inputs-valid-test \
    --keep-last-epochs 5 \
    --save-interval 5 \
    --keep-best-checkpoints 1 \
    --distributed-world-size 1 \
    --patience 10 \
    --restore-file /home/gokul/IndicXlit/checkpoints/indicxlit-indic-en-v1.0/transformer/indicxlit.pt \
    --reset-lr-scheduler \
    --reset-meters \
    --reset-dataloader \
    --reset-optimizer
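
If you still hit the size-mismatch error with a config like the one above, the usual cause is that the fine-tuning corpus was binarized with freshly built dictionaries instead of the ones the released checkpoint was trained with. A hedged sketch of the preprocessing step that reuses the pretrained model's dictionaries (all paths and dictionary file names below are placeholders, not the repository's actual layout):

# Sketch only -- point --srcdict/--tgtdict at the dict files distributed with
# the indic-en model so the binarized data uses exactly the checkpoint's vocabulary.
fairseq-preprocess \
    --source-lang bho --target-lang en \
    --trainpref data/train --validpref data/valid \
    --srcdict /path/to/indicxlit-indic-en-v1.0/corpus-bin/dict.src.txt \
    --tgtdict /path/to/indicxlit-indic-en-v1.0/corpus-bin/dict.tgt.txt \
    --destdir corpus-bin \
    --workers 4
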
ahsanalidev commented 2 months ago

I am also getting the same error. Please help.

ahsanalidev commented 2 months ago

@GautamR-Samagra Were you able to find a solution?