yzhangcs / parser

:rocket: State-of-the-art parsers for natural language.
https://parser.yzhang.site/
MIT License

Constituency parser training hangs at the "Building the model" step #144

Open Wolfingten opened 4 days ago

When I train a custom constituency parser, training starts but then hangs at the "Building the model" step for hours or even days.

Steps taken

Install the current version with pip install git+https://github.com/yzhangcs/parser, then start training the constituency parser with:

python -u -m supar.cmds.const.crf train -b -d 0 -c con-crf-roberta-en -p model  \
    --train /data/tut/train/train_trees.txt \
    --dev /data/tut/valid_trees.txt \
    --test /data/tut/test_trees.txt  \
    --encoder=bert  \
    --bert=xlm-roberta-large  \
    --lr=5e-5  \
    --lr-rate=20  \
    --epochs=10  \
    --update-steps=4 \
    --wandb

Output of the logger:

[2024-11-07 21:19:33 INFO]
amp: false
bert: xlm-roberta-large
bert_pooling: mean
binarize: false
buckets: 32
build: true
cache: false
checkpoint: false
clip: 5.0
dev: /data/tut/valid_trees.txt
device: '0'
dist: ddp
embed: glove-6b-100
encoder: bert
encoder_dropout: 0.1
epochs: 10
feat: null
fix_len: 20
implicit: false
lr: 5.0e-05
lr_rate: 20
max_len: null
mbr: false
min_freq: 2
mix_dropout: 0.0
mlp_dropout: 0.33
mode: train
n_bert_layers: 4
n_label_mlp: 100
n_span_mlp: 500
path: model
seed: 1
test: /data/tut/test_trees.txt
threads: 16
train: /data/tut/train_trees.txt
update_steps: 4
wandb: true
warmup: 0.1
workers: 0

[2024-11-07 21:19:33 INFO] Building the fields
[2024-11-07 21:20:27 INFO] Tree(
 (words): SubwordField(vocab_size=250002, pad=<pad>, unk=<unk>, bos=<s>, eos=</s>)
 (trees): RawField()
 (charts): ChartField(vocab_size=2)
)
[2024-11-07 21:20:27 INFO] Building the model

Training never proceeds past this step, even when the process is left running for up to two days. No error is raised; I eventually have to stop the process manually.
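
One way to see where the process is stuck, sketched here with Python's standard `faulthandler` module: arm a watchdog that periodically writes the stack of every thread to a file. The helper names `arm_hang_watchdog` and `disarm_hang_watchdog` are hypothetical and not part of supar; they would have to be called from a small launcher script before the training entry point runs.

```python
import faulthandler
import sys


def arm_hang_watchdog(timeout_s=600.0, out=sys.stderr):
    """Dump the stack of every thread to `out` every `timeout_s` seconds.

    `out` must be a real file object with a file descriptor, and it has
    to stay open for as long as the watchdog is armed.
    """
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=out)


def disarm_hang_watchdog():
    """Cancel the pending dumps once the slow step has finished."""
    faulthandler.cancel_dump_traceback_later()
```

If the same frame shows up in every dump, that frame is a good candidate for the place where the "Building the model" step is blocked.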

I am trying to train a model for Italian; this is what the training data looks like:

(S (VP (VMA~PA Bruciata) (NP-EXTPSBJ-233 (NP (ART~DE la) (NOU~CS sede)) (PP (PREP del) (NP (ART~DE del) (NOU~CS partito) (ADJ~QU democratico)))) (CONJ mentre) (S (NP-SBJ (ART~DE i) (NOU~CP reparti) (ADJ~QU antisommossa)) (NP (PRO~RI si)) (VP (VAU~RE sono) (VP (VMA~PA ritirati) (PP-LOC (PREP dalla) (NP (ART~DE dalla) (NOU~CA citta'))))))) (NP-SBJ (-NONE- *-233)) (. .))
(S (NP-SBJ (NOU~PR Valona)) (VP (-NONE- *) (PP-PRD (PREP IN_MANO_A) (NP (ART~DE ai) (NOU~CP dimostranti)))) (. .))
(S (VP (VMA~RE Slitta) (PP-LOC (PREP a) (NP (NOU~PR Tirana))) (NP-EXTPSBJ-433 (NP (ART~DE la) (NOU~CS decisione)) (PP (PREP sullo) (NP (NP (ART~DE sullo) (NOU~CS stato)) (PP (PREP di) (NP (NOU~CS emergenza))))))) (NP-SBJ (-NONE- *-433)) (. .))
(S (NP-SBJ (NP (ART~DE Il) (NOU~CS Governo)) (PP (PREP di) (NP (NOU~PR Berisha)))) (VP (VMA~RE appare) (PP-PRD (PREP in) (NP (NOU~CA difficolta')))) (. .))
(S (VP (VMA~RE Ha) (NP (ART~IN un) (NOU~CS nome) (ADJ~QU dolce)) (NP-EXTPSBJ-533 (NP (ART~DE l') (ADJ~OR ultima) (NOU~CS tappa)) (PP-LOC (PREP prima) (PP (PREP di) (S (VP (VMA~IN entrare) (PP-LOC (PREP a) (NP (NOU~PR Valona)))) (NP-SBJ (-NONE- *))))))) (NP-SBJ (-NONE- *-533)) (. .))
(NP (ART~DE La) (NP (NP (NOU~CS Collina)) (PP (PREP degli) (NP (ART~DE degli) (NOU~CP Ulivi))) (, ,) (NP (PP (PREP in) (NP (NOU~CS albanese))) (NOU~PR Chafa) (NOU~PR Kushojviz))) (. .))
(S (S (NP-SBJ (PRO~DE Questa)) (VP (VMA~RE e') (ADVP-TMP (ADVB ancora)) (' ') (NP-PRD (NP (NOU~CS terra)) (PP (PREP di) (NP (PRO~ID nessuno)))) (' '))) (, ,) (S (ADVP-LOC (ADVB sotto)) (PRN (, ,) (PP (PREP sulla) (NP (ART~DE sulla) (NP (NOU~CS discesa)) (SBAR (NP-1333 (PRO~RE che)) (S (NP-SBJ (-NONE- *-1333)) (VP (VMA~RE schiude) (NP (ART~DE l') (NOU~CS orizzonte)) (PP (PREP al) (NP (ART~DE al) (NOU~CS mare)))))))) (, ,)) (NP-LOC (PRO~LO ci)) (VP (VMA~RE sono) (NP-EXTPSBJ- (NP-2333 (NOU~CP barricate)) (, ,) (NP (NP (NOU~CP bagliori)) (PP (PREP di) (NP (NOU~CP fuochi))) (CONJ e) (NP (NOU~CS gente) (ADJ~QU inferocita))))) (NP-SBJ (-NONE- *-2333))) (. .))
(S (S (NP-SBJ (NOU~PR Valona)) (VP (VMA~RE e') (PP-PRD (PREP IN_MANO_A) (NP (ART~DE alla) (NOU~CS folla))) (, ,) (PP (PREP fuori) (PP (PREP da) (NP (ADJ~IN ogni) (NOU~CS controllo)))))) (, ,) (S (NP-SBJ-1333 (ART~DE la) (NOU~CS polizia)) (VP (VAU~RE e') (VP (VMA~PA scomparsa) (, ,) (S-PRD (VP (VMA~PA rintanata) (PP (PREP nelle) (NP (ART~DE nelle) (NOU~CP caserme)))) (NP-SBJ (-NONE- *-1333)))))) (. .))
(S (NP-SBJ (NP (ART~DE Lo) (NOU~CS stato)) (PP (PREP di) (NP (NOU~CS emergenza)))) (VP (VMA~RE e') (PP-LOC (PREP nei) (NP (ART~DE nei) (NOU~CP fatti)))) (. .))
(S (S (NP-SBJ (NP (ART~DE I) (NOU~CP posti)) (PP (PREP di) (NP (NOU~CS blocco)))) (VP (VMA~RE cominciano) (PP (PREP ad) (NP (NP (ADVB appena) (NUMR 60) (NOU~CP-933 chilometri)) (PP (PREP dalla) (NP (ART~DE dalla) (NOU~CS capitale))))))) (CONJ e) (S (PP (PREP negli) (NP (NP (NUMR 80)) (NP (ART~DE negli) (ADJ~DI altri) (-NONE- *-933)) (PP (PREP di) (NP (NP (NOU~CP strade) (ADJ~QU infami)) (PP (PREP fra) (NP (NOU~PR Tirana) (CONJ e) (NP (NP (ART~DE il) (NOU~CS porto)) (PP (PREP sull') (NP (ART~DE sull') (NOU~PR Adriatico)))))))))) (NP (PRO~RI si)) (VP (VMA~RE toccano) (NP (NP (PRDT tutte) (ART~DE le) (NOU~CP stazioni)) (PP (PREP della) (NP (ART~DE della) (NOU~CS rivolta) (ADJ~QU albanese)))))) (. .))
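
To rule out malformed input, a minimal sanity check over bracketed lines like the ones above could look as follows. It is only a sketch: `check_tree_line` is a hypothetical helper, it assumes one tree per line and no literal brackets inside tokens, and nothing here establishes that the data is the cause of the hang.

```python
import re

# A preterminal is "(TAG word)" with no nested brackets inside it.
LEAF = re.compile(r"\(([^\s()]+)\s+([^\s()]+)\)")


def check_tree_line(line):
    """Report bracket balance, leaf count and trace count for one tree."""
    depth = 0
    balanced = True
    for ch in line:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a closing bracket with no matching opener
                balanced = False
                break
    balanced = balanced and depth == 0
    leaves = LEAF.findall(line)
    return {
        "balanced": balanced,
        "n_leaves": len(leaves),
        # Empty elements are tagged -NONE- in this treebank.
        "n_traces": sum(1 for tag, _ in leaves if tag == "-NONE-"),
    }
```

Running this over every line of the training file and looking at the largest `n_leaves` values, or at any line where `balanced` is false, gives a quick picture of unusual trees.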

Any help or pointers on how to debug this issue are greatly appreciated.
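
Since the log stops right after the fields are built, it may also be worth timing how long the pretrained encoder takes to load on its own, outside of supar. The sketch below assumes the `transformers` package is installed; `timed` and `load_encoder` are hypothetical helper names, and whether the encoder load is related to the hang at all is an assumption.

```python
import time


def timed(label, fn):
    """Run fn(), print how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.1f}s")
    return result, elapsed


def load_encoder(name="xlm-roberta-large"):
    """Load tokenizer and weights directly; assumes `transformers` is installed."""
    from transformers import AutoModel, AutoTokenizer  # imported lazily

    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name)
    return tokenizer, model


# Example (downloads the checkpoint on first use):
# (tokenizer, model), seconds = timed("load xlm-roberta-large", load_encoder)
```

If this standalone load finishes quickly, the delay is more likely to come from a later part of model construction than from fetching or reading the checkpoint.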