Thanks for the codebase! I would like to ask whether the newer implementation of the R2Gen model is included in this codebase.

Hello,

While the R2Gen architecture is not directly implemented in ViLMedic, you can obtain comparable results with this simple baseline:
```bash
python bin/train.py config/RRG/baseline-mimic.yml \
    dataset.seq.processing=ifcc_clean_report \
    dataset.image.root=data/RRG/mimic-cxr/findings/ \
    dataset.seq.root=data/RRG/mimic-cxr/findings/ \
    dataset.seq.file=findings.tok \
    dataset.seq.tokenizer_max_len=128 \
    dataset.image.file=image.tok \
    dataset.image.image_path=data/images/ \
    dataset.image.multi_image=3 \
    model.cnn.backbone=densenet121 \
    model.cnn.visual_projection.in_features=1024 \
    model.cnn.visual_projection.out_features=768 \
    trainor.batch_size=16 \
    trainor.grad_accu=8 \
    trainor.optim_params.lr=0.0003 \
    trainor.optimizer=Adam \
    trainor.early_stop_metric=bertscore \
    trainor.early_stop=10 \
    validator.batch_size=8 \
    validator.beam_width=2 \
    validator.metrics='[bertscore]' \
    validator.splits='[validate]' \
    ckpt_dir=ckpt \
    name=nll_findings_bertscore_128
```
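For context, the dotted `key=value` arguments override entries in the YAML config. A minimal sketch of that behaviour, assuming OmegaConf-style dotlist merging (the exact loading logic lives in `bin/train.py`, so treat this as an illustration only):

```python
# Illustration only: assumes the dotted CLI arguments act like OmegaConf
# dotlist overrides merged on top of config/RRG/baseline-mimic.yml.
from omegaconf import OmegaConf

base = OmegaConf.load("config/RRG/baseline-mimic.yml")
overrides = OmegaConf.from_dotlist([
    "model.cnn.backbone=densenet121",
    "trainor.batch_size=16",
    "validator.metrics=[bertscore]",
])
config = OmegaConf.merge(base, overrides)  # CLI overrides win over YAML defaults
print(OmegaConf.to_yaml(config.model.cnn))
```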
You can generate the `.tok` files using the scripts in `data/make_datasets/`.
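In case it helps, here is a rough sketch of the layout those files are expected to have, assuming the loaders read one study per line, with the report text in `findings.tok` and the corresponding image paths (relative to `dataset.image.image_path`) on the matching line of `image.tok`. The delimiter, the split-prefixed file names, and the cleaning steps below are assumptions, so please check the actual scripts:

```python
# Hypothetical helper, not the actual make_datasets script: writes one study per
# line to <split>.findings.tok and <split>.image.tok, with the image paths of a
# study joined by commas (the delimiter is an assumption; verify it against the
# scripts in data/make_datasets/).
import os

def write_split(studies, out_dir, split="train"):
    """studies: iterable of (report_text, [image_path, ...]) pairs."""
    os.makedirs(out_dir, exist_ok=True)
    findings = open(os.path.join(out_dir, f"{split}.findings.tok"), "w")
    images = open(os.path.join(out_dir, f"{split}.image.tok"), "w")
    with findings, images:
        for report, image_paths in studies:
            findings.write(" ".join(report.split()) + "\n")  # flatten newlines/extra spaces
            images.write(",".join(image_paths) + "\n")       # line i matches report i
```

Line `i` of the two files must describe the same study; with `dataset.image.multi_image=3`, presumably at most three paths per line are used.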
Best,