rohitk-cognizant opened this issue 5 months ago
Hi @rohitk-cognizant,
To train wmt22-cometkiwi-da
you just have to run:
comet-train --cfg configs/models/{your_model_config}.yaml
Your configs should be something like this:
unified_metric:
  class_path: comet.models.UnifiedMetric
  init_args:
    nr_frozen_epochs: 0.3
    keep_embeddings_frozen: True
    optimizer: AdamW
    encoder_learning_rate: 1.0e-06
    learning_rate: 1.5e-05
    layerwise_decay: 0.95
    encoder_model: XLM-RoBERTa
    pretrained_model: microsoft/infoxlm-large
    sent_layer: mix
    layer_transformation: sparsemax
    word_layer: 24
    loss: mse
    dropout: 0.1
    batch_size: 16
    train_data:
      - TRAIN_DATA.csv
    validation_data:
      - VALIDATION_DATA.csv
    hidden_sizes:
      - 3072
      - 1024
    activations: Tanh
    input_segments:
      - mt
      - src
    word_level_training: False

trainer: ../trainer.yaml
early_stopping: ../early_stopping.yaml
model_checkpoint: ../model_checkpoint.yaml
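As a side note on the TRAIN_DATA.csv / VALIDATION_DATA.csv files referenced above: they are plain CSVs, and a minimal sketch of how one could be written is shown below. The column names (src, mt, score) and the example rows are an assumption inferred from the reference-free setup (input_segments: mt, src and the mse regression loss), not something confirmed in this thread, so double-check them against the repo's data documentation.

import pandas as pd

# Hypothetical rows -- the column names ("src", "mt", "score") are assumed
# from the reference-free (mt + src) setup above, not confirmed in this thread.
train_rows = [
    {"src": "Der Hund läuft im Park.", "mt": "The dog runs in the park.", "score": 0.92},
    {"src": "Das Wetter ist heute schön.", "mt": "Weather today nice is.", "score": 0.40},
]

pd.DataFrame(train_rows).to_csv("TRAIN_DATA.csv", index=False)
# VALIDATION_DATA.csv follows the same layout.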
Hi @ricardorei,
Thanks for the update. Can I use the same training parameters mentioned in the master branch trainer.yaml file?
Hmm, maybe you should change them a bit. For example, to train on a single GPU (which is usually faster) and with precision 16, use this:
accelerator: gpu
devices: 1
# strategy: ddp  # uncomment this line for distributed training
precision: 16
You might also want to consider reducing accumulate_grad_batches from 8 to 2:
accumulate_grad_batches: 2
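For context, those keys are standard PyTorch Lightning Trainer arguments, so a rough Python equivalent of that trainer configuration would be the sketch below. It is purely illustrative, since comet-train builds the Trainer from trainer.yaml for you.

from pytorch_lightning import Trainer

# Rough Python equivalent of the trainer.yaml settings above.
# Requires a CUDA-capable GPU; add strategy="ddp" only for multi-GPU runs.
trainer = Trainer(
    accelerator="gpu",
    devices=1,
    precision=16,
    accumulate_grad_batches=2,
)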
What format should the data be in?
Hi team,
Can you share the training data and training scripts used for wmt22-cometkiwi-da? We want to use them as a reference for training with our own sample data.