Closed by richardzhangy26 6 months ago
Thanks for your interest in our work! Our model first trains the "base" model without the coattention mechanism, then loads this pretrained base model and continues training with coattention. Hope this is helpful for you.
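A rough sketch of that two-stage procedure, reusing the flags from the command in this thread (the directory names `dir_base` and `misca` are placeholders; exact flag names should be checked against the repository's README):

```shell
# Stage 1 (assumed): train the base model, i.e. the same command but
# WITHOUT --intent_slot_attn_type coattention and without --base_model,
# saving checkpoints into dir_base.
python main.py --token_level word-level \
    --model_type roberta \
    --model_dir dir_base \
    --task mixatis \
    --data_dir data \
    --attention_mode label \
    --do_train --do_eval \
    --num_train_epochs 100 \
    --intent_loss_coef 0.5 \
    --learning_rate 1e-5 \
    --num_intent_detection \
    --use_crf

# Stage 2: load the pretrained base model from dir_base and
# train with the coattention mechanism enabled.
python main.py --token_level word-level \
    --model_type roberta \
    --model_dir misca \
    --task mixatis \
    --data_dir data \
    --attention_mode label \
    --do_train --do_eval \
    --num_train_epochs 100 \
    --intent_loss_coef 0.5 \
    --learning_rate 1e-5 \
    --num_intent_detection \
    --use_crf \
    --base_model dir_base \
    --intent_slot_attn_type coattention
```

Running only the second command with an untrained (or missing) `dir_base` would skip the pretraining stage and could explain a low F1 score.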
When I train the MISCA model, I use the following command:
![image](https://github.com/VinAIResearch/MISCA/assets/77674465/f43b1d89-df4f-4d68-bfd4-3639feabbd00)
```shell
python main.py --token_level word-level \
    --model_type roberta \
    --model_dir misca \
    --task mixatis \
    --data_dir data \
    --attention_mode label \
    --do_train \
    --do_eval \
    --num_train_epochs 100 \
    --intent_loss_coef 0.5 \
    --learning_rate 1e-5 \
    --num_intent_detection \
    --use_crf \
    --base_model dir_base \
    --intent_slot_attn_type coattention
```
Finally, I got a lower F1 score than expected. What's wrong with it?