Maybe the number of epochs is too small to train the model?
Run configuration: `--token_level word-level --model_type lstm --model_dir dir_base --task all --data_dir data --attention_mode label --do_train --do_eval --num_intent_detection --use_crf`

**I used the LSTM encoder and the results were great, as shown below:

But I got low scores with the BERT model (training the base model first, then training MISCA). Why?**
Hi, thanks for your interest! We have updated the default hyper-parameter settings in the README. When training with the BERT model, the learning rate needs to be scaled down to around 1e-5, so we have updated this setting accordingly. Thanks,
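For reference, a minimal sketch of how the lower learning rate might be passed to the base-model training command; the `--learning_rate` flag name and the `--task all` value are assumptions based on the advice and commands above, so please check the arguments actually defined in `main.py`:

```
# Hypothetical sketch: first-stage BERT training with a reduced learning rate.
# --learning_rate is assumed to be the relevant argument; verify against main.py.
python main.py --token_level word-level --model_type bert --model_dir dir_base \
    --task all --data_dir data --attention_mode label \
    --do_train --do_eval --num_intent_detection --use_crf \
    --learning_rate 1e-5
```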
I first trained the base model with the BERT backbone using the command `python main.py --token_level word-level --model_type bert --model_dir dir_base --task my dataset --data_dir data --attention_mode label --do_train --do_eval --num_intent_detection --use_crf`, and then loaded the dir_base model and trained MISCA with `python main.py --token_level word-level --model_type bert --model_dir misca --task my dataset --data_dir data --attention_mode label --do_train --do_eval --num_intent_detection --use_crf --base_model dir_base --intent_slot_attn_type coattention`; however, the results are still low.