namisan / mt-dnn

Multi-Task Deep Neural Networks for Natural Language Understanding
MIT License

How to use knowledge distillation on a custom dataset? #217

Closed IoSylar closed 3 years ago

IoSylar commented 3 years ago

Hello, thanks for sharing your work. I'm using MT-DNN on a custom dataset. Following the tutorials, I completed single-task learning and multi-task learning (MTL), but I cannot get knowledge distillation (KD) to work. I don't know whether the data needs extra preprocessing or whether other steps are required first. I simply ran:

!python train.py --task_def tutorials/tutorial_task_def.yml --data_dir tutorials/bert_base_uncased/ --init_checkpoint="mt_dnn_models/mt_dnn_kd_large_cased.pt" --train_datasets MyTask --test_datasets MytTask --epochs=1 --batch_size=1

It then failed with this error:

RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)

I'm using Google Colab and have tried different batch sizes.
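One way to narrow down an error like this, offered as a sketch and not a confirmed diagnosis: cuBLAS errors on the GPU are sometimes a downstream symptom of an out-of-range index, for example token ids produced with one vocabulary (here the data directory is `bert_base_uncased`) being fed to a checkpoint with a smaller vocabulary (here a cased model). The helper below is hypothetical and not part of MT-DNN. It assumes you have already loaded the token ids of your preprocessed examples as lists of integers and that you know the embedding size of the checkpoint.

```python
from typing import Iterable, List, Sequence, Tuple


def find_out_of_range_ids(
    examples: Iterable[Sequence[int]], vocab_size: int
) -> List[Tuple[int, int, int]]:
    """Return (example_index, position, token_id) for every id outside [0, vocab_size).

    `examples` is an assumed input: one sequence of token ids per example.
    `vocab_size` is an assumed input: the number of rows in the model's
    word-embedding table.
    """
    bad = []
    for i, ids in enumerate(examples):
        for j, tok in enumerate(ids):
            if tok < 0 or tok >= vocab_size:
                bad.append((i, j, tok))
    return bad


if __name__ == "__main__":
    # Toy data only; real ids would come from the preprocessed dataset files.
    toy_examples = [[1, 2, 3], [4, 11, 9], [-1, 0]]
    print(find_out_of_range_ids(toy_examples, vocab_size=10))
```

If this reports nothing, the vocabulary is probably not the cause. Running the same command with the environment variable `CUDA_LAUNCH_BLOCKING=1`, or on CPU, usually yields a more specific error message, because CUDA kernels otherwise report failures asynchronously.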