xv44586 / toolkit4nlp

transformers implement (architecture, task example, serving and more)
Apache License 2.0
97 stars 18 forks

distilling_knowledge_bert.py #4

Closed sssdjj closed 3 years ago

sssdjj commented 3 years ago

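Hello, in distilling_knowledge_bert.py there are these two lines:

y_soften = K.softmax(y_train_logits / Temperature).numpy()
new_y_train = np.concatenate([y_train, y_soften], axis=-1)

What are they for? As far as I can see, new_y_train is never used afterwards. Would removing them affect anything?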

xv44586 commented 3 years ago

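Hello, in distilling_knowledge_bert.py there are these two lines:

y_soften = K.softmax(y_train_logits / Temperature).numpy()
new_y_train = np.concatenate([y_train, y_soften], axis=-1)

What are they for? As far as I can see, new_y_train is never used afterwards. Would removing them affect anything?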

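The intent here is for the student to learn the smoothed (temperature-softened) teacher logits, i.e. y_soften. In the code, I forgot to change the training labels back after an experiment, which is why new_y_train was unused. This has now been fixed.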
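To illustrate why the two lines matter, here is a minimal NumPy sketch of how labels built like new_y_train are typically consumed in knowledge distillation. It is not the repository's actual code: the `softmax` and `distillation_loss` helpers, the `alpha` weight, the toy arrays, and the temperature-squared scaling of the soft term (a common convention) are all assumptions made for illustration.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def distillation_loss(y_combined, student_logits, num_classes,
                      temperature=10.0, alpha=0.5):
    """Hypothetical loss: weighted sum of hard-label and soft-label cross-entropy.

    y_combined holds [one-hot labels | softened teacher probabilities]
    concatenated along the last axis, like new_y_train in the issue.
    """
    y_hard = y_combined[:, :num_classes]   # ground-truth one-hot labels
    y_soft = y_combined[:, num_classes:]   # teacher distribution at temperature T
    eps = 1e-12                            # avoids log(0)

    # Cross-entropy against the true labels at temperature 1.
    hard_ce = -(y_hard * np.log(softmax(student_logits) + eps)).sum(axis=-1)
    # Cross-entropy against the softened teacher output at temperature T.
    student_soft = softmax(student_logits / temperature)
    soft_ce = -(y_soft * np.log(student_soft + eps)).sum(axis=-1)

    # Assumed weighting; the T**2 factor keeps the soft term's gradient scale comparable.
    combined = alpha * hard_ce + (1.0 - alpha) * temperature ** 2 * soft_ce
    return float(combined.mean())


# Toy data (made up): two samples, two classes.
Temperature = 10.0
y_train = np.array([[1.0, 0.0], [0.0, 1.0]])
y_train_logits = np.array([[4.0, 1.0], [0.5, 3.0]])   # stand-in teacher logits

y_soften = softmax(y_train_logits / Temperature)
new_y_train = np.concatenate([y_train, y_soften], axis=-1)

student_logits = np.array([[2.0, 0.5], [0.2, 1.5]])   # stand-in student output
loss = distillation_loss(new_y_train, student_logits, num_classes=2,
                         temperature=Temperature)
```

If the student is trained on y_train alone, the soft term never enters the loss and training falls back to ordinary supervised learning, so the concatenated array only has an effect when it is the one passed to the training call.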