IntelLabs / nlp-architect

A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks
https://intellabs.github.io/nlp-architect
Apache License 2.0

question: [Q8Bert experiment Setting] #219

Open daumkh402 opened 3 years ago

daumkh402 commented 3 years ago

Hello, I read the Q8BERT paper and have tried to reproduce the experiment results. However, on some GLUE tasks (e.g. CoLA, MRPC), the differences between the FP32 results and the quantized ones are much larger than the differences reported in the paper. I tried sweeping the initial learning rate, but the results were still far from those reported.

[image: table comparing reproduced FP32 and quantized GLUE results]

So, I want to ask whether the Q8BERT experiments were run with the default parameters set inside the nlp-architect code, as shown below.

[image: default hyperparameters in the nlp-architect code]

If not, could you share the experiment settings?
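
For reference, this is how I understand the quantization-aware training described in the paper: symmetric linear 8-bit quantize-dequantize on the weights with a straight-through estimator in the backward pass. Below is a minimal PyTorch sketch of my understanding; the names (FakeQuant, QuantLinear) are my own and this is not the nlp-architect implementation (the paper also quantizes activations using EMA-tracked scales, which I omit here).

```python
import torch
import torch.nn.functional as F


class FakeQuant(torch.autograd.Function):
    """Symmetric 8-bit quantize-dequantize with a straight-through estimator."""

    @staticmethod
    def forward(ctx, x, scale):
        # Quantize to signed 8-bit range, then dequantize so downstream ops stay in fp32.
        q = torch.clamp(torch.round(x * scale), -127, 127)
        return q / scale

    @staticmethod
    def backward(ctx, grad_output):
        # STE: pass the gradient through the rounding op unchanged.
        return grad_output, None


class QuantLinear(torch.nn.Linear):
    """Linear layer whose weights are fake-quantized during training (illustrative only)."""

    def forward(self, x):
        # Per-tensor scale derived from the current weight range.
        scale = 127.0 / self.weight.detach().abs().max().clamp(min=1e-8)
        w_q = FakeQuant.apply(self.weight, scale)
        return F.linear(x, w_q, self.bias)
```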

ofirzaf commented 3 years ago

Hi,

What version of nlp_architect and transformers did you use to run the experiments?

Please note that both MRPC and CoLA are known to produce unstable results.

The experiments in the paper were done using a very early version of HF/transformers; here are the official HF results that were relevant at the time the paper was written: https://huggingface.co/transformers/v1.0.0/examples.html#glue-results-on-dev-set
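
For these unstable tasks, a common practice is to repeat the fine-tuning with several random seeds and report the median (or best) dev score rather than a single run. A minimal sketch of such a sweep is below; `run_finetune` is a placeholder for whatever entry point you use (e.g. your GLUE fine-tuning call) and should return the dev-set metric.

```python
import random
import statistics

import numpy as np
import torch


def set_seed(seed: int) -> None:
    """Seed every RNG that affects fine-tuning."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)


def sweep_seeds(run_finetune, seeds=(42, 43, 44, 45, 46)):
    """Fine-tune once per seed and summarise the dev scores."""
    scores = []
    for seed in seeds:
        set_seed(seed)
        scores.append(run_finetune(seed))
    return statistics.median(scores), max(scores)
```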

daumkh402 commented 3 years ago

Hi, the nlp_architect version is 0.5.5 and the transformers version is 2.4.1.

Thank you.