daumkh402 opened this issue 3 years ago
Hi, what versions of nlp_architect and transformers did you use to run the experiments?
Please note that both the MRPC and CoLA tasks are known to produce unstable results.
The experiments in the paper were done with a very early version of HF/transformers; here are the official HF results that were current at the time the paper was written: https://huggingface.co/transformers/v1.0.0/examples.html#glue-results-on-dev-set
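To make the instability point concrete, here is a minimal sketch (not from this thread; all scores are invented placeholders) of how one might check whether an FP32-vs-quantized gap on MRPC or CoLA actually exceeds the seed-to-seed noise on these small tasks:

```python
# Minimal sketch: compare an FP32 baseline against quantization-aware training
# across several random seeds before concluding that the gap is real.
# All scores below are made-up placeholders, not results from the paper or thread.
from statistics import mean, stdev

# Hypothetical dev-set scores from five fine-tuning runs with different seeds.
fp32_scores = [0.86, 0.84, 0.88, 0.83, 0.87]   # e.g. MRPC accuracy, FP32 baseline
int8_scores = [0.84, 0.81, 0.85, 0.80, 0.83]   # same task, quantization-aware runs

gap = mean(fp32_scores) - mean(int8_scores)
seed_noise = max(stdev(fp32_scores), stdev(int8_scores))

print(f"FP32 : {mean(fp32_scores):.3f} +/- {stdev(fp32_scores):.3f}")
print(f"INT8 : {mean(int8_scores):.3f} +/- {stdev(int8_scores):.3f}")
print(f"gap  : {gap:.3f} (seed-to-seed noise ~ {seed_noise:.3f})")

# On small, high-variance tasks such as MRPC and CoLA, a gap that does not
# clearly exceed the seed noise may just be run-to-run instability.
```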
Hi, the version of nlp_architect is 0.5.5 and the version of transformers is 2.4.1.
Thank you.
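For reference, a quick way to confirm the installed versions from Python (assuming the PyPI package names are nlp-architect and transformers):

```python
# Print the installed versions of both packages; the distribution names below
# are assumed to be the PyPI names "nlp-architect" and "transformers".
import pkg_resources

for pkg in ("nlp-architect", "transformers"):
    print(pkg, pkg_resources.get_distribution(pkg).version)
```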
Hello, I read the Q8BERT paper and have tried to reproduce the experimental results. However, on some GLUE tasks (e.g., CoLA, MRPC), the differences between the FP32 results and the quantized ones are much larger than those reported in the paper. I tried sweeping the initial learning rate, but the results were still far from the reported ones.
So, I want to ask whether the Q8BERT experiments were run with the default parameters set inside the nlp-architect code, as below.
If not, could you tell me the experiment settings?
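The snippet of default parameters referred to above is not reproduced in this thread. As a stand-in, here is a hedged sketch of the kind of setting being asked about: a small learning-rate sweep over quantization-aware BERT fine-tuning on CoLA and MRPC. The parameter names and values are illustrative assumptions, not the confirmed nlp-architect defaults or the Q8BERT paper's settings.

```python
# Illustrative only: these hyperparameters are assumptions for a BERT-base GLUE
# fine-tuning sweep, NOT the confirmed defaults from nlp-architect or the Q8BERT paper.
from itertools import product

base_config = {
    "model_type": "quant_bert",       # hypothetical name for the quantization-aware model
    "max_seq_length": 128,
    "per_gpu_train_batch_size": 32,
    "num_train_epochs": 3,
    "warmup_steps": 0,
    "weight_decay": 0.0,
    "seed": 42,
}

tasks = ["CoLA", "MRPC"]
learning_rates = [2e-5, 3e-5, 5e-5]   # the kind of sweep mentioned in the question

for task, lr in product(tasks, learning_rates):
    run_config = {**base_config, "task_name": task, "learning_rate": lr}
    # Replace this print with the actual nlp-architect training entry point.
    print(run_config)
```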