raymin0223 / fast_robust_early_exit

Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long)

adaptive threshold estimation #16

Open cool-xiang opened 1 week ago

cool-xiang commented 1 week ago

Thank you very much for your wonderful project! I have a question about the adaptive threshold estimation. In my testing, the estimated threshold never changes: on the BIGPATENT dataset it stays constant at 0.9 for the first 100 samples instead of adapting as described in the paper. Is there a problem with how I am testing? The relevant code is:

    self.decoder.bmm_model.fit(X, Y)
    self.decoder.bmm_threshold = self.decoder.bmm_model.predict_proba(0.3, 0.9)

In my test, the variable self.decoder.bmm_threshold remains constant at 0.9. Thank you very much!

raymin0223 commented 1 week ago

Hi @cool-xiang,

Sorry for the late reply. That's somewhat odd. Could you share the script file? I assume you passed --use_adapt_threshold=True.

Also, could you check the intermediate outputs inside the predict_proba method, to see whether the value does change along the way but eventually ends up at 0.9?
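
One way to run that check without editing the library code is to wrap the method and record every value it returns. The sketch below is only an illustration under stated assumptions: `trace_calls` is a hypothetical helper, and `DummyModel` is a made-up stand-in whose `predict_proba(low, high)` merely mimics the call shape quoted above. It is not the repository's mixture model, and its clipping behaviour is invented for the demo.

```python
def trace_calls(obj, method_name, log):
    """Replace obj.<method_name> with a wrapper that logs each call.

    Every call appends (args, kwargs, result) to `log`.
    Returns the original bound method so it can be restored later.
    """
    original = getattr(obj, method_name)

    def wrapper(*args, **kwargs):
        result = original(*args, **kwargs)
        log.append((args, kwargs, result))
        return result

    # Setting the attribute on the instance shadows the class method.
    setattr(obj, method_name, wrapper)
    return original


class DummyModel:
    """Hypothetical stand-in; NOT the repository's model."""

    def __init__(self):
        self.calls = 0

    def predict_proba(self, low, high):
        self.calls += 1
        # Invented behaviour: the estimate grows, then is clipped at `high`.
        return min(high, low + 0.4 * self.calls)


model = DummyModel()
history = []
trace_calls(model, "predict_proba", history)

for _ in range(3):
    model.predict_proba(0.3, 0.9)

# Collect the returned thresholds and check whether they ever differ.
thresholds = [result for _, _, result in history]
print(thresholds)
print("constant" if len(set(thresholds)) == 1 else "changing")
```

With the real object, the same helper could presumably be applied as `trace_calls(self.decoder.bmm_model, "predict_proba", history)` before generation starts; inspecting `history` afterwards shows whether the threshold moved at any point or was 0.9 from the first call.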