HuYunhai-Alex opened this issue 1 month ago (status: Open)
Can you elaborate on what system you are running this on? As far as I can see, the matchness is quite high, so this problem shouldn't occur.
Yes, the acceptance rate is normal, so decoding should be accelerated. Can you rule out a problem with the sss mode by trying essg instead? In addition, you can update the environment and re-run the search for the skipped layers.
Using the skiplayer provided by the project to run CodeLlama2-13B and LLaMA2-13B-Chat, the speculative decoding time in evaluate_sum and evaluate_code is significantly longer than the base model's decoding time. Could you please explain why this might be the case?
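
For anyone trying to reproduce the slowdown outside the evaluation scripts, a minimal timing sketch might look like the one below. It is not taken from the project: `time_per_token`, `speedup`, and the two callables passed in are hypothetical names, and each callable is assumed to run one generation and return the number of newly generated tokens. On a GPU, the callable would also need to synchronize the device (for example with `torch.cuda.synchronize()`) before returning, otherwise the wall-clock time can be misleading.

```python
import time
from statistics import median


def time_per_token(generate, n_runs=3):
    """Return the median wall-clock seconds per generated token.

    `generate` is a hypothetical zero-argument callable that runs one
    generation and returns the number of newly generated tokens.
    """
    samples = []
    for _ in range(n_runs):
        start = time.perf_counter()
        n_tokens = generate()
        elapsed = time.perf_counter() - start
        # Guard against a run that produced no tokens.
        samples.append(elapsed / max(n_tokens, 1))
    return median(samples)


def speedup(base_generate, spec_generate, n_runs=3):
    """Ratio of base time per token to speculative time per token.

    A value above 1.0 means the speculative path is faster; a value
    below 1.0 corresponds to the slowdown described in this issue.
    """
    base = time_per_token(base_generate, n_runs)
    spec = time_per_token(spec_generate, n_runs)
    return base / spec
```

Running both paths on the same prompts with the same number of new tokens, and reporting the resulting ratio together with the hardware and library versions, would make it easier to tell whether the slowdown comes from the decoding mode or from the environment.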