2024WY closed this 1 month ago
I noticed that your top-3 accuracy on the training set is only about 0.8, which is relatively low. What is your training accuracy on the English dataset? If it is close to the accuracy on the Chinese dataset, it could be that the structure or size of the draft model is not suitable. If the English accuracy is significantly higher than the Chinese accuracy, it is possible that your base model is not sufficiently trained on Chinese, and its features cannot effectively capture the semantic information of Chinese.
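To make the English vs. Chinese comparison concrete, here is a minimal sketch of how top-k accuracy could be computed on each subset separately. The function name `top_k_accuracy` and the toy scores are hypothetical and not taken from the training code; it assumes you can dump the draft model's per-position scores over the vocabulary together with the ground-truth token ids.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   targets: Sequence[int],
                   k: int = 3) -> float:
    """Fraction of positions whose target id is among the k highest scores.

    scores:  one row per position, one score per vocabulary entry (hypothetical dump).
    targets: ground-truth token id for each position.
    """
    if len(scores) != len(targets):
        raise ValueError("scores and targets must have the same length")
    if not targets:
        raise ValueError("need at least one position")

    hits = 0
    for row, target in zip(scores, targets):
        # Indices of the k largest scores in this row.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += target in top_k
    return hits / len(targets)


# Toy data only: a 4-token vocabulary, two positions per language.
english_scores = [[0.1, 0.5, 0.3, 0.1], [0.9, 0.05, 0.03, 0.02]]
english_targets = [1, 0]
chinese_scores = [[0.1, 0.5, 0.3, 0.1], [0.9, 0.05, 0.03, 0.02]]
chinese_targets = [3, 0]

print("English top-3:", top_k_accuracy(english_scores, english_targets, k=3))
print("Chinese top-3:", top_k_accuracy(chinese_scores, chinese_targets, k=3))
```

If the two numbers come out close, that points toward the draft model's structure or size; a clear gap points toward the base model's Chinese features.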
I modified the preprocess_function, and the data I used is the model's SFT data. The pictures show the training and test results during training:
But when I use the Spec-Bench project to test on a Chinese dataset, the mean accepted tokens is only 2.5994782086414654, which is not good enough.
Is there anything else I should pay attention to? Can you give some advice? Thanks!
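For anyone reproducing the mean accepted tokens figure, here is a minimal sketch of the metric. It assumes the value is the average number of tokens accepted per decoding step and that per-step accept lengths have been logged. The helper name `mean_accepted_tokens` and the sample lists are hypothetical, not Spec-Bench's own code or output.

```python
from typing import Sequence


def mean_accepted_tokens(accept_lengths: Sequence[int]) -> float:
    """Average number of tokens accepted per decoding step.

    accept_lengths: one entry per verification step (assumed logging format).
    """
    if not accept_lengths:
        raise ValueError("need at least one decoding step")
    return sum(accept_lengths) / len(accept_lengths)


# Toy per-step accept lengths, split by language for comparison.
chinese_steps = [3, 2, 4, 1]
english_steps = [4, 4, 3, 5]

print("Chinese:", mean_accepted_tokens(chinese_steps))
print("English:", mean_accepted_tokens(english_steps))
```

Running the same measurement on an English set gives a baseline to judge the Chinese number against.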