yaojin17 / Unlearning_LLM

[ACL 2024] Code and data for "Machine Unlearning of Pre-trained Large Language Models"
MIT License

Question about the performance on downstream tasks. #6

Closed: tbozhong closed this issue 3 months ago

tbozhong commented 3 months ago

Hi there 👋

I've noticed that there is currently only a general dataset for evaluating overall downstream performance. Are there any scripts that evaluate performance on individual downstream tasks?

I appreciate your assistance and look forward to your response!

yaojin17 commented 3 months ago

Unfortunately, we are unable to release the scripts for evaluating specific downstream tasks. For your evaluation needs, please consider using open-source LLM evaluation frameworks or refer to scripts provided by the developers of the respective downstream datasets.
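Not from this repository's code, but as a minimal sketch of what a per-task evaluation script might compute on top of any harness's predictions: given (task, prediction, reference) triples, report exact-match accuracy per task. The `records` format and task names here are illustrative assumptions, not the paper's setup.

```python
from collections import defaultdict

def per_task_accuracy(records):
    """records: iterable of (task_name, prediction, reference) triples.
    Returns a dict mapping each task name to its exact-match accuracy.
    Task names below are placeholders, not the paper's actual eval suite."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for task, pred, ref in records:
        total[task] += 1
        correct[task] += int(pred == ref)
    return {task: correct[task] / total[task] for task in total}

records = [
    ("mmlu", "A", "A"),
    ("mmlu", "B", "C"),
    ("arc", "D", "D"),
]
print(per_task_accuracy(records))  # -> {'mmlu': 0.5, 'arc': 1.0}
```

Frameworks such as lm-evaluation-harness already produce per-task scores like this, so wiring the unlearned checkpoint into one of them is likely the quickest route.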

tbozhong commented 3 months ago

Thanks for your response!

Could you please provide more detailed information regarding the composition of the general dataset? For instance, I would like to know which data ranges, such as indices 1 to 1000, correspond to specific subsets like MMLU.

yaojin17 commented 3 months ago

Apologies for any confusion. The general set is sampled from the retain set of the LLM's pre-training dataset. It is used to evaluate the unlearned model's perplexity on the retain set. It is not associated with any specific downstream task, such as MMLU.
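For concreteness, perplexity over the retain/general set is just the exponential of the mean negative log-likelihood of its tokens. A self-contained sketch (the `token_logprobs` input, per-token log-probabilities from the model, is an assumption about how you extract scores, not this repo's API):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood) over tokens.
    token_logprobs: per-token natural-log probabilities from the model."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Sanity check: a uniform model over a 4-symbol vocabulary assigns
# log(1/4) to every token, so its perplexity is exactly 4.
uniform = [math.log(0.25)] * 10
print(round(perplexity(uniform), 6))  # -> 4.0
```

A well-unlearned model should keep this number low on the retain set while it rises on the forget set.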

tbozhong commented 3 months ago

Thanks for your timely response. I have no more questions.