princeton-nlp / LESS

[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
MIT License

Some questions about calculating the score #14

Closed smashfan closed 2 months ago

smashfan commented 3 months ago
N_SUBTASKS = {"mmlu": 57, "bbh": 27, "tydiqa": 9} 
influence_score = influence_score.reshape(
            influence_score.shape[0], N_SUBTASKS[target_task_name], -1).mean(-1).max(-1)[0]

What does N_SUBTASKS mean, and why is the score reshaped this way? Can I change it to "influence_score = influence_score.mean(-1)[0]"?

xiamengzhou commented 3 months ago

In our experiments, all the tasks we utilized possessed an inherent substructure. If your validation data lacks this substructure, you can simply set N_SUBTASKS[your_task] = 1, and it should work with the code!
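
To make the aggregation concrete, here is a minimal sketch of what the reshape / mean / max chain appears to compute, written with plain Python lists so it runs without PyTorch. The function name `aggregate_influence` and the toy scores are hypothetical, not part of the LESS codebase. It assumes each row holds one training example's influence on every validation example, and that validation examples are ordered subtask by subtask with the same count per subtask, which is what the reshape implies.

```python
def aggregate_influence(scores, n_subtasks):
    """Mirror reshape(N, n_subtasks, -1).mean(-1).max(-1)[0] on nested lists.

    scores: one row per training example; each row has one influence score
    per validation example, grouped contiguously by subtask (assumption).
    """
    aggregated = []
    for row in scores:
        if len(row) % n_subtasks != 0:
            raise ValueError("validation examples must split evenly across subtasks")
        per_subtask = len(row) // n_subtasks
        # reshape(..., n_subtasks, -1).mean(-1): average within each subtask
        subtask_means = [
            sum(row[i * per_subtask:(i + 1) * per_subtask]) / per_subtask
            for i in range(n_subtasks)
        ]
        # .max(-1)[0]: keep the largest subtask mean for this training example
        aggregated.append(max(subtask_means))
    return aggregated


# Toy data: 2 training examples, 4 validation examples (2 subtasks of 2 each).
scores = [
    [1.0, 3.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
]
with_subtasks = aggregate_influence(scores, n_subtasks=2)  # [2.0, 1.0]
flat_mean = aggregate_influence(scores, n_subtasks=1)      # [1.0, 1.0]
```

With `n_subtasks=1` the result is just the mean over all validation examples, one value per training example. In PyTorch that corresponds to `influence_score.mean(-1)` without the trailing `[0]`: the `[0]` after `.max(-1)` selects the values from the `(values, indices)` pair, whereas `[0]` after `.mean(-1)` would keep only the first training example's score.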