princeton-nlp / LESS

[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
MIT License
378 stars, 37 forks

A simple question about the usage of argument `max_samples` in `less/data_selection/get_info` #34

Closed: cafeii closed this issue 1 month ago

cafeii commented 1 month ago

I'm quite confused about the argument `max_samples`. According to my understanding of the paper, step 2 should compute the Adam LoRA gradients for all candidates, but the value is set to 200 by default in `less/scripts/get_info/grad/get_train_lora_grads.sh`. What is the purpose of setting it to 200? Is that the default setting for reproduction? I think the default should be `None`. Please tell me if I got anything wrong.

xiamengzhou commented 1 month ago

Yes, you are correct! It should be set to `None`. I updated the script.
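For readers following along, here is a minimal sketch of why the default matters. It is not the repository's actual code: `limit_samples` and `pool` are hypothetical names, and the sketch only assumes that `max_samples` truncates the candidate pool before gradients are computed, with `None` meaning "use every candidate".

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def limit_samples(candidates: Sequence[T], max_samples: Optional[int] = None) -> List[T]:
    """Hypothetical helper: keep the first `max_samples` candidates, or all if None."""
    if max_samples is None:
        # No cap: every candidate would get a gradient computed for it.
        return list(candidates)
    # With a cap, only the first `max_samples` candidates are kept.
    return list(candidates[:max_samples])


# Stand-in candidate pool of 1000 training examples (illustrative data only).
pool = [f"example_{i}" for i in range(1000)]

capped = limit_samples(pool, max_samples=200)
full = limit_samples(pool)

print(len(capped))  # 200
print(len(full))    # 1000
```

Under that assumption, a cap of 200 would be useful for a quick smoke test, but selecting data from the full pool needs gradients for every candidate, which is what `None` gives.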