princeton-nlp / LESS

[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
MIT License

Using Multiple GPUs #7

Closed SD325 closed 2 months ago

SD325 commented 5 months ago

How can we utilize multiple GPUs for the gradient feature collection step? The current implementation only works with a single GPU.

xiamengzhou commented 5 months ago

Hi, multi-GPU gradient computation is currently not supported, mostly because I have not found a way to use vmap to compute per-instance gradients within a batch. If you'd like to use multiple GPUs, you can split your data into multiple files and launch a separate job for each subset of the data.
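The split-and-launch workaround could be sketched roughly as follows. This is a minimal sketch, not part of the LESS repository: it assumes the training data is a JSONL file, and `split_jsonl`, `build_jobs`, and the script name `collect_grads.sh` are hypothetical names introduced only for illustration. Substitute whatever command you already use for gradient collection.

```python
from pathlib import Path


def split_jsonl(src: Path, out_dir: Path, num_shards: int) -> list:
    """Split a JSONL file into contiguous shards of near-equal size.

    Contiguous shards preserve the original order, so the per-shard
    outputs can be concatenated back in shard order afterwards.
    """
    lines = [ln for ln in src.read_text(encoding="utf-8").splitlines() if ln.strip()]
    out_dir.mkdir(parents=True, exist_ok=True)
    base, extra = divmod(len(lines), num_shards)
    shards, start = [], 0
    for i in range(num_shards):
        # The first `extra` shards each take one additional line.
        end = start + base + (1 if i < extra else 0)
        path = out_dir / f"{src.stem}.shard{i}.jsonl"
        path.write_text("".join(ln + "\n" for ln in lines[start:end]), encoding="utf-8")
        shards.append(path)
        start = end
    return shards


def build_jobs(shards, script="collect_grads.sh"):
    """Pair each shard with one GPU via CUDA_VISIBLE_DEVICES.

    `script` is a hypothetical placeholder, not a file shipped with LESS.
    Returns (env_overrides, argv) tuples; nothing is launched here.
    """
    return [
        ({"CUDA_VISIBLE_DEVICES": str(gpu)}, ["bash", script, str(shard)])
        for gpu, shard in enumerate(shards)
    ]
```

Each `(env, argv)` pair can then be started as its own process, for example with `subprocess.Popen(argv, env={**os.environ, **env})`, so that every job sees exactly one GPU.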

simplelifetime commented 3 months ago

Hi, if I launch separate jobs, will the projection matrix remain the same as long as I set the same value for model_id?

xiamengzhou commented 3 months ago

Yes, it will!
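One way to see why separate jobs can share a projection matrix: if the matrix is generated from a fixed seed, every process that uses the same seed rebuilds the same matrix. The sketch below illustrates that general idea in plain Python with a Rademacher (±1) matrix. It is an assumption about the mechanism, not a description of the LESS code, and `rademacher_matrix`, `project`, and the `seed` parameter are hypothetical names.

```python
import random


def rademacher_matrix(grad_dim, proj_dim, seed):
    """Build a grad_dim x proj_dim matrix of +/-1 entries from a fixed seed."""
    rng = random.Random(seed)  # private generator, unaffected by global state
    return [[rng.choice((-1.0, 1.0)) for _ in range(proj_dim)] for _ in range(grad_dim)]


def project(grad, matrix):
    """Project a gradient vector of length grad_dim down to proj_dim."""
    proj_dim = len(matrix[0])
    return [sum(g * row[j] for g, row in zip(grad, matrix)) for j in range(proj_dim)]


# Two "jobs" that never communicate but use the same seed
# (hypothetically derived from the same model_id).
job_a = rademacher_matrix(grad_dim=8, proj_dim=4, seed=0)
job_b = rademacher_matrix(grad_dim=8, proj_dim=4, seed=0)

grad = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 3.0, 1.0]
same_features = project(grad, job_a) == project(grad, job_b)
```

Because both jobs produce identical matrices, gradient features computed on different data shards land in the same projected space and can be concatenated directly.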