Closed: djstrong closed this issue 4 months ago
Since merging your change, I'm still getting this issue. I tried a few other things and haven't been able to figure out how to get it to release memory after a run. I assume this must be related to a transformers or PyTorch update, because it wasn't an issue before.
Reopening for now.
I pushed a change that I think should fix this. The memory-releasing part must have gotten mixed up in the refactor, so it was only deleting the model once per benchmark run, not per iteration (while still reloading the model every iteration).
Should be fixed now; at least it is in my testing. Thanks for the report & contribution!
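For reference, the per-iteration cleanup described above can be sketched roughly like this (a minimal sketch; `load_model`, `run_benchmark`, `model_name`, and `num_iterations` are hypothetical stand-ins for the harness's own code, not its actual API):

```python
import gc
import torch

for iteration in range(num_iterations):
    model = load_model(model_name)  # hypothetical loader (reloads every iteration)
    run_benchmark(model)            # hypothetical benchmark step

    # Release per iteration, not only once per benchmark run:
    del model                 # drop the Python reference to the model
    gc.collect()              # collect any reference cycles still holding tensors
    torch.cuda.empty_cache()  # return cached blocks to the driver for the next load
```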
I was running with one iteration only.
You're right, multiple runs and multiple iterations were not releasing memory. Both should be fixed now since it's releasing memory after every iteration.
With this config:
I got a warning for the second model:
WARNING:root:Some parameters are on the meta device device because they were offloaded to the cpu.
and evaluation is very slow. I am running it on a GPU with 40 GB.
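(That warning typically means `device_map="auto"` could not fit the whole model in VRAM, so accelerate offloaded some layers to the CPU, which would explain the slowdown. A minimal sketch for checking the placement, assuming the model is loaded through transformers with a device map; the model id is taken from the report above:)

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "NousResearch/Nous-Hermes-2-SOLAR-10.7B", device_map="auto"
)
# Modules mapped to "cpu" (or "disk") were offloaded and will run slowly.
print(model.hf_device_map)
```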
If only Nous-Hermes-2-SOLAR-10.7B is in the config, then everything is fine. I guess the previous model is not removed before loading the next one: I see the `del model` in `cleanup`, but it actually does nothing.
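That matches how CPython behaves: `del model` only removes one name, so the GPU memory is freed only if nothing else (a results object, a closure, a stored traceback) still references the model. A small self-contained demonstration of this, with a plain tensor standing in for the model (requires a CUDA device):

```python
import gc
import torch

# A toy stand-in for the model: one large tensor on the GPU.
model = torch.zeros(1024, 1024, device="cuda")
print(torch.cuda.memory_allocated())  # ~4 MiB allocated

keep = model  # a second reference, e.g. the model cached somewhere else
del model     # removes only the name `model`; `keep` still pins the memory
gc.collect()
print(torch.cuda.memory_allocated())  # unchanged: the tensor is still alive

del keep      # drop the last reference; the allocation is actually freed
gc.collect()
print(torch.cuda.memory_allocated())  # 0: GPU memory released

# empty_cache() additionally returns freed-but-cached blocks to the driver,
# so other processes (and nvidia-smi) see the memory as free again.
torch.cuda.empty_cache()
```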