EleutherAI / elk

Keeping language models honest by directly eliciting knowledge encoded in their activations.
MIT License

Can't run elicit and sweep; process is stuck with message: "Waiting for x GPUs with at least X GB of free memory. 0 GPUs currently available." #217

Closed: gekaklam closed this issue 1 year ago

gekaklam commented 1 year ago

Hi,

I downloaded the codebase on my laptop, which has an Nvidia GeForce RTX 3080 Mobile 16GB GPU. When I try to run elicit or sweep, the process almost always gets stuck with the following message, no matter which model I use:

Waiting for 1 GPUs with at least 15.46 GB of free memory. 0 GPUs currently available.

When I try the same thing on the Jessica cluster, the process gets stuck in the same way, but the message is:

Waiting for 8 GPUs with at least 46.37 GB of free memory. 0 GPUs currently available.

To reproduce the error, simply clone the codebase into a new env and try to run the basic commands.

Any suggestions on how to go about this?

norabelrose commented 1 year ago

This is probably because your GUI is taking more than 10% of the VRAM on your GPU. Right now there's a hardcoded default that requires at least 90% of each GPU's VRAM to be free. You can customize this with min_gpu_mem.
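To make the 90% rule concrete, here is a minimal sketch of that kind of check. It is not elk's actual code: the function name `usable_gpus`, its parameters, and the example memory figures are hypothetical. It assumes per-GPU (free, total) byte counts of the kind that `torch.cuda.mem_get_info` returns.

```python
GIB = 1024**3  # bytes in one GiB


def usable_gpus(mem_info, min_free_fraction=0.9):
    """Return indices of GPUs whose free memory meets the threshold.

    mem_info is a list of (free_bytes, total_bytes) tuples, one per GPU.
    Both the name and the 0.9 default are illustrative assumptions.
    """
    return [
        i
        for i, (free, total) in enumerate(mem_info)
        if free >= min_free_fraction * total
    ]


# Hypothetical 16 GiB laptop GPU where the desktop session already holds 2 GiB.
laptop = [(14 * GIB, 16 * GIB)]

# 14 / 16 = 0.875, which is below 0.9, so no GPU qualifies.
print(usable_gpus(laptop))

# With a lower threshold the same GPU qualifies.
print(usable_gpus(laptop, min_free_fraction=0.5))

# 90% of 16 GiB, expressed in decimal GB, matches the 15.46 GB in the message.
print(round(0.9 * 16 * GIB / 1e9, 2))
```

Under these assumptions, a GPU that is already holding a couple of GiB for the display never passes the default check, which would explain the "0 GPUs currently available" message.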

derpyplops commented 1 year ago

Maybe we could update the info message to be more informative?
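One possible shape for a more informative message, as a rough sketch only: the helper `format_wait_message` and its wording are hypothetical and not taken from the elk codebase. The idea is to report each GPU's free memory alongside the requirement and to point at min_gpu_mem.

```python
GIB = 1024**3  # bytes in one GiB


def format_wait_message(num_needed, min_free_bytes, mem_info):
    """Build a multi-line status message (hypothetical helper, not elk's API).

    mem_info is a list of (free_bytes, total_bytes) tuples, one per GPU.
    """
    lines = [
        f"Waiting for {num_needed} GPU(s) with at least "
        f"{min_free_bytes / 1e9:.2f} GB of free memory."
    ]
    for i, (free, total) in enumerate(mem_info):
        status = "ok" if free >= min_free_bytes else "too little free memory"
        lines.append(
            f"  GPU {i}: {free / 1e9:.2f} GB free of {total / 1e9:.2f} GB ({status})"
        )
    lines.append("Free up GPU memory or lower min_gpu_mem to proceed.")
    return "\n".join(lines)


# Hypothetical 16 GiB GPU with 14 GiB free, checked against a 90% requirement.
print(format_wait_message(1, 0.9 * 16 * GIB, [(14 * GIB, 16 * GIB)]))
```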