nikolas-rauscher / ARDI-Scholarly-QALD

ARDI-Project Scholarly QALD Challenge 2024
GNU General Public License v3.0

Loading and running 3 LLMs with single kernel (48 GB GPU) #36

Closed sefeoglu closed 3 months ago

sefeoglu commented 3 months ago

Hi @nikolas-rauscher, running 3 LLMs in a single kernel with a 48 GB GPU might not be possible, since Llama will have already allocated most of the GPU memory. Could you please change it to a single language model per run? I can change the model name and prompt template via config.ini, and I have also created another issue to fix some parameters. Thanks.

https://github.com/nikolas-rauscher/ARDI-Scholarly-QALD/blob/a6b6691cf3a15fd3f3f67136fd9f06d6302573b1/src/models/zero_shot_prompting_pipeline.py#L185-L197
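A minimal sketch of the config-driven setup sefeoglu describes: each run loads exactly one model, selected from `config.ini`. The section and key names here (`[model]`, `name`, `prompt_template`) are assumptions for illustration, not the repository's actual config schema.

```python
# Hypothetical: pick a single LLM per run from config.ini instead of
# loading all three models in one kernel. Only the stdlib is used here.
import configparser

EXAMPLE_CONFIG = """
[model]
name = llama-3-8b
prompt_template = zero_shot
"""

def load_run_config(text: str) -> dict:
    """Parse the model name and prompt template for this single run."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["model"]
    return {
        "model_name": section.get("name"),
        "prompt_template": section.get("prompt_template"),
    }

if __name__ == "__main__":
    cfg = load_run_config(EXAMPLE_CONFIG)
    print(cfg["model_name"])  # exactly one model is selected for this run
```

Swapping models then only requires editing `config.ini` and re-running, so GPU memory is never shared between models.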

nikolas-rauscher commented 3 months ago

Hey @sefeoglu, have you already tried it? I actually call the models one after the other, so they are completely separate. I have now made sure that the models are no longer kept in memory, so they should run one after the other.

I mean, in the end the result is the same, so we can remove it, but I just thought it would be nicer that way.
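The "one after the other" pattern described above can be sketched as follows. `DummyModel` is a hypothetical stand-in for a real LLM; with PyTorch one would additionally call `torch.cuda.empty_cache()` after `gc.collect()` to return freed blocks to the GPU. The `weakref` check verifies that each model is actually released before the next one is loaded.

```python
# Hypothetical sketch: load a model, run it, drop the reference, and force
# garbage collection before loading the next model, so only one model is
# ever resident at a time.
import gc
import weakref

class DummyModel:
    """Stand-in for an LLM; a real one would hold large GPU tensors."""
    def __init__(self, name: str):
        self.name = name

    def generate(self, prompt: str) -> str:
        return f"{self.name}: {prompt}"

def run_models_sequentially(names, prompt):
    answers = []
    for name in names:
        model = DummyModel(name)              # load the current model
        answers.append(model.generate(prompt))
        ref = weakref.ref(model)
        del model                             # drop the only strong reference
        gc.collect()                          # collect before the next load
        assert ref() is None, "model is still resident in memory"
        # With PyTorch, also: torch.cuda.empty_cache()
    return answers

if __name__ == "__main__":
    print(run_models_sequentially(["llama", "mistral", "falcon"], "Q?"))
```

This is the behavior nikolas-rauscher describes: the three models never coexist in memory, so peak usage stays at a single model.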