guangyaodou opened 7 months ago
Hey @guangyaodou - this was a deliberate choice when the code was written: Llama is pretty big, and we didn't want scripts to run into OOM issues from the main and auxiliary models sitting on the same device. That said, you're right that it's probably best not to override the config. I can handle this change in the next version, but it would also be great if you want to submit a PR for this!
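For anyone picking this up, here is a rough sketch of the behaviour discussed above: respect the device given in the config, and only fall back to a separate device when none was specified. This is not the actual code in `mimir/models.py`; the function name `resolve_aux_device` and its parameters are hypothetical, and device strings are assumed to be of the form `"cpu"` or `"cuda:<index>"`.

```python
from typing import Optional


def resolve_aux_device(configured: Optional[str], main_device: str, n_gpus: int) -> str:
    """Choose a device for the auxiliary model.

    Hypothetical helper for illustration only; not part of mimir's API.
    Assumes device strings look like "cpu" or "cuda:<index>".
    """
    # An explicit value from the config always wins; never override it.
    if configured:
        return configured

    # Nothing configured: prefer a GPU other than the main model's,
    # so the two models do not compete for memory on one device.
    for index in range(n_gpus):
        candidate = f"cuda:{index}"
        if candidate != main_device:
            return candidate

    # No alternative device available: share the main model's device.
    return main_device
```

With something along these lines, a user who sets the device in the config keeps full control, while the OOM-avoidance default still applies when the field is left empty.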
Hi,
Sorry, I forgot to submit a PR for this. I've just submitted it!
Hi,
I found a strange line here in your code - https://github.com/iamgroot42/mimir/blob/main/mimir/models.py#L183
Is this a bug, or is there a particular reason for it? It currently overrides what I specify in the config file.
Thanks.