iamgroot42 / mimir

Python package for measuring memorization in LLMs.
https://iamgroot42.github.io/mimir/
MIT License
125 stars, 23 forks

Some issue in model.py #20

Open guangyaodou opened 7 months ago

guangyaodou commented 7 months ago

Hi,

I found a strange line here in your code - https://github.com/iamgroot42/mimir/blob/main/mimir/models.py#L183

Is this a bug or is there any other particular reason? It is currently overriding what I am specifying in the config file.

Thanks.

iamgroot42 commented 7 months ago

Hey @guangyaodou - this was a deliberate choice during development: Llama is pretty big, and we didn't want scripts to run into OOM issues from having the main and auxiliary models on the same device. That said, you're right that it is probably best not to override the config. I can handle this change in the next version, but it would also be great if you want to submit a PR for this!
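
For illustration, here is a minimal sketch of one way to keep the OOM safeguard without overriding the config. It does not reproduce the actual line in `mimir/models.py`; the function name `resolve_device`, its parameters, and the device strings are all hypothetical. The idea is that an explicitly configured device always wins, and the separate-device default for Llama only applies when the config leaves the device unset.

```python
from typing import Optional


def resolve_device(
    configured: Optional[str],
    model_name: str,
    large_model_fallback: str = "cuda:1",  # hypothetical default, not from mimir
    default: str = "cuda:0",               # hypothetical default, not from mimir
) -> str:
    """Pick a device for a model without overriding an explicit config value."""
    # An explicit setting from the config file always takes precedence.
    if configured:
        return configured
    # Only when nothing is configured, place large models (e.g. Llama) on a
    # separate device to lower the risk of OOM alongside the other model.
    if "llama" in model_name.lower():
        return large_model_fallback
    return default
```

With this ordering, a config that sets the device to `cuda:0` is honored even for a Llama model, while an unset device still gets the separate-device default.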

guangyaodou commented 1 month ago

Hi,

Sorry, I forgot to submit a PR for this. I have just submitted one!