kanishkamisra / minicons

Utility for behavioral and representational analyses of Language Models
https://minicons.kanishka.website
MIT License

Update cwe.py #53

Closed · grvkamath closed this 1 month ago

grvkamath commented 6 months ago

small fixes to make the CWE class compatible with Llama 2 models!

netlify[bot] commented 6 months ago

Deploy Preview for pyminicons canceled.

Latest commit: 1f76ea427fdba84b942ee51ebfc82de20f806bf3
Latest deploy log: https://app.netlify.com/sites/pyminicons/deploys/667d96f3908d45000870ad42

kanishkamisra commented 6 months ago

Hey, thanks for noticing this and proposing a fix! I think the fix works, but I was wondering if we should instead do the gpt2 thing only if the model is GPT-2/GPT-Neo/GPT-J? Most models nowadays are more like Llama, so it might be better to condition on the presence of gpt rather than its absence. Feel free to make changes and re-submit; I'd love for this to be your first PR haha :)
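
A minimal sketch of the conditioning being suggested, assuming the GPT-2-specific step is prepending a space before word-initial tokens; the helper name and `model_name` argument are illustrative, not the actual cwe.py code:

```python
# Hypothetical sketch: treat GPT-2-style handling as the special case,
# keyed on the model name, instead of the default.
GPT_STYLE = ("gpt2", "gpt-neo", "gpt-j")

def maybe_add_leading_space(word: str, model_name: str) -> str:
    # GPT-2/Neo/J byte-level BPE needs an explicit leading space to yield
    # word-initial ("Ġ"-prefixed) tokens; Llama-style SentencePiece
    # tokenizers insert the word-boundary marker themselves.
    if any(tag in model_name.lower() for tag in GPT_STYLE):
        return " " + word
    return word
```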

grvkamath commented 2 months ago

Hey, I've added more changes that should automate this process, i.e., classify a tokenizer as more like the Llama tokenizers or more like the GPT-2/Neo/J tokenizers. Lmk if you have any extra suggestions!
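
One way such an inference could work, sketched here under assumptions (the function name and the probe word "hello" are hypothetical, not necessarily the code in this PR): tokenize a word without a leading space and check for SentencePiece's "▁" word-boundary marker, which Llama-style tokenizers add and GPT-2-style byte-level BPE tokenizers do not.

```python
from transformers import AutoTokenizer

def looks_like_llama_tokenizer(tokenizer) -> bool:
    # Llama's SentencePiece tokenizer marks word starts with "▁" even when
    # no space is present in the input; GPT-2/Neo/J byte-level BPE only
    # produces space-marked ("Ġ"-prefixed) tokens if the space is explicit.
    return tokenizer.tokenize("hello")[0].startswith("▁")

# Illustrative usage:
# looks_like_llama_tokenizer(AutoTokenizer.from_pretrained("gpt2"))
#   -> False  ("hello" tokenizes to ["hello"])
# looks_like_llama_tokenizer(AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf"))
#   -> True   ("hello" tokenizes to ["▁hello"])
```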

kanishkamisra commented 1 month ago

This is a brilliant solution! I like the idea of "inferring" the tokenizer behavior! It might even be useful when I eventually integrate Tiago Pimentel and Clara Meister's "proper way of computing word probabilities"! Thanks :D

grvkamath commented 1 month ago

Thanks so much!!! Glad you think it's a good fix. Let me know if any issues pop up with it!