Open · Hannibal046 opened 5 months ago

Hi @jxmorris12, thanks so much for this insightful work! After trying the demo, I got the following results, which don't align with the output provided. So I want to confirm: has the model been updated or the codebase refactored?

jxmorris12 commented:

Huh, yeah, this is really strange. Your outputs look marginally worse, which makes me think this might be related to #40.

I'll try to think about what could have changed. I guess anything that changes the floating-point output of a transformer (even if it's within a very small L2 distance of the original) could slightly decrease performance, which is what you're observing. Maybe we just need to pin to the library versions (huggingface & pytorch) from a year ago, when these models were trained.
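If it helps narrow this down, here is a minimal sketch of how one might measure that drift directly: save the embedding from one environment, then re-compute it in another and compare. The embedder name (`sentence-transformers/gtr-t5-base`) and the input text are placeholders for illustration; substitute whatever the demo actually uses:

```python
# Minimal sketch: quantify floating-point drift of an embedder across
# environments. Run once in the old environment to save a reference,
# then again in the new environment to compare against it.
import os

import numpy as np
from sentence_transformers import SentenceTransformer

# Assumed embedder and input; substitute the demo's actual model and text.
model = SentenceTransformer("sentence-transformers/gtr-t5-base")
emb = model.encode("example input text from the demo")

REF_PATH = "reference_embedding.npy"
if not os.path.exists(REF_PATH):
    # First run (original environment): store the reference embedding.
    np.save(REF_PATH, emb)
    print("Saved reference embedding.")
else:
    # Second run (new environment): report the L2 distance to the reference.
    ref = np.load(REF_PATH)
    print("L2 distance between environments:", np.linalg.norm(emb - ref))
```

Even a tiny nonzero distance here (say, on the order of 1e-4) would support the hypothesis that library-version changes are perturbing the embeddings the model was originally trained against.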