ApolloResearch / rib

Library for methods related to the Local Interaction Basis (LIB)
MIT License

Throw a warning if only the IGNORE field is mismatched #252

Closed. stefan-apollo closed this 10 months ago

stefan-apollo commented 10 months ago

Only throw a warning if the IGNORE field is mismatched, instead of raising an error.

Description

Allows loading transformers that were saved with a different causal masking value (IGNORE = -1e5 rather than -inf). This is useful for loading models trained before the convention change, provided they do not break with -1e5 (such as the mod add models). It should not be used for Pythia.

Prints a warning whenever this occurs.
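
For illustration, the intended behaviour can be sketched roughly as follows. This is not the code from this PR: the function name `check_masking_compat`, the plain-dict stand-in for a state dict, and the `ValueError` used for other mismatches are all assumptions made for the example. Only the IGNORE field name and the -1e5 / -inf values come from the description above.

```python
import warnings

IGNORE_SUFFIX = "IGNORE"  # field name from the PR title; suffix matching is an assumption


def check_masking_compat(expected: dict[str, float], loaded: dict[str, float]) -> None:
    """Hypothetical check: raise on real mismatches, warn if only IGNORE differs."""
    if expected.keys() != loaded.keys():
        raise ValueError("Loaded model has different fields than expected")

    mismatched = [key for key in expected if expected[key] != loaded[key]]
    ignore_only = [key for key in mismatched if key.endswith(IGNORE_SUFFIX)]
    other = [key for key in mismatched if not key.endswith(IGNORE_SUFFIX)]

    # Any mismatch outside the IGNORE field is still treated as an error.
    if other:
        raise ValueError(f"Mismatched fields: {other}")

    # A mismatched IGNORE value (e.g. -1e5 vs -inf) only triggers a warning.
    for key in ignore_only:
        warnings.warn(
            f"{key}: loaded value {loaded[key]} differs from expected "
            f"{expected[key]}; loading anyway."
        )


# Example: an old model saved with -1e5 loads with a warning instead of an error.
check_masking_compat(
    {"attn.IGNORE": float("-inf"), "scale": 1.0},
    {"attn.IGNORE": -1e5, "scale": 1.0},
)
```

The design choice is to keep every other mismatch fatal, so the relaxed check cannot hide unrelated incompatibilities.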

Motivation and Context

I sometimes need to load old mod add models and want to be able to reproduce earlier results, so this is needed for backwards compatibility.

How Has This Been Tested?

I can now load the old models, and I checked that the mod add models do not have large attention scores.
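
As a rough illustration of why the size of the attention scores matters, here is a minimal pure-Python sketch (not code from the repo; `masked_softmax` is a hypothetical helper). With moderate scores, masking with -1e5 gives the same attention pattern as masking with -inf; the two only diverge if an unmasked score is itself far below -1e5.

```python
import math


def masked_softmax(scores: list[float], keep: list[bool], ignore_value: float) -> list[float]:
    """Softmax over scores, with masked-out positions replaced by ignore_value."""
    masked = [s if k else ignore_value for s, k in zip(scores, keep)]
    # Subtract the max for numerical stability (assumes at least one finite entry).
    top = max(masked)
    exps = [math.exp(x - top) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]


scores = [1.0, 2.0, 3.0]
keep = [True, True, False]  # last position is masked out (a "future" token)

# Moderate scores: both masking values give the same pattern.
same = masked_softmax(scores, keep, -1e5) == masked_softmax(scores, keep, float("-inf"))

# Extreme scores: an unmasked score below -1e5 lets the masked position leak.
leak = masked_softmax([-2e5, 0.0], [True, False], -1e5)[1]
```

Here `same` is True and `leak` is 1.0, which is the failure mode the check on attention score magnitude is meant to rule out.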

Does this PR introduce a breaking change?

No

stefan-apollo commented 10 months ago

Implemented logging as requested on Slack, see here for screenshots.