AlignmentResearch / tuned-lens

Tools for understanding how transformer predictions are built layer-by-layer
https://tuned-lens.readthedocs.io/en/latest/
MIT License
432 stars 47 forks

Refactors and adds tests for `tuned_lens.data` #78

Closed levmckinney closed 1 year ago

levmckinney commented 1 year ago

This pull request should hopefully solve #60 and correct some issues with the nats_to_bpb ratio calculation.

First, I believe #60 was actually happening during the computation of the nats_to_bpb ratio and not during the actual tokenization step. Tokenization and nats_to_bpb ratio computation have now been combined into a single function, hopefully resolving #60. This combination also helps correct a bias in the nats_to_bpb ratio calculation that was previously caused by discarding the final batch.
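To illustrate the idea, here is a minimal sketch of computing the ratio in the same pass as tokenization. This is not the code in this PR: the function name `chunk_and_ratio`, its parameters, and the whitespace tokenizer in the usage note are all hypothetical, and the sketch assumes the ratio is defined as tokens per UTF-8 byte divided by ln(2). The point it shows is that byte and token counts are accumulated over every document, so nothing is left out of the ratio even when a trailing partial chunk is dropped from the output.

```python
import math
from typing import Callable, Iterable, List, Tuple


def chunk_and_ratio(
    texts: Iterable[str],
    tokenize: Callable[[str], List[str]],  # hypothetical tokenizer callback
    chunk_size: int,
) -> Tuple[List[List[str]], float]:
    """Tokenize, chunk, and compute a nats-to-bits-per-byte ratio in one pass.

    Assumed definition: ratio = (total_tokens / total_bytes) / ln(2).
    """
    total_bytes = 0
    total_tokens = 0
    buffer: List[str] = []
    chunks: List[List[str]] = []

    for text in texts:
        tokens = tokenize(text)
        # Count every document, including text that ends up in the
        # trailing partial chunk, so the ratio is not skewed.
        total_bytes += len(text.encode("utf-8"))
        total_tokens += len(tokens)

        buffer.extend(tokens)
        while len(buffer) >= chunk_size:
            chunks.append(buffer[:chunk_size])
            buffer = buffer[chunk_size:]

    if total_bytes == 0:
        raise ValueError("Cannot compute a ratio from empty input.")

    # Any tokens still in `buffer` form an incomplete chunk and are not
    # emitted here, but they were already counted above.
    nats_to_bpb = (total_tokens / total_bytes) / math.log(2)
    return chunks, nats_to_bpb
```

For example, `chunk_and_ratio(["a b c", "d e"], str.split, chunk_size=2)` uses a whitespace split as a stand-in tokenizer. Multiplying a per-token loss in nats by the returned ratio would then give bits per byte under the assumed definition.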