Moved data shuffling to just before tokenization step

Codecov Report

Merging #110 (7b3172c) into main (51e988b) will decrease coverage by 0.05%. Report is 6 commits behind head on main. The diff coverage is 71.42%.

:exclamation: Current head 7b3172c differs from the pull request's most recent head 21db0bf. Consider uploading reports for commit 21db0bf to get more accurate results.

@@            Coverage Diff             @@
##             main     #110      +/-   ##
==========================================
- Coverage   81.08%   81.04%   -0.05%     
==========================================
  Files          32       32              
  Lines        2141     2147       +6     
==========================================
+ Hits         1736     1740       +4     
- Misses        405      407       +2

| Files Changed | Coverage Δ | |
|---|---|---|
| tuned_lens/scripts/ingredients.py | `85.56% <71.42%> (-0.63%)` | :arrow_down: |

AlignmentResearch / tuned-lens
