Closed USSiamaboat closed 2 weeks ago
The new dataset.py breaks the code in train.py, which relies on the old dataset.py, but the code in train.py doesn't run anyway.
Also, tokenizer_10m/ should not be getting pushed to git. Maybe a tarball of that folder could be, but the actual folder should not be.
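For anyone following along, here is a rough sketch of what the tarball idea could look like in Python. This is only an illustration, not necessarily what was implemented in the repo: the helper name make_tarball and the default output name tokenizer_10m.tar.gz are assumptions.

```python
import tarfile
from pathlib import Path
from typing import Optional


def make_tarball(folder: str, archive: Optional[str] = None) -> Path:
    """Pack `folder` into a gzip-compressed tarball and return the archive path.

    Hypothetical helper: the name and the default output location
    (<folder>.tar.gz next to the folder) are assumptions for this sketch.
    """
    src = Path(folder)
    if not src.is_dir():
        raise NotADirectoryError(f"{src} is not a directory")

    # Default: tokenizer_10m/ -> tokenizer_10m.tar.gz in the same parent dir.
    out = Path(archive) if archive else src.with_name(src.name + ".tar.gz")

    with tarfile.open(out, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the folder name,
        # so extracting recreates tokenizer_10m/ rather than an absolute path.
        tar.add(src, arcname=src.name)
    return out


if __name__ == "__main__":
    if Path("tokenizer_10m").is_dir():
        print(make_tarball("tokenizer_10m"))
```

With something like this, the folder itself can be listed in .gitignore (tokenizer_10m/) so that only the archive, or nothing at all, gets committed.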
Implemented the tarball approach for now. Where would the files go if we don't put them on GitHub?
Typically we would use AWS S3 for something like this, but we can use Drive for our project. Gave you access to a data folder in the Mini Copilot Drive folder.
Force pushed to undo bad merge