Closed · tingwl0122 closed this 4 months ago
It looks like it is not straightforward to skip Darglint for specific folders...
Ignore the ruff rules [PLR0914, PLR0915] for the functions in `train.py`.
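One way to scope those ignores to just that file, rather than disabling the rules globally, is `ruff`'s per-file ignores (a sketch, assuming ruff is configured via `pyproject.toml`; the path is the one discussed in this PR):

```toml
# Sketch: ignore too-many-locals (PLR0914) and too-many-statements
# (PLR0915) only for the training script, not repo-wide.
[tool.ruff.lint.per-file-ignores]
"dattri/benchmark/maestro/train.py" = ["PLR0914", "PLR0915"]
```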
@xingjian-zhang please take a rough look at the structure. Under the benchmark folder we will have multiple datasets, with some datasets sharing models (e.g., resnet or GPT). We are thinking of having a `benchmark/models` folder for the model code and a `benchmark/<dataset name>` folder for the training and eval code of each dataset. Please comment in the review if you have suggestions.
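As a sketch of that layout (the dataset and file names below are illustrative, not final):

```text
dattri/benchmark/
├── models/            # shared model code (e.g., resnet, GPT)
├── maestro/           # training/eval code for the MAESTRO dataset
│   └── train.py
└── <dataset name>/    # training/eval code for each other dataset
```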
Thanks for the review, @xingjian-zhang!
For the first comment: since it is more like a script that tests the entire pipeline, including file downloading, preprocessing, etc., I only tested it locally and didn't want it to run through pytest. Also, according to #51's structure, we will put such working scripts under `dattri/scripts/`.
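One common way to keep such end-to-end scripts out of the default `pytest` run, while still keeping them runnable on demand, is a custom marker (a sketch; the marker name is my own assumption, not something from this PR):

```toml
# Sketch: register a "pipeline" marker and exclude it by default;
# run the full-pipeline checks explicitly with `pytest -m pipeline`.
[tool.pytest.ini_options]
markers = [
    "pipeline: end-to-end pipeline tests (download, preprocess, train)",
]
addopts = "-m 'not pipeline'"
```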
Also, I will add a simpler test file under `test/dattri/benchmark/` to test the basic functionality of `dattri/benchmark/maestro/train.py` (possibly splitting some functions out to other places).
For the second comment, I think @TheaperDeng has written something similar to this in #51 (`dattri/scripts/retrain.py`).
Thanks! #51 is pretty similar to what I am thinking.
Description
This PR implements the train/eval functions and scripts for MusicTransformer (MT) on the MAESTRO dataset.
1. Motivation and Context
To implement the corresponding benchmark experiment. Note: #52 discusses importing the MusicTransformer models/training/evaluation function into our repo.
2. Summary of the change

- `dattri/benchmark/maestro` folder, which handles the MT training, loss calculation, and MAESTRO dataset creation
- `test/dattri/benchmark/test_maestro.py` to test the functions in `dattri/benchmark/maestro.py`
- `dattri/scripts/retrain.py` to handle this new benchmark experiment

Note: `test/dattri/benchmark` is meant to only test the basic functionalities.

3. What tests have been added/updated for the change?