Closed dlits21 closed 1 year ago
Please change the pull request to the dev branch.
Currently, the example creates the dataloader in the dataset class, for conistency, could you write the example similarly to the other dataset examples. Using a consistent method (first load the dataset, then load that dataset into your dataloader) allows users to use the same method across all datasets of NeuroBench. You can look at the MackeyGlass implementation, where Subset is used.
Changed Dataloader
Hi Noah, test cases for the dataloader? And testing all the methods or anything specific?
The minimum would be just be checking IO to make sure the shape is what you'd expect. Like just a simple test case that checks the output of the dataset wrapped in a DataLoader so that if something changes down the road that breaks this we know. If there's hyperparameters that you can pass to the dataset that changes the overall shape of the data, maybe write tests that check those too. That should give pretty good coverage but the more thoroughly you can test each function the better. Example unit tests can be found in the tests folder.
Cool thanks! I at first misunderstood what you meant by test cases
@EssentiallyNoah I added some tests about the hyperparameters @korneelf1 @jasonlyik I changed the getter method to return the 3D version for both input and output. Should Velocity be a window as well? Arindam's group used to only predict a single velocity. For IMEC, we reported the window, but are flexible...
@dlits21 Looks good, I think we should check with @vinniesun whether the always-3d mode is compatible with their models.
I think for now velocity as a single point is fine for a standard benchmark.
For tests, Noah means to add pytest in the test folder, would you be able to implement checks there?https://github.com/NeuroBench/neurobench/blob/dev/tests/test_datasets.py
Ok then I can change the code to a single point for now. I can also add tests later today
@jasonlyik Let me pass this on to Biyan to check if it's compatible with her model.
@dlits21 Vincent and I cleared on the dataset format, once we validate your dataset changes with the tests we can clear this PR
Sounds good, I also added now a simple test for the dataloader checking the dimensions
@dlits21
There is a difference in the label format from the old dataloader "3D" mode and what it's now changed to.
The old format for labels outputs one timestep per sample, the new one outputs num_steps
timesteps per sample.
Old: https://github.com/NeuroBench/neurobench/blob/dev/neurobench/datasets/primate_reaching.py#L113
label = self.labels[:, idx]
New: https://github.com/NeuroBench/neurobench/blob/dev_phueber/neurobench/datasets/primate_reaching.py#L118
return self.samples[:, mask].transpose(0, 1), self.labels[:, mask].transpose(0, 1)
The benchmark.py
with ANN follows the old label format, and the new benchmark_SNN.py
works with the new label format.
How should we make this consistent?
Also pinging @vinniesun, which label format do your models follow?
Hi Jason, That is what i meant with 3D outputs before...
For testing our SNN this does not matter because then I can change the output dimensions accordingly. For training the SNN a timeseries would be important though.
@dlits21 Not sure I completely understand, are you suggesting that we use the multi-step label format because it assists with training?
I am saying that for testing our network it does not matter. And I can change it in a couple of minutes.
But maybe other (recurrent) networks need multi step labels (to counter vanishing / exploding gradients)
@jasonlyik the data format we use is the new one (data_point, 2).
@dlits21 Ok, would you be able to restore the label format to (data_point, 2) for consistency with the Benchmark? Could the multi-step labeling be implemented in the training pipeline, rather than in the dataset itself?
@jasonlyik Done
Well it could be done, but it would require some hacking around
@dlits21 @vinniesun Merging this PR as the dataset format for the primate reaching task is stable.
Model parameters will be submitted tonight