SNN IMEC - Githubissues

dlits21 commented 1 year ago

Model parameters will be submitted tonight

korneelf1 commented 1 year ago

Please change the pull request to the dev branch.

korneelf1 commented 1 year ago

Currently, the example creates the dataloader in the dataset class, for conistency, could you write the example similarly to the other dataset examples. Using a consistent method (first load the dataset, then load that dataset into your dataloader) allows users to use the same method across all datasets of NeuroBench. You can look at the MackeyGlass implementation, where Subset is used.

dlits21 commented 1 year ago

Changed Dataloader

dlits21 commented 1 year ago

Hi Noah, test cases for the dataloader? And testing all the methods or anything specific?

EssentiallyNoah commented 1 year ago

The minimum would be just be checking IO to make sure the shape is what you'd expect. Like just a simple test case that checks the output of the dataset wrapped in a DataLoader so that if something changes down the road that breaks this we know. If there's hyperparameters that you can pass to the dataset that changes the overall shape of the data, maybe write tests that check those too. That should give pretty good coverage but the more thoroughly you can test each function the better. Example unit tests can be found in the tests folder.

dlits21 commented 1 year ago

Cool thanks! I at first misunderstood what you meant by test cases

dlits21 commented 1 year ago

@EssentiallyNoah I added some tests about the hyperparameters @korneelf1 @jasonlyik I changed the getter method to return the 3D version for both input and output. Should Velocity be a window as well? Arindam's group used to only predict a single velocity. For IMEC, we reported the window, but are flexible...

jasonlyik commented 1 year ago

@dlits21 Looks good, I think we should check with @vinniesun whether the always-3d mode is compatible with their models.

I think for now velocity as a single point is fine for a standard benchmark.

For tests, Noah means to add pytest in the test folder, would you be able to implement checks there?https://github.com/NeuroBench/neurobench/blob/dev/tests/test_datasets.py

dlits21 commented 1 year ago

Ok then I can change the code to a single point for now. I can also add tests later today

vinniesun commented 1 year ago

@jasonlyik Let me pass this on to Biyan to check if it's compatible with her model.

jasonlyik commented 1 year ago

@dlits21 Vincent and I cleared on the dataset format, once we validate your dataset changes with the tests we can clear this PR

dlits21 commented 1 year ago

Sounds good, I also added now a simple test for the dataloader checking the dimensions

jasonlyik commented 1 year ago

@dlits21 There is a difference in the label format from the old dataloader "3D" mode and what it's now changed to. The old format for labels outputs one timestep per sample, the new one outputs num_steps timesteps per sample.

Old: https://github.com/NeuroBench/neurobench/blob/dev/neurobench/datasets/primate_reaching.py#L113 label = self.labels[:, idx]

New: https://github.com/NeuroBench/neurobench/blob/dev_phueber/neurobench/datasets/primate_reaching.py#L118 return self.samples[:, mask].transpose(0, 1), self.labels[:, mask].transpose(0, 1)

The benchmark.py with ANN follows the old label format, and the new benchmark_SNN.py works with the new label format.

How should we make this consistent?

Also pinging @vinniesun, which label format do your models follow?

dlits21 commented 1 year ago

Hi Jason, That is what i meant with 3D outputs before...

For testing our SNN this does not matter because then I can change the output dimensions accordingly. For training the SNN a timeseries would be important though.

jasonlyik commented 1 year ago

@dlits21 Not sure I completely understand, are you suggesting that we use the multi-step label format because it assists with training?

dlits21 commented 1 year ago

I am saying that for testing our network it does not matter. And I can change it in a couple of minutes.

But maybe other (recurrent) networks need multi step labels (to counter vanishing / exploding gradients)

vinniesun commented 1 year ago

@jasonlyik the data format we use is the new one (data_point, 2).

jasonlyik commented 1 year ago

@dlits21 Ok, would you be able to restore the label format to (data_point, 2) for consistency with the Benchmark? Could the multi-step labeling be implemented in the training pipeline, rather than in the dataset itself?

dlits21 commented 1 year ago

@jasonlyik Done

Well it could be done, but it would require some hacking around

jasonlyik commented 1 year ago

@dlits21 @vinniesun Merging this PR as the dataset format for the primate reaching task is stable.

NeuroBench / neurobench

SNN IMEC #88