NeuroBench / neurobench

Benchmark harness and baseline results for the NeuroBench algorithm track.
https://neurobench.readthedocs.io
Apache License 2.0

Nonhuman Primate Reaching Dataset First Reach #210

Closed morenzoe closed 1 month ago

morenzoe commented 1 month ago

Hi, I would like to ask about the NHP Reaching Dataset. I found that the indices of the training segments in ind_train don't start from zero. Could you please explain why you decided not to use the data recorded before the first change in the target coordinate (target_pos)? Thank you in advance!

Here's my code:

from neurobench.datasets import PrimateReaching

filename = "indy_20160622_01"

# The dataloader and preprocessor have been combined into a single class
data_dir = "../../../data/primate_reaching/PrimateReachingDataset/" # data in repo root dir
dataset = PrimateReaching(file_path=data_dir, filename=filename,
                        num_steps=1, train_ratio=0.5, bin_width=0.004,
                        biological_delay=0, remove_segments_inactive=False)

dataset.ind_train[0]

jasonlyik commented 1 month ago

@vinniesun This appears to be a question about the split and load of the data, could you help to answer?

vinniesun commented 1 month ago

@morenzoe Thank you for pointing this bug out. It seems when we modified the code related to remove_segments_inactive, this was not adjusted accordingly.

@jasonlyik should I create a new branch to add the fix? Line 336 in neurobench/datasets/primate_reaching.py should be updated from:

start_end = np.array([indices[:-1], indices[1:]])

to:

start_end = np.array([[0]+indices[:-1], [indices[0]]+indices[1:]])
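
To illustrate the effect of the proposed change, here is a minimal sketch with made-up change indices. It assumes indices is a plain Python list, so that + concatenates. The actual variable in primate_reaching.py may be a different type, and with a NumPy array the same expression would add element-wise instead of prepending.

```python
import numpy as np

# Hypothetical sample indices at which target_pos changes (illustrative only).
indices = [3, 7, 12]

# Current behaviour: segments are built only between consecutive changes,
# so samples 0..2 before the first change belong to no segment.
original = np.array([indices[:-1], indices[1:]])

# Proposed behaviour: prepend a segment running from sample 0 to the first change.
proposed = np.array([[0] + indices[:-1], [indices[0]] + indices[1:]])

print(original.tolist())  # [[3, 7], [7, 12]]
print(proposed.tolist())  # [[0, 3, 7], [3, 7, 12]]
```

Row 0 holds segment starts and row 1 holds segment ends, so the proposed version gains one extra column for the first reach.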

morenzoe commented 1 month ago

@vinniesun Happy to help! Could you please check the last reach too? In recording indy_20160622_01, ind_test ends at index 611892, while the whole dataset ends at index 612420. Thank you in advance!
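
One way to cover both the first and the last reach is to pad the change indices with 0 and the total sample count before pairing them up. The sketch below is only an illustration of that idea: segment_bounds and total_samples are hypothetical names, and this is not necessarily how the library handles the trailing samples.

```python
import numpy as np

def segment_bounds(change_indices, total_samples):
    """Return a 2 x N array of [starts; ends] covering samples 0..total_samples."""
    # Pad with the first sample and the end of the recording.
    edges = np.concatenate(([0], np.asarray(change_indices, dtype=int), [total_samples]))
    # np.unique sorts and drops duplicates, e.g. if a change falls exactly on 0.
    edges = np.unique(edges)
    return np.stack([edges[:-1], edges[1:]])

# Made-up indices, not taken from any recording.
bounds = segment_bounds([3, 7, 12], total_samples=15)
print(bounds.tolist())  # [[0, 3, 7, 12], [3, 7, 12, 15]]
```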

On another note, can I use PrimateReaching() for the rest of Indy's and Loco's recordings, besides the six selected recordings used in the paper?

vinniesun commented 1 month ago

@morenzoe Let me double check that part.

And yes, you can use this function for all the recordings provided on the Zenodo site.

morenzoe commented 1 month ago

@vinniesun This might be a separate issue, but I got this error when reading other recordings:

File /opt/conda/lib/python3.10/site-packages/neurobench/datasets/primate_reaching.py:192, in PrimateReaching.download(self)
    190 def download(self):
    191     """Download the Primate Reaching data if it doesn't exist already."""
--> 192     md5 = self.md5s[self.filename]
    194     if self._check_exists(self.file_path, md5):
    195         return

KeyError: 'indy_20160921_01.mat'

The MD5 checksum is optional in the download_url() function in utils.py, but the other recordings' filenames are not listed in the md5s dictionary in primate_reaching.py, which causes the error. Could you lend me a hand with this one too? Thanks!
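
The failure mode described here can be reproduced with a plain dictionary. The table below is a hypothetical stand-in, not the library's actual md5s contents, and lookup_md5 is an illustrative helper rather than part of neurobench.

```python
# Hypothetical stand-in for a checksum table keyed by file name.
md5s = {"indy_20160622_01.mat": "<md5 of a listed file>"}

def lookup_md5(filename):
    """Return the known checksum, or None when the file is not listed."""
    return md5s.get(filename)

# Direct indexing raises for any file that is not in the table.
try:
    md5s["indy_20160921_01.mat"]
except KeyError as err:
    print("KeyError:", err)

# dict.get falls back to None, which an optional-checksum downloader could accept.
print(lookup_md5("indy_20160921_01.mat"))  # None
```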

vinniesun commented 1 month ago

@morenzoe In neurobench, downloading is limited to the six specified files. For the other files, I suggest downloading them yourself and passing them to the PrimateReaching class with the download argument set to False.
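
A rough sketch of that workflow follows. load_local_recording is a hypothetical wrapper, and the assumption that the file on disk is named "<filename>.mat" is inferred from the KeyError above rather than confirmed. The keyword arguments mirror the call earlier in this thread.

```python
from pathlib import Path

def load_local_recording(data_dir, filename, **kwargs):
    """Load a manually downloaded recording without triggering a download."""
    # Assumption: the file on disk is named "<filename>.mat".
    path = Path(data_dir) / f"{filename}.mat"
    if not path.is_file():
        raise FileNotFoundError(f"Download {path.name} manually into {data_dir} first")
    # Imported lazily so the existence check works even without neurobench installed.
    from neurobench.datasets import PrimateReaching
    return PrimateReaching(file_path=data_dir, filename=filename,
                           download=False, **kwargs)
```

It could then be called as load_local_recording(data_dir, "indy_20160921_01", num_steps=1, train_ratio=0.5, bin_width=0.004, biological_delay=0, remove_segments_inactive=False).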

morenzoe commented 1 month ago

@vinniesun Alright, it works as per your instruction. Thank you for your help!

jasonlyik commented 1 month ago

@vinniesun Yup, glad you figured it out. Could you open a PR to fix? And I can update the package ASAP after.

vinniesun commented 1 month ago

@jasonlyik sure thing. I'll have to do it tomorrow though, got church duties tonight!

jasonlyik commented 1 month ago

Fixed by #212. The main branch is being updated, and the fix will appear in pip package 1.0.4.