Model Performance Issues with the IBL Dataset

PPWangyc commented 1 month ago

Thank you for your inspiring work!

I am currently attempting to implement Neuroformer on the International Brain Laboratory (IBL) dataset, but I'm encountering some issues. Our fork with the code we used is available here: GitHub - Neuroformer on IBL.

While the training curve looks promising for neuron ID/time classification, the behavior prediction performance (r²) is disappointingly low. Here is a link to our training run: WandB - Training Run Details.
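For reference, here is a minimal sketch of how an r² score for a decoded behavior trace can be computed. It assumes the coefficient of determination is the intended metric; the metric actually used lives in read.py and may differ (for example, a squared Pearson correlation). The function name `r2_score` and its inputs are hypothetical and used only for illustration.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Hypothetical helper for illustration; not taken from the fork.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Residual sum of squares between target and prediction.
    ss_res = np.sum((y_true - y_pred) ** 2)
    # Total sum of squares around the mean of the target.
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    if ss_tot == 0.0:
        # r² is undefined when the target is constant.
        return float("nan")
    return float(1.0 - ss_res / ss_tot)
```

Under this definition, a score near zero means the decoder does no better than predicting the mean of the behavior, and a negative score means it does worse.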

We followed this procedure for pretraining and finetuning on a specific session of the IBL data:

cd script
source pretrain.sh db4df448-e449-4a6f-a0e7-288711e7a75a
source finetune.sh db4df448-e449-4a6f-a0e7-288711e7a75a

For inference, we used:

source inference.sh db4df448-e449-4a6f-a0e7-288711e7a75a finetune

The performance for both spikes and decoded behaviors remains suboptimal. Our inference script and evaluation methods are detailed in neuroformer_inference.py and read.py, respectively.

Given that Neuroformer has shown excellent results with calcium imaging data as per the original paper, do you think the model may not be well-suited for the IBL dataset (ephys data / spikes)? I would greatly appreciate any insights or suggestions you might have on this matter.

Thank you for your valuable advice.

a-antoniades commented 1 month ago

Thanks for trying out the model on your data. I hope to be able to help you get better results.

This model was tested on calcium imaging (ophys) data. Is the data you're using ephys?

Looking at your config, I see the following:

window:
  frame: 0.15
  curr: 2
  prev: 2

While the window for behavioral prediction is:

modalities:
  behavior:
    n_layers: 4
    variables:
      # se:
      #   data: se # wheel_speed and whisker_motion_energy
      #   dt: 0.02
      #   objective: null
      #   predict: false
      wheel_speed:
        data: wheel_speed # wheel_speed and whisker_motion_energy
        dt: 0.02
        objective: null
        predict: false
      whisker_energy:
        data: whisker_energy # wheel_speed and whisker_motion_energy
        dt: 0.02
        objective: null
        predict: false
    window: 0.02

A 2 second window_curr seems too large. Note that the original model used a window of 0.05, so that the sliding window over the stimuli and behaviors stays tightly synced with the neural features of the current state.

Using different current_state and behavior windows probably also leads to misalignment of these features in the dataloader (I haven't tested that before).
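To make the mismatch concrete, here is a rough sketch that compares the two windows using the values from the config quoted above. The dictionary layout and the `window_report` helper are hypothetical and not part of the Neuroformer codebase; they only illustrate the arithmetic.

```python
# Values copied from the config quoted in this thread. The flat dict
# layout is an assumption made for this example only.
config = {
    "window": {"frame": 0.15, "curr": 2, "prev": 2},
    "behavior": {"window": 0.02, "dt": 0.02},
}


def window_report(cfg, tol=1e-9):
    """Compare the current-state window with the behavior window.

    Hypothetical helper: returns how many behavior windows and behavior
    samples fit into one current-state window, and whether the two
    window lengths are equal within `tol`.
    """
    curr = float(cfg["window"]["curr"])
    beh_window = float(cfg["behavior"]["window"])
    beh_dt = float(cfg["behavior"]["dt"])
    return {
        "curr_over_behavior": curr / beh_window,
        "behavior_steps_per_curr": curr / beh_dt,
        "windows_match": abs(curr - beh_window) < tol,
    }


report = window_report(config)
print(report)
```

With these values, one current-state window covers roughly 100 behavior windows, so each behavior sample is paired with neural features pooled over a much longer span than the behavior itself.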

Could you give me some feedback on these?

a-antoniades commented 1 month ago

Closing this issue since I haven't received a response. I'm still happy to help get the model working better with your data, so feel free to re-open if you want.

a-antoniades / Neuroformer

Model Performance Issues with the IBL Dataset #8