OHBA-analysis / osl-dynamics

Methods for studying dynamic functional brain activity in neuroimaging data.
https://osl-dynamics.readthedocs.io/

errors when running random_state_time_course_initialization #262

Closed suhnyoungjun closed 3 weeks ago

suhnyoungjun commented 5 months ago

Hi, I have three datasets. The first dataset processed successfully without any issues. However, I encountered different errors for the second and third datasets when running random_state_time_course_initialization. Please find the attached images to see the raised errors.

Thank you for your time and help in advance.

[Two screenshots attached, dated 2024-06-25]

cgohil8 commented 5 months ago

There's an issue with the dataset: it seems like it can't be iterated over. It could be a hyperparameter issue, with sequences being too long or too short so that you end up with only 1 batch, or something like that.

Can you copy and paste the script/code you're using to load the data and train the model?

suhnyoungjun commented 4 months ago

Thanks a lot in advance for taking the time to look into the issue!

'data_alltasks', 'data_tasks', and 'data_all' are loaded successfully, and HMM models were also built successfully for all of them. However, random_state_time_course_initialization and the fit function didn't work for hmm_tasks and hmm_all.

Best

[Nine screenshots attached, dated 2024-07-09]

cgohil8 commented 4 months ago

I think your issue might be that your sequence length is such that you have fewer total sequences than the batch size. What are you using for sequence_length and batch_size, and what is the typical length of the data in each file?

Do you get the same problem if you pass your Data object to model.fit() (i.e. skip the initialization)?

suhnyoungjun commented 4 months ago

Yes, I do get the same problem when I pass it to model.fit directly. I am using sequence_length = 1200 and batch_size = 8. The length of my data varies quite a lot; it ranges from 176 timepoints to 1200 timepoints.

cgohil8 commented 4 months ago

I think you might need to use a sequence length shorter than the shortest time series. Maybe try sequence_length=175.
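
To make the reasoning above concrete, here is a minimal sketch of how many sequences each recording yields. It assumes each session is cut into full, non-overlapping sequences and that any leftover samples are dropped; that is an assumption about the batching, not something confirmed in this thread. The helper name and the session lengths are hypothetical; only the 176 and 1200 endpoints come from the thread.

```python
def sequences_per_session(session_lengths, sequence_length):
    """Count full, non-overlapping sequences per session (remainder dropped).

    Assumption: this mirrors how the data are windowed before batching.
    """
    return [n // sequence_length for n in session_lengths]


# Hypothetical session lengths spanning the range reported in this thread
session_lengths = [176, 600, 1200]
batch_size = 8

for sequence_length in (1200, 175):
    counts = sequences_per_session(session_lengths, sequence_length)
    total = sum(counts)
    # A session shorter than sequence_length contributes zero sequences
    print(sequence_length, counts, total, total >= batch_size)
```

Under these assumptions, sequence_length=1200 leaves a single sequence in total, which is fewer than the batch size of 8, whereas sequence_length=175 lets every session contribute at least one sequence.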

suhnyoungjun commented 4 months ago

Thanks for your advice! I set the sequence length to be task-specific (data_tasks[session][task_type].n_samples//data_tasks[session][task_type].n_sessions), but it didn't work out. I tried different batch sizes as well, but they didn't work either.

[Image attached]

cgohil8 commented 4 months ago

You can create a dataset directly and check that you're able to iterate through it:

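from osl_dynamics.data import Data

# Replace each ... with your own inputs and hyperparameters
data = Data(..., use_tfrecord=True)
dataset = data.tfrecord_dataset(sequence_length=..., batch_size=...)
for i, ds in enumerate(dataset):
    print(i, ds)  # prints the index and contents of each batch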

Make sure you set sequence_length and batch_size to values that give you a dataset you're able to iterate over.

scho97 commented 1 month ago

@suhnyoungjun Could you let us know whether your errors are resolved?

suhnyoungjun commented 1 month ago

Unfortunately, the issue persisted despite trying various batch_size and sequence_length options. I decided to revert to the MATLAB version of the toolbox, which has been working well for me.

scho97 commented 1 month ago

Sorry to hear that. I'm glad the MATLAB version is working for you. If you're happy with using the MATLAB-based toolbox, I'll close this issue for now. However, if you want further help with the Python version, please let me know.

Just in case: you said you set the sequence length to (data_tasks[session][task_type].n_samples//data_tasks[session][task_type].n_sessions), but the sequence length should satisfy samples / (sequence_length * batch_size) > 1.

If the minimum length of your data is 176 samples, this means that, with a batch size of 8, you would need a sequence length of less than 22 samples (maybe try 10 samples).
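
The arithmetic above can be checked with a short sketch. It only encodes the rule of thumb stated in this comment, samples / (sequence_length * batch_size) > 1, and does not call osl-dynamics; both function names are hypothetical.

```python
def satisfies_rule(n_samples, sequence_length, batch_size):
    """Check the rule of thumb: samples / (sequence_length * batch_size) > 1."""
    return n_samples / (sequence_length * batch_size) > 1


def max_sequence_length(n_samples, batch_size):
    """Largest integer sequence length for which the rule still holds.

    The rule is equivalent to sequence_length * batch_size < n_samples.
    """
    return (n_samples - 1) // batch_size


n_samples, batch_size = 176, 8  # values quoted in this thread

print(max_sequence_length(n_samples, batch_size))   # largest allowed value
print(satisfies_rule(n_samples, 22, batch_size))    # exactly 22 fails: the ratio equals 1
print(satisfies_rule(n_samples, 10, batch_size))    # the suggested 10 passes
```

With 176 samples and a batch size of 8, the bound is 176 / 8 = 22, so the sequence length has to be strictly below 22.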

scho97 commented 3 weeks ago

Closing this issue due to inactivity. For users encountering similar problems, please make sure your osl-dynamics package is up to date. The recent update in pull request #297 improves how batching is validated against user inputs.

@suhnyoungjun, please feel free to reopen this issue if further questions arise. Thank you.