935963004 / LaBraM

ICLR 2024 spotlight

Using .cnt files for classification tasks #17

Open upper127 opened 6 months ago

upper127 commented 6 months ago

If my dataset consists of .cnt files, what code do I need to use or write in order to use the model? I see a couple of preprocessing scripts: make_h5dataset_for_pretrain.py and dataset.py. Can I just change the folder path and run them directly, or do I need to write my own code? If I need to write my own, what should I pay attention to? I've seen a lot of people asking how to run the model on their own datasets. Could you provide some help? Thank you.

935963004 commented 6 months ago

You can use make_h5dataset_for_pretrain.py to convert .cnt files into .hdf5 files for pre-training. If you want to fine-tune on a classification task with your own dataset, we recommend using the fine-tuning script provided in the README. For your own dataset, you need to write the preprocessing and dataloader code yourself and replace the dataloader part of run_class_finetuning.py with yours.
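As a rough illustration of what "your own dataloader" could look like, here is a minimal sketch of a map-style dataset that cuts preprocessed recordings into fixed-length windows, plus a helper for a train/validation/test split. Everything in it is an assumption for illustration: `WindowedEEGDataset`, `split_indices`, the 4-second window, and the one-label-per-recording layout are hypothetical and do not come from the LaBraM repository. Check the existing loaders in run_class_finetuning.py for the shapes and scaling the model actually expects.

```python
import numpy as np


class WindowedEEGDataset:
    """Map-style dataset: only __len__ and __getitem__ are implemented.

    Hypothetical helper, not part of the LaBraM repository.
    """

    def __init__(self, recordings, labels, sfreq=200, window_sec=4):
        # recordings: list of arrays shaped (n_channels, n_samples)
        # labels: one integer class label per recording (assumption)
        self.windows = []
        self.targets = []
        win = int(sfreq * window_sec)
        for data, label in zip(recordings, labels):
            # Non-overlapping windows; a trailing partial window is dropped.
            for start in range(0, data.shape[1] - win + 1, win):
                self.windows.append(data[:, start:start + win].astype(np.float32))
                self.targets.append(int(label))

    def __len__(self):
        return len(self.windows)

    def __getitem__(self, idx):
        return self.windows[idx], self.targets[idx]


def split_indices(n, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle indices 0..n-1 and return (train, val, test) index arrays."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n)
    n_val = int(n * val_frac)
    n_test = int(n * test_frac)
    return order[n_val + n_test:], order[:n_val], order[n_val:n_val + n_test]
```

An object with `__len__` and `__getitem__` like this can be passed to `torch.utils.data.DataLoader`. Applying `split_indices` to recordings or subjects rather than to individual windows avoids putting windows from the same recording in both the training and test sets.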

upper127 commented 6 months ago

Is it necessary to execute modeling_pretrain.py or can I run run_class_finetuning.py directly after writing the preprocessing and data loading code?

935963004 commented 6 months ago

It's up to you. You can load our provided pre-trained checkpoint and fine-tune it on your own dataset. If you find the performance unsatisfactory, you can also try pre-training a new model on your own dataset.

upper127 commented 6 months ago

So I just remove the unused channels from the .cnt file, band-pass filter between 0.1 Hz and 75 Hz, apply a notch filter at 50 Hz, resample to 200 Hz, split the data into training, validation, and test sets, and swap in my own data loader in the code, and then the classification task should work?

935963004 commented 6 months ago

Right
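For reference, the steps confirmed above can be sketched on a plain NumPy array as follows. This is a minimal sketch under stated assumptions, not the repository's own preprocessing: the function name `preprocess`, the 4th-order Butterworth filter, and the notch quality factor `Q=30` are arbitrary illustrative choices, and the input sampling rate is assumed to be an integer above 150 Hz so that the 75 Hz band edge is valid. Loading the .cnt file is left out; for Neuroscan .cnt files, `mne.io.read_raw_cnt` is one way to get the data array and sampling rate.

```python
import math

import numpy as np
from scipy.signal import butter, filtfilt, iirnotch, resample_poly, sosfiltfilt


def preprocess(data, sfreq, keep_idx, target_sfreq=200):
    """Hypothetical helper. data: (n_channels, n_samples); rates in integer Hz."""
    # 1. Keep only the channels of interest.
    x = np.asarray(data, dtype=np.float64)[keep_idx]

    # 2. Band-pass 0.1-75 Hz, zero-phase (filter order is an assumption).
    sos = butter(4, [0.1, 75.0], btype="bandpass", fs=sfreq, output="sos")
    x = sosfiltfilt(sos, x, axis=-1)

    # 3. Notch filter at 50 Hz, zero-phase (Q is an assumption).
    b, a = iirnotch(50.0, Q=30.0, fs=sfreq)
    x = filtfilt(b, a, x, axis=-1)

    # 4. Resample to the target rate with a polyphase filter.
    g = math.gcd(int(target_sfreq), int(sfreq))
    x = resample_poly(x, int(target_sfreq) // g, int(sfreq) // g, axis=-1)
    return x
```

The output keeps the `(n_channels, n_samples)` layout at the new sampling rate, so it can then be cut into windows and split into training, validation, and test sets.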

upper127 commented 6 months ago

Is there a format requirement for the data that is input to the model? Simply put, can I feed the .cnt file directly into the model, or do I need to do what was done with the TUEV data, i.e. convert the EDF files to .pkl format? Thanks.