Open Siddhesh6344 opened 9 months ago
Hi @Siddhesh6344, I am also trying to reproduce the results of the paper, but I'm having difficulty setting up the pretraining. If you can help me figure out how to pretrain, I can try the model on an 80GB GPU and tell you how much memory and time it takes!
I was able to reproduce the pretraining. I can check how long it takes, if you want, by generating an empty training dataset.
Hi, @dyhan316 , could you please offer me the preprocessed data of .npy format. Thank you very much!
it was actually very simple! you just need a tsv file and npy files.
Actually, I'll just provide the code:
module load conda
conda activate DIVER
cd /global/homes/d/dyhan316/ECOG_AI/BrainBERT
path_to_run_base=/global/cfs/cdirs/m4750/ECOG_AI/PilotStudy_BrainBERT_pretrain_data/pretrain_manifests
what_to_use=all_channel_all_data
path_to_run=$path_to_run_base/$what_to_use
echo $path_to_run
resume_from_ckpt=/global/homes/d/dyhan316/ECOG_AI/BrainBERT/outputs/2024-09-15/all_channel_alldata/checkpoint_last.pth
python3 run_train.py +exp=spec2vec_continue ++exp.runner.device=cuda ++exp.runner.multi_gpu=True \
    ++exp.runner.num_workers=100 +data=masked_spec +model=masked_tf_model_large \
    +data.data=$path_to_run ++data.val_split=0.01 +task=fixed_mask_pretrain.yaml \
    +criterion=pretrain_masked_criterion +preprocessor=stft ++data.test_split=0.01 \
    ++task.freq_mask_p=0.05 ++task.time_mask_p=0.05 ++exp.runner.total_steps=1000000 \
    ++exp.runner.start_from_ckpt=$resume_from_ckpt
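If you just want to smoke-test the pipeline before pointing it at real data, you could generate random placeholder chunks first. Here is a minimal sketch assuming NumPy; the directory and file names are hypothetical, chosen only to mimic the layout shown in my manifest:

import os
import numpy as np

# Hypothetical output directory; adjust to your own setup.
out_dir = "dummy_pretrain_data/npy_data/chnl_0-datanum_0"
os.makedirs(out_dir, exist_ok=True)

rng = np.random.default_rng(0)
chunk_len = 10000  # timepoints per chunk, matching my manifest entries

for i in range(5):
    # Each chunk is a 1-D float32 array standing in for raw signal values.
    chunk = rng.standard_normal(chunk_len).astype(np.float32)
    np.save(os.path.join(out_dir, f"chunk_{i}.npy"), chunk)

Random noise obviously won't pretrain a useful model, but it is enough to confirm the data loading and the training loop run end to end before committing real compute.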
2. How to set up the datasets correctly for pre-training
dyhan316@login20:/global/cfs/cdirs/m4750/ECOG_AI/PilotStudy_BrainBERT_pretrain_data/pretrain_manifests/all_channel_all_data> ls
manifest.tsv
dyhan316@login20:/global/cfs/cdirs/m4750/ECOG_AI/PilotStudy_BrainBERT_pretrain_data/pretrain_manifests/all_channel_all_data> head manifest.tsv
/global/cfs/cdirs/m4750/ECOG_AI/PilotStudy_BrainBERT_pretrain_data/npy_data/chnl_34-datanum_11/chunk_2939.npy	10000
/global/cfs/cdirs/m4750/ECOG_AI/PilotStudy_BrainBERT_pretrain_data/npy_data/chnl_3-datanum_6/chunk_5999.npy	10000
/global/cfs/cdirs/m4750/ECOG_AI/PilotStudy_BrainBERT_pretrain_data/npy_data/chnl_12-datanum_6/chunk_745.npy	10000
/global/cfs/cdirs/m4750/ECOG_AI/PilotStudy_BrainBERT_pretrain_data/npy_data/chnl_22-datanum_11/chunk_12122.npy	10000
/global/cfs/cdirs/m4750/ECOG_AI/PilotStudy_BrainBERT_pretrain_data/npy_data/chnl_16-datanum_6/chunk_655.npy	10000
/global/cfs/cdirs/m4750/ECOG_AI/PilotStudy_BrainBERT_pretrain_data/npy_data/chnl_22-datanum_11/chunk_5557.npy	10000
/global/cfs/cdirs/m4750/ECOG_AI/PilotStudy_BrainBERT_pretrain_data/npy_data/chnl_21-datanum_13/chunk_4818.npy	10000
/global/cfs/cdirs/m4750/ECOG_AI/PilotStudy_BrainBERT_pretrain_data/npy_data/chnl_24-datanum_6/chunk_614.npy	10000
/global/cfs/cdirs/m4750/ECOG_AI/PilotStudy_BrainBERT_pretrain_data/npy_data/chnl_23-datanum_10/chunk_2860.npy	10000
/global/cfs/cdirs/m4750/ECOG_AI/PilotStudy_BrainBERT_pretrain_data/npy_data/chnl_4-datanum_1/chunk_9988.npy	10000
The first column is the path of the `npy` file and the second column is the length of the file (10,000 timepoints in my case).
(don't forget, it's tab separated!)
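A minimal sketch of building such a manifest from a directory of chunks, using the two-column tab-separated format above. All paths and directory names here are hypothetical; the sketch also writes a couple of toy chunks so it runs end to end, but you would point `npy_root` at your real data instead:

import os
import numpy as np

# Hypothetical layout; set npy_root to wherever your chunks actually live.
npy_root = "pretrain_npy_data"
os.makedirs(os.path.join(npy_root, "chnl_0-datanum_0"), exist_ok=True)

# Toy chunks so the script is self-contained; skip this with real data.
for i in range(2):
    np.save(os.path.join(npy_root, "chnl_0-datanum_0", f"chunk_{i}.npy"),
            np.zeros(10000, dtype=np.float32))

rows = []
for dirpath, _, filenames in os.walk(npy_root):
    for name in sorted(filenames):
        if name.endswith(".npy"):
            path = os.path.abspath(os.path.join(dirpath, name))
            length = np.load(path).shape[0]  # number of timepoints
            rows.append(f"{path}\t{length}")

# One row per chunk: <absolute path> TAB <length>
with open("manifest.tsv", "w") as f:
    f.write("\n".join(rows) + "\n")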
good luck :)
From Danny
Oh, regarding the npy data: we used our own ECoG dataset, so I cannot share it with you... sorry :( But the dataset is open source, so I think you can reverse-engineer it (the author here has shared the preprocessing code, I believe).
My team is trying to reproduce the results of this paper. Our system has 24 GB of VRAM, and I have recently found that it may not be adequate for reproducing the results. I am able to train on small patches of the data (~3 GB when zipped), but I cannot train on the entire dataset, as the required training duration is extremely high.
It would be really helpful if you could share the full hardware specifications, particularly the GPUs, used to produce the results of this paper.