Input data generation - GitHub issues

shiblyg commented 10 months ago

Hi Ivan,

In the paper you mentioned the input tumors are generated by randomly sampling the tumor model parameters.

My question is: are all the randomly sampled tumors generated from one single simulated tumor, one threshold file, and one parameter_tag2.pkl file, or is each tumor generated separately? Also, could you please provide the input data structure:

dataset/
    ...
    ...
threshold file/
    ...
    ...

Thank you

shiblyg commented 10 months ago

Hi Ivan, I hope you are doing well. Could you please briefly explain the process used to generate the 100,000 samples? Do you simulate each tumor separately? Thank you

IvanEz commented 10 months ago

Hi @shiblyg, yes, we simulate each tumor separately. Hope it helps

shiblyg commented 10 months ago

Hi Ivan, thank you for your reply. Please let me know whether the following data structure is correct:

./Dataset
    Data_0/
        Data_0001.npz
        parameter_tag2.pkl
    ...............
    Data_99999/
        Data_0001.npz
        parameter_tag2.pkl
./Thresholds
    scanthresholds0.npz
    scanthresholds1.npz
    ...............
    scanthresholds99999.npz

So each simulated tumor has its own parameter pkl file and a corresponding threshold file? Thank you

IvanEz commented 10 months ago

Seems correct
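
For readers following along, the layout confirmed above can be sketched as a small path helper. This is only an illustration: the function name `sample_paths`, the dictionary keys, and the `root` argument are hypothetical and do not come from the repository. The directory and file names are the ones quoted in the comment above.

```python
from pathlib import Path


def sample_paths(root, index):
    """Return the three files that belong to simulated tumor `index`.

    Hypothetical helper: the naming scheme follows the layout quoted in
    the thread (Data_<i>/ and scanthresholds<i>.npz).
    """
    root = Path(root)
    sample_dir = root / "Dataset" / f"Data_{index}"
    return {
        # Simulated tumor volume for this sample.
        "tumor": sample_dir / "Data_0001.npz",
        # Tumor model parameters used for this simulation.
        "parameters": sample_dir / "parameter_tag2.pkl",
        # Threshold values paired with this sample.
        "thresholds": root / "Thresholds" / f"scanthresholds{index}.npz",
    }


if __name__ == "__main__":
    for name, path in sample_paths(".", 0).items():
        print(name, path)
```

The helper only builds paths; it does not check that the files exist or make any claim about their contents.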

shiblyg commented 10 months ago

Hi Ivan, is the threshold .npz file the same file copied 100,000 times during training? That is, are scanthresholds0.npz, scanthresholds1.npz, ..............., scanthresholds99999.npz all the same file, differing only in their numbering? Thank you

IvanEz commented 10 months ago

No, these are different threshold values: the values used to threshold the simulated tumors. One has to select the values such that the thresholded simulated tumor volume falls within a plausible size range (based on the minimum and maximum tumor sizes of real tumors in the BraTS dataset).

shiblyg commented 10 months ago

Hi, I found this snippet for generating threshold files:

import numpy as np

np.random.seed(1234242)

# Each array holds 12000 uniform draws from the stated range.
t1gd = 0.35 * np.random.rand(12000) + 0.5        # [0.5, 0.85)
flair = 0.45 * np.random.rand(12000) + 0.05      # [0.05, 0.5)
necrotic = 0.05 * np.random.rand(12000) + 0.95   # [0.95, 1.0)

np.savez_compressed("scanthresholds", t1gd=t1gd, flair=flair, necrotic=necrotic)

Is it OK to use this code to generate the threshold files for the training dataset too? And how can I generate a separate threshold file for each simulated tumor? Would it be possible to share the thresholding code and the parameter code for the tumor generation?

Thank you
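
One possible way to get a separate threshold file per simulated tumor is to draw one value per key and save it under an indexed file name. The sketch below is an assumption, not the repository's code: the function name `write_threshold_files` is hypothetical, and storing one scalar per key in each file is a guess about the file contents. The sampling ranges and the key names (`t1gd`, `flair`, `necrotic`) are taken from the snippet above.

```python
from pathlib import Path

import numpy as np


def write_threshold_files(out_dir, n_tumors, seed=1234242):
    """Write scanthresholds<i>.npz for i in [0, n_tumors), one draw per file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    rng = np.random.default_rng(seed)
    for i in range(n_tumors):
        # Same uniform ranges as the snippet above.
        # Assumption: each file stores a single scalar per key.
        np.savez_compressed(
            out_dir / f"scanthresholds{i}.npz",
            t1gd=0.35 * rng.random() + 0.5,
            flair=0.45 * rng.random() + 0.05,
            necrotic=0.05 * rng.random() + 0.95,
        )


if __name__ == "__main__":
    write_threshold_files("Thresholds", 3)
```

This only samples the thresholds. It does not check whether the resulting thresholded tumor has a plausible size, which still has to be verified per sample.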

shiblyg commented 10 months ago

Hi Ivan, is the discarding of simulated tumors based on the thresholds done in dataloader.py? (attached images: thrsholds1, thrsholds2)

IvanEz commented 10 months ago

Essentially, one has to check whether the thresholded volume is within the limits of typical tumor size (i.e., between 0.22% and 22.44% of the brain volume). As mentioned in the paper, we computed these limits by analyzing the BraTS dataset.
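
The size check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's implementation: the function names, the `brain_mask` argument, and the choice to count voxels at or above a single threshold are hypothetical. The thread does not say which of the thresholds the size check is applied to. Only the two limits (0.22% and 22.44% of the brain volume) come from the comment above.

```python
import numpy as np

# Limits quoted in the thread, as fractions of the brain volume.
MIN_FRACTION = 0.0022
MAX_FRACTION = 0.2244


def volume_fraction(tumor, brain_mask, threshold):
    """Fraction of brain voxels whose simulated tumor value is >= threshold."""
    inside = np.asarray(brain_mask, dtype=bool)
    n_brain = np.count_nonzero(inside)
    if n_brain == 0:
        raise ValueError("brain_mask contains no voxels")
    return np.count_nonzero(np.asarray(tumor)[inside] >= threshold) / n_brain


def is_plausible(tumor, brain_mask, threshold):
    """True if the thresholded tumor size lies within the quoted limits."""
    fraction = volume_fraction(tumor, brain_mask, threshold)
    return MIN_FRACTION <= fraction <= MAX_FRACTION


if __name__ == "__main__":
    # Toy volume: 100 of 1000 brain voxels carry tumor density 0.9.
    brain = np.ones((10, 10, 10), dtype=bool)
    tumor = np.zeros((10, 10, 10))
    tumor[0] = 0.9
    print(is_plausible(tumor, brain, threshold=0.5))
```

A sample whose thresholded volume falls outside the limits would then be discarded, or its threshold redrawn, depending on how the dataset is built.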

IvanEz / learn-morph-infer

Input data generation #5