JDSobek / MedYOLO

A 3D bounding box detection model for medical data.
GNU Affero General Public License v3.0

Dataset prepare #5

Closed Jerry7j closed 9 months ago

Jerry7j commented 9 months ago

Dear JDSobek,

I am preparing my own dataset to run MedYOLO. For multi-category data, should the label format in the txt file look like this?

1 0.142 0.308 0.567 0.239 0.436 0.215
2 0.142 0.308 0.567 0.239 0.436 0.215
3 0.142 0.308 0.567 0.239 0.436 0.215

Are these normalized values calculated from the index coordinates of the CT data? If it is not too much to ask, could you provide a program that converts segmentation labels to the txt format?

Best regards, Jerry

JDSobek commented 9 months ago

Hello Jerry,

That is the correct format for multi-category data. Yes, the normalized values are calculated from the index coordinates of the CT data.
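As a minimal sketch of that layout, the hypothetical helper below builds one label line per object. The field order (class, then z, x, y centers, then z, x, y lengths) is an assumption taken from the function posted later in this thread, and `format_label` is an illustrative name, not part of MedYOLO.

```python
def format_label(class_id, z_center, x_center, y_center,
                 z_length, x_length, y_length):
    """Build one label line: a class id followed by six normalized values.

    Field order is assumed from this thread, not taken from MedYOLO itself.
    """
    values = (z_center, x_center, y_center, z_length, x_length, y_length)
    for value in values:
        # Normalized centers and lengths must be fractions of the image size.
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"value {value} is not normalized to [0, 1]")
    fields = [str(int(class_id))] + [repr(float(v)) for v in values]
    return " ".join(fields)


line = format_label(1, 0.142, 0.308, 0.567, 0.239, 0.436, 0.215)
```

Each object in an image would get its own line in the txt file, so a three-class image with one object per class has three lines.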

Here is a script I used for going from multi-category segmentation NIfTIs to txt labels. There are some caveats that keep me from adding it to this repository. My script is not smart, so even if you have 2+ separate objects that are class 1 in your mask, it will only make a single bounding box and it won't filter any spurious pixels. The fill value for the rotation is modality dependent and currently hard-coded for the CTs I was using. Most significantly, it makes assumptions that you have stored and named your files similarly to the files I used... which is almost certainly not true.

Most of the logic outside trainvalsplit should be generic except for that fill value, but finding and correlating the images and masks is something you'll need to reprogram.

Jerry7j commented 9 months ago

Dear JDSobek,

Thanks for the reply!

I tried the script, and my segmentation did generate the same kind of data:

0 0.49866310160427807 0.49528301886792453 0.5 0.9973262032085561 0.9905660377358491 0.9826086956521739
1 0.49866310160427807 0.49528301886792453 0.5 0.9973262032085561 0.9905660377358491 0.9826086956521739
2 0.49866310160427807 0.49528301886792453 0.5 0.9973262032085561 0.9905660377358491 0.9826086956521739

The z, x, and y lengths in these labels cover almost the entire segmentation, so there may be something wrong.

I'm trying to modify the script. Could you help me find the problem with it? I generated a txt file for the segmentation with labelid 2, but I think there is some offset. (Attached image: label2)

Here is the label I generated, followed by the function I used to generate it. Do you see anything wrong, or could you give me some tips?

2 0.34873188405797106 0.34795321637426896 0.3466135458167331 0.4344746162927981 0.269185360094451

def get_mask_info(nii_file, labelid):

    img = nib.load(nii_file)
    data = img.get_fdata()

    mask = np.argwhere(data==labelid)

    z_coords = mask[2]
    z_center = np.mean(z_coords)
    z_length = np.max(z_coords) - np.min(z_coords) + 1

    x_coords = mask[0]
    x_center = np.mean(x_coords)
    x_length = np.max(x_coords) - np.min(x_coords) + 1

    y_coords = mask[1]
    y_center = np.mean(y_coords)
    y_length = np.max(y_coords) - np.min(y_coords) + 1

    z_center /= z_length
    x_center /= x_length
    y_center /= y_length

    sum_length = z_length + x_length + y_length
    z_length /= sum_length
    x_length /= sum_length
    y_length /= sum_length

    return z_center, x_center, y_center, z_length, x_length, y_length

Yours, Jerry

JDSobek commented 9 months ago

sum_length is trying to do something that doesn't make sense. You need to normalize against the total span of the image in each direction, not the length of the mask in each direction. You also definitely don't want to combine information between the x, y, and/or z directions. Each direction is independent.

See this function in the above linked script or what I do to generate masks from MedYOLO labels.

My first suggestion is to use the array's shape attribute (data.shape in your function) to get img_z_length so that you can perform z_length /= img_z_length and z_center /= img_z_length. Repeat for x and y.
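A minimal sketch of this per-axis normalization, under a few stated assumptions: the array is indexed (x, y, z) as in the function above, the box center is taken as the midpoint of the box rather than the mean of the voxel coordinates, and `mask_to_label` is a hypothetical name. Note also that `np.argwhere` returns one row per voxel, so each axis is a column (`coords[:, k]`) rather than a row (`coords[k]`).

```python
import numpy as np


def mask_to_label(data, labelid):
    """Return (labelid, z_c, x_c, y_c, z_len, x_len, y_len), or None if absent.

    Assumes `data` is indexed (x, y, z); every value is normalized by the
    image size along its own axis only.
    """
    coords = np.argwhere(data == labelid)  # shape (N, 3), one row per voxel
    if coords.size == 0:
        return None
    mins = coords.min(axis=0)  # per-axis minimum index
    maxs = coords.max(axis=0)  # per-axis maximum index
    shape = np.array(data.shape, dtype=float)  # image span along each axis
    lengths = (maxs - mins + 1) / shape        # box extent as image fraction
    centers = (mins + maxs + 1) / 2.0 / shape  # box midpoint as image fraction
    x_c, y_c, z_c = (float(v) for v in centers)
    x_len, y_len, z_len = (float(v) for v in lengths)
    return (labelid, z_c, x_c, y_c, z_len, x_len, y_len)


# Synthetic 10 x 20 x 40 volume with one box of class 2.
volume = np.zeros((10, 20, 40))
volume[2:6, 5:15, 10:30] = 2
label = mask_to_label(volume, 2)
```

With a real file, `data` would come from `nib.load(nii_file).get_fdata()` as in the original function. The midpoint convention is a design choice for this sketch; it makes a mask that spans a whole axis produce a center of 0.5 and a length of 1.0.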

Fixing that should give you a bounding box with the right general shape, but it might leave you with an offset and/or rotation. If that's the case try flipping x and y as you save the label. Certain NIfTI orientations transpose the two.
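One way to try the flip, as a sketch only: the hypothetical helper below swaps the x and y entries of a 7-value label before it is saved. It assumes the field order (class, z, x, y centers, then z, x, y lengths) used elsewhere in this thread.

```python
def swap_xy(label):
    """Swap the x and y center and the x and y length of a 7-value label."""
    class_id, z_c, x_c, y_c, z_len, x_len, y_len = label
    # The class and all z values are untouched; only x and y trade places.
    return (class_id, z_c, y_c, x_c, z_len, y_len, x_len)


swapped = swap_xy((2, 0.5, 0.4, 0.6, 0.5, 0.2, 0.3))
```

Whether the swap is needed depends on how a given NIfTI is oriented, so checking the resulting box against the image is the deciding test.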

Also, if you aren't adding it to your label outside this function, your return call is missing labelid. In the example label you gave (2 0.34873188405797106 0.34795321637426896 0.3466135458167331 0.4344746162927981 0.269185360094451), it looks like one of the coordinates disappeared when you added labelid. It should have 7 values (including the class) and this only has 6.
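A quick sanity check can catch this kind of problem before training. The sketch below is a hypothetical helper (`check_label_line` is not part of MedYOLO) that reports a wrong field count or values outside the normalized range for a single label line.

```python
def check_label_line(line):
    """Return a list of problems found in one label line (empty if none)."""
    fields = line.split()
    problems = []
    if len(fields) != 7:
        problems.append(f"expected 7 values, found {len(fields)}")
    # Everything after the class id should be a number in [0, 1].
    for text in fields[1:]:
        try:
            value = float(text)
        except ValueError:
            problems.append(f"not a number: {text}")
            continue
        if not 0.0 <= value <= 1.0:
            problems.append(f"outside [0, 1]: {text}")
    return problems


example = ("2 0.34873188405797106 0.34795321637426896 "
           "0.3466135458167331 0.4344746162927981 0.269185360094451")
problems = check_label_line(example)
```

Run on the example label above, the check reports only the field count, since the six values present are all within range.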

Jerry7j commented 9 months ago

Dear JDSobek,

Thank you very much for your patient answer! I have now achieved this result. (Attached image: 1403)

Your guidance allows me to continue with subsequent tasks, and I will provide timely feedback if there are any problems later.

Yours, Jerry