analysiscenter / radio

RadIO is a library for data science research of computed tomography imaging
https://analysiscenter.github.io/radio/
Apache License 2.0

sample_nodules() works strange #26

Closed: theVmagnificient closed this issue 1 year ago

theVmagnificient commented 5 years ago
from radio.preprocessing.utils import get_nodules_pixel_coords, show_slices
get_nodules_pixel_coords(batch)

returns a dataframe with 8 nodules

from radio.preprocessing.utils import num_of_cancerous_pixels
num_of_cancerous_pixels(batch)

returns 1 scan with 2k nodule pixels

However, when I attach sample_nodules(batch_size=None, nodule_size=(16, 32, 32), share=1.0) to the current pipeline and call

batch_crops = (dicom_dataset >> crops_sampling_pipeline).next_batch(20)

I get nothing back, even though I used share=1.0:

get_nodules_pixel_coords(batch_crops).head(1)

akoryagin commented 5 years ago

Hello, @theVmagnificient!

Thanks for the issue! It does look strange. It seems to me, though, that some incidental shuffle=True may be changing the order of iteration through the dataset. Another possibility is that batch is not the first batch generated by the corresponding pipeline. In any case, it would be very helpful if you could provide us with the full gist that leads to this result.

Best, Alex
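
The shuffle hypothesis above can be illustrated with a toy sketch in plain Python. It does not use RadIO at all: first_batch, its shuffle and seed arguments, and the scan ids are hypothetical stand-ins for a dataset iterator. It only shows that once iteration order is shuffled, the first 20 items are generally not the 20 items that were inspected earlier.

```python
import random


def first_batch(scan_ids, batch_size, shuffle=False, seed=None):
    """Return the first batch of ids, optionally shuffling the order first.

    Hypothetical helper for illustration only; not part of RadIO.
    """
    order = list(scan_ids)
    if shuffle:
        # A seeded generator keeps the shuffled order reproducible.
        random.Random(seed).shuffle(order)
    return order[:batch_size]


# Made-up ids standing in for the scans in a dataset.
scan_ids = [f"scan_{i:03d}" for i in range(100)]

# Batch inspected by hand, taken in the original order.
inspected = first_batch(scan_ids, 20)

# Batch produced by a pipeline that shuffles before iterating.
sampled = first_batch(scan_ids, 20, shuffle=True, seed=0)

# Ids present in both batches; usually only a small subset.
overlap = sorted(set(inspected) & set(sampled))
print(f"{len(overlap)} of {len(inspected)} inspected scans are in the sampled batch")
```

If the two batches contain different scans, nodule statistics computed on one say nothing about the other, which is why a full gist showing how batch and batch_crops were produced is needed to tell the two possibilities apart.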