analysiscenter / radio

RadIO is a library for data science research of computed tomography imaging
https://analysiscenter.github.io/radio/
Apache License 2.0

Trouble with the Tutorial3 #27

Closed quaiquai closed 1 year ago

quaiquai commented 5 years ago

I have followed the tutorials line for line. In tutorial 3, when I set up split_dump to dump to cancerous_folder and non_cancerous_folder and then call (luna_dataset.train >> crops_dumping).run(), the terminal shows multiple alerts reading "Components ['predictions'] are empty. Nothing is dumped!", and I have no idea how to continue.

akoryagin commented 5 years ago

Hi, @quaiquai!

We'll deal with the issue in the next update of RadIO, which will be released in the coming days. Thanks for raising the issue!

P.S. A quick fix for now: find where the RadIO package is installed, open the file radio/preprocessing/ct_masked_batch.py, and change line 805 from

nodules = nodules.dump(dst=dst) # pylint: disable=no-value-for-parameter

to

nodules = nodules.dump(dst=dst, components=["images", "masks", "spacing", "origin"]) # pylint: disable=no-value-for-parameter

Best, Alex.
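
For the "find where the RadIO package is installed" step in the quick fix above, a small standard-library sketch may help. The helper name `locate_module_file` is hypothetical, and it assumes the package is importable under the name `radio` in the interpreter you run it with; it only reports a path and does not edit anything.

```python
import importlib.util
from pathlib import Path


def locate_module_file(package_name, relative_path):
    """Return the path of `relative_path` inside an installed package.

    Returns None if the package cannot be found or the file is missing.
    (Hypothetical helper for illustration, not part of RadIO.)
    """
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        return None
    # A package may have more than one search location; check each one.
    for location in spec.submodule_search_locations:
        candidate = Path(location) / relative_path
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Package name and file path are taken from the comment above.
    print(locate_module_file("radio", "preprocessing/ct_masked_batch.py"))
```

If this prints None, the interpreter running the script probably does not see the same installation as the one used for the tutorial.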

quaiquai commented 5 years ago

Thank you so much for your quick response.

I changed the indicated line, and that fixed part of the problem. When running on 3% of the dataset, I now get only 1 or 2 "Components ['predictions'] are empty. Nothing is dumped!" alerts instead of many. My cancerous_folder is being populated with files, but my non_cancerous_folder is not.

My code below:

import sys
sys.path.append('../')
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import radio
from radio.dataset import FilesIndex, Dataset
from radio import CTImagesMaskedBatch as CTIMB

# read annotation
nodules = pd.read_csv('D:\\Users\\mrm27\\Documents\\data\\annotations.csv')

LUNA_MASK = 'D:\\Users\\mrm27\\Documents\\data\\luna\\sub*\\*.mhd'
luna_index = FilesIndex(path=LUNA_MASK, no_ext=True)  # preparing indexing structure
luna_dataset = Dataset(index=luna_index, batch_class=CTIMB)

from radio.pipelines import split_dump
cancerous_folder = 'D:\\Users\\mrm27\\Documents\\data\\lunaset_split\\train\\cancer'
non_cancerous_folder = 'D:\\Users\\mrm27\\Documents\\data\\lunaset_split\\train\\noncancer'
crops_dumping = split_dump(cancerous_folder, non_cancerous_folder, nodules)
luna_dataset.split([0.03])
print(len(luna_dataset.train))

(luna_dataset.train >> crops_dumping).run()

import os
print(len(os.listdir(cancerous_folder)))
print(len(os.listdir(non_cancerous_folder)))

Also, the cancerous_folder is being populated with 4 folders of files: images, masks, origin, and spacing. Which of these are needed for combine_crops? The image and mask files are stored as data.blk and data.shape; is that correct?
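
To see what each dump folder actually contains beyond a top-level os.listdir count, a sketch like the following could tally files per component folder. It uses only the standard library; the helper name `count_files_by_parent` is hypothetical, and it assumes nothing about RadIO's dump layout other than "directories containing files".

```python
import os
from collections import Counter


def count_files_by_parent(root):
    """Count files under `root`, grouped by the name of their parent folder.

    For a dump with `images` and `masks` subfolders this gives the number
    of files found in folders with each of those names.
    (Hypothetical helper for illustration, not part of RadIO.)
    """
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        if filenames:
            counts[os.path.basename(dirpath)] += len(filenames)
    return dict(counts)
```

Calling it on cancerous_folder and on non_cancerous_folder would show at a glance whether the second folder is completely empty or only missing some components.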

Sorry for the many issues.

Thank you.

quaiquai commented 5 years ago

Also, when running

batch_crops = crops_sampling.next_batch()

I get the error:

Traceback (most recent call last):
  File "C:/Users/mrm27/PycharmProjects/LungCancer/beginning.py", line 41, in <module>
    batch_crops = crops_sampling.next_batch()
  File "C:\Python\Python36\lib\site-packages\radio\dataset\dataset\pipeline.py", line 1157, in next_batch
    batch_res = self.next_batch(*self._lazy_run[0], **self._lazy_run[1])
  File "C:\Python\Python36\lib\site-packages\radio\dataset\dataset\pipeline.py", line 1162, in next_batch
    batch_res = next(self._batch_generator)
StopIteration
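
For context on the last frame: in plain Python, calling next() on a generator that has nothing left to yield raises StopIteration. Whether that happens here because one of the source folders is empty is only a guess; the trace does not confirm it. The sketch below reproduces the behaviour with a stand-in `batches` generator (hypothetical, not RadIO's API) and shows two ways to handle it.

```python
def batches(items, batch_size):
    """Yield consecutive slices of `items`; yields nothing if `items` is empty."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def next_or_none(generator):
    """Return the next batch, or None once the generator is exhausted."""
    return next(generator, None)


# An empty source: next() has nothing to return and raises StopIteration.
empty = batches([], batch_size=4)
try:
    next(empty)
    raised = False
except StopIteration:
    raised = True

# Supplying a default to next() avoids the exception.
first = next_or_none(batches([], batch_size=4))
```

If the same check on the real data shows that a source folder holds no items, that would be the first thing to rule out before looking at the pipeline itself.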