paul-bd closed this issue 5 years ago
Hi! How are the different ROIs labelled in your segmentation array? Just [0, 1], or does each lesion have its own label? If they are only labelled as foreground (e.g. 1 vs. 0), you need to set "get_rois_from_seg_flag" to True (unlike in the LIDC data loader, where this flag is set to False because individual lesions already have individual labels).
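To illustrate the difference between the two labelling schemes, here is a minimal sketch that turns a binary foreground mask into one integer label per lesion with `scipy.ndimage.label`. The toy mask is made up, and this only approximates what the flag is described as doing; it is not the toolkit's own code.

```python
import numpy as np
from scipy.ndimage import label

# Toy binary segmentation (assumption): two separate lesions, both labelled 1.
seg = np.zeros((6, 6), dtype=np.uint8)
seg[0:2, 0:2] = 1   # first lesion
seg[4:6, 3:6] = 1   # second lesion

# Connected-component labelling gives each lesion its own integer id.
clusters, n_cands = label(seg)

print(np.unique(seg))       # values present in the binary mask
print(np.unique(clusters))  # background plus one id per lesion
print(n_cands)              # number of lesions found
```

With a mask like this, the binary array only distinguishes foreground from background, while the labelled array distinguishes the lesions from each other.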
Thanks for your quick reply!
Indeed, my labels are only [0, 1].
Setting every get_rois_from_seg_flag to True in the data_loader gives me this error (it works normally when set to False):
Traceback (most recent call last):
  File "/home/paulbd/anaconda3/envs/MDK/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/paulbd/anaconda3/envs/MDK/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/dataloading/multi_threaded_augmenter.py", line 35, in producer
    item = transform(**item)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/transforms/abstract_transforms.py", line 84, in __call__
    data_dict = t(**data_dict)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/transforms/utility_transforms.py", line 230, in __call__
    data_dict = convert_seg_to_bounding_box_coordinates(data_dict, self.dim, self.get_rois_from_seg_flag, class_specific_seg_flag=self.class_specific_seg_flag)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/augmentations/utils.py", line 466, in convert_seg_to_bounding_box_coordinates
    data_dict['class_target'][b] = [data_dict['class_target'][b]] * n_cands
ValueError: cannot copy sequence with size 8 to array axis with dimension 1
Process Process-4:
Traceback (most recent call last):
  File "/home/paulbd/anaconda3/envs/MDK/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/paulbd/anaconda3/envs/MDK/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/dataloading/multi_threaded_augmenter.py", line 35, in producer
    item = transform(**item)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/transforms/abstract_transforms.py", line 84, in __call__
    data_dict = t(**data_dict)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/transforms/utility_transforms.py", line 230, in __call__
    data_dict = convert_seg_to_bounding_box_coordinates(data_dict, self.dim, self.get_rois_from_seg_flag, class_specific_seg_flag=self.class_specific_seg_flag)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/augmentations/utils.py", line 466, in convert_seg_to_bounding_box_coordinates
    data_dict['class_target'][b] = [data_dict['class_target'][b]] * n_cands
ValueError: cannot copy sequence with size 2 to array axis with dimension 1
Process Process-5:
Traceback (most recent call last):
  File "/home/paulbd/anaconda3/envs/MDK/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/paulbd/anaconda3/envs/MDK/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/dataloading/multi_threaded_augmenter.py", line 35, in producer
    item = transform(**item)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/transforms/abstract_transforms.py", line 84, in __call__
    data_dict = t(**data_dict)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/transforms/utility_transforms.py", line 230, in __call__
    data_dict = convert_seg_to_bounding_box_coordinates(data_dict, self.dim, self.get_rois_from_seg_flag, class_specific_seg_flag=self.class_specific_seg_flag)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/augmentations/utils.py", line 466, in convert_seg_to_bounding_box_coordinates
    data_dict['class_target'][b] = [data_dict['class_target'][b]] * n_cands
ValueError: cannot copy sequence with size 3 to array axis with dimension 1
Process Process-2:
Traceback (most recent call last):
  File "/home/paulbd/anaconda3/envs/MDK/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/paulbd/anaconda3/envs/MDK/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/dataloading/multi_threaded_augmenter.py", line 35, in producer
    item = transform(**item)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/transforms/abstract_transforms.py", line 84, in __call__
    data_dict = t(**data_dict)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/transforms/utility_transforms.py", line 230, in __call__
    data_dict = convert_seg_to_bounding_box_coordinates(data_dict, self.dim, self.get_rois_from_seg_flag, class_specific_seg_flag=self.class_specific_seg_flag)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/augmentations/utils.py", line 466, in convert_seg_to_bounding_box_coordinates
    data_dict['class_target'][b] = [data_dict['class_target'][b]] * n_cands
ValueError: cannot copy sequence with size 31 to array axis with dimension 1
Process Process-3:
Traceback (most recent call last):
  File "/home/paulbd/anaconda3/envs/MDK/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/paulbd/anaconda3/envs/MDK/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/dataloading/multi_threaded_augmenter.py", line 35, in producer
    item = transform(**item)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/transforms/abstract_transforms.py", line 84, in __call__
    data_dict = t(**data_dict)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/transforms/utility_transforms.py", line 230, in __call__
    data_dict = convert_seg_to_bounding_box_coordinates(data_dict, self.dim, self.get_rois_from_seg_flag, class_specific_seg_flag=self.class_specific_seg_flag)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/augmentations/utils.py", line 466, in convert_seg_to_bounding_box_coordinates
    data_dict['class_target'][b] = [data_dict['class_target'][b]] * n_cands
ValueError: cannot copy sequence with size 27 to array axis with dimension 1
Process Process-6:
Traceback (most recent call last):
  File "/home/paulbd/anaconda3/envs/MDK/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/paulbd/anaconda3/envs/MDK/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/dataloading/multi_threaded_augmenter.py", line 35, in producer
    item = transform(**item)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/transforms/abstract_transforms.py", line 84, in __call__
    data_dict = t(**data_dict)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/transforms/utility_transforms.py", line 230, in __call__
    data_dict = convert_seg_to_bounding_box_coordinates(data_dict, self.dim, self.get_rois_from_seg_flag, class_specific_seg_flag=self.class_specific_seg_flag)
  File "/home/paulbd/medicaldetectiontoolkit/batchgenerators/batchgenerators/augmentations/utils.py", line 466, in convert_seg_to_bounding_box_coordinates
    data_dict['class_target'][b] = [data_dict['class_target'][b]] * n_cands
ValueError: cannot copy sequence with size 8 to array axis with dimension 1
Any idea why?
Best,
Paul
I just realised that the documentation for this function is missing (I will add it very soon). Because your lesions are not individually labelled in your segmentation map, they cannot have individual class labels either. The function therefore expects a scalar for data_dict['class_target'][b]. This could be a per-patient label in [0, ..., n] or, if you only have one class in the data set, just 0. (This fix should be made in your data loader.)
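A minimal sketch of why a list-valued target triggers the error in the traceback, using plain NumPy. It assumes the per-patient targets are lists such as [0], so that the stacked batch array has shape (batch_size, 1); the variable names are illustrative and the exact error text depends on the NumPy version.

```python
import numpy as np

# Assumption: each patient carries a list-valued target such as [0],
# so the stacked batch array has shape (batch_size, 1).
class_target = np.array([[0], [0]])
n_cands = 8  # e.g. eight connected components found in one patient

try:
    # Same assignment pattern as in the traceback: a sequence with n_cands
    # entries does not fit into a row that only has length 1.
    class_target[0] = [class_target[0]] * n_cands
    failed = False
except ValueError as err:
    failed = True
    print(err)
```

The number in the error message (8, 2, 3, 31, ...) would then simply be the number of lesions found for that patient.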
OK, thanks! Yes indeed, but the assignment fails with the provided code. I don't know why; maybe it is because of the dtype in which I saved class_target... I nevertheless used a trick that works but is kind of dirty:
data_dict_b = dict(data_dict)
del data_dict['class_target']
data_dict['class_target'] = []
out_seg = np.copy(data_dict['seg'])
for b in range(data_dict['seg'].shape[0]):
    p_coords_list = []
    p_roi_masks_list = []
    p_roi_labels_list = []
    print('NEW BATCH')
    if np.sum(data_dict['seg'][b] != 0) > 0:
        if get_rois_from_seg_flag:
            clusters, n_cands = lb(data_dict['seg'][b])
            val_initiale = data_dict_b['class_target'][b]
            val_initiale = list(val_initiale)[0]
            data_dict['class_target'].append([val_initiale] * n_cands)
            print(data_dict['class_target'][b])
It then returns [0, 0, 0, ...] as expected, but it stops after a few batches (before starting the first epoch)... Would it maybe be more convenient to preprocess the data as for LIDC and do this part beforehand? Could you please attach the info_df.pickle? I will try to build exactly the same thing, hoping that it works (notably the class_target column :) )!
I am also interested in the second part you mentioned: I indeed only have one class, but for convenience, and because it failed when I used only one class, my class target is a random integer between 1 and 5. Where should that be modified in the data_loader?
Thank you so much for your time,
Paul
It seems to work when preprocessing it :) and using an info_df.pickle like this:
Thanks! I will work on the single-class part. I think the balanced sampling of patches between classes may be the problem.
Best regards
You are still assigning several class labels per patient? Why not make the elements in the class_target column of the info_df scalars instead of lists?
Or, alternatively, change line 235 in the LIDC data loader to:
batch_targets.append(0)
I would not recommend changing the function in batchgenerators.
Also, if you only have one class, you should not use the "get_class_balanced_patients" function in line 225, but just draw random samples from all patients in your training data for batch generation.
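The two suggestions (scalar class targets and plain random sampling) can be sketched as follows. This is only an illustration under assumptions: the patient dictionary, its keys, and the batch size are hypothetical stand-ins, not the actual loader.

```python
import numpy as np

# Hypothetical stand-in for the loader's patient dictionary:
# one class only, so every patient gets the scalar target 0.
data = {f"pat_{i}": {"class_target": 0} for i in range(10)}
batch_size = 4

rng = np.random.default_rng(0)
class_targets_list = [v["class_target"] for v in data.values()]

# With a single class there is nothing to balance, so patient indices
# are drawn uniformly at random (with replacement).
batch_ixs = rng.choice(len(class_targets_list), batch_size)

patients = list(data.items())
batch_targets = [patients[ix][1]["class_target"] for ix in batch_ixs]
print(batch_ixs, batch_targets)
```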
Indeed, thanks, it works great when putting it as a scalar and taking random samples (training in progress :) ). Best regards, and happy new year!
Hi @paul-bd and @pfjaeger , I also only have data labelled as foreground (e.g. 1 vs. 0) and I am getting the same error from convert_seg_to_bounding_box_coordinates
...
File "/content/gdrive/My Drive/Dissertation/medicaldetectiontoolkit-master (1)/batchgenerators/batchgenerators/transforms/utility_transforms.py", line 229, in __call__
data_dict = convert_seg_to_bounding_box_coordinates(data_dict, self.dim, self.get_rois_from_seg_flag, class_specific_seg_flag=self.class_specific_seg_flag)
File "/content/gdrive/My Drive/Dissertation/medicaldetectiontoolkit-master (1)/batchgenerators/batchgenerators/augmentations/utils.py", line 518, in convert_seg_to_bounding_box_coordinates
data_dict['class_target'][b] = [data_dict['class_target'][b]] * n_cands
ValueError: cannot copy sequence with size 12 to array axis with dimension 1
...
I followed your advice and set all get_rois_from_seg_flag=True in data_loader.py, I have also set get_rois_from_seg_flag=True in the batchgenerator files: transforms -> utility_transforms.py and augmentations -> utils.py.
And I have set it to just draw random samples from all patients in training data for batch generation (line 225):
...
def generate_train_batch(self):
    batch_data, batch_segs, batch_pids, batch_targets, batch_patient_labels = [], [], [], [], []
    class_targets_list = [v['class_target'] for (k, v) in self._data.items()]
    # if self.cf.head_classes > 2:
    #     # samples patients towards equilibrium of foreground classes on a roi-level (after randomly sampling the ratio "batch_sample_slack").
    #     batch_ixs = dutils.get_class_balanced_patients(
    #         class_targets_list, self.batch_size, self.cf.head_classes - 1, slack_factor=self.cf.batch_sample_slack)
    # else:
    batch_ixs = np.random.choice(len(class_targets_list), self.batch_size)
    patients = list(self._data.items())
...
When I change line 235 to batch_targets.append(0) I still get ValueError: setting an array element with a sequence:
...
File "/content/gdrive/My Drive/Dissertation/medicaldetectiontoolkit-master/batchgenerators/batchgenerators/transforms/utility_transforms.py", line 229, in __call__
data_dict = convert_seg_to_bounding_box_coordinates(data_dict, self.dim, self.get_rois_from_seg_flag, class_specific_seg_flag=self.class_specific_seg_flag)
File "/content/gdrive/My Drive/Dissertation/medicaldetectiontoolkit-master/batchgenerators/batchgenerators/augmentations/utils.py", line 518, in convert_seg_to_bounding_box_coordinates
data_dict['class_target'][b] = [data_dict['class_target'][b]] * n_cands
ValueError: setting an array element with a sequence.
...
Note: my class targets are set to 0 for each individual pid in data:
...
data = OrderedDict()
for ix, pid in enumerate(pids):
    targets = [0]
    data[pid] = {'data': imgs[ix], 'seg': segs[ix], 'pid': pid, 'class_target': targets}
return data
...
Would you know why I am still getting this error? Thanks in advance.
Or alternatively changing line 235 in the lidc data loader to :
batch_targets.append(0)
And line 313 should also be changed to:
class_target = batch_targets
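A small sketch of why keeping the targets as a plain Python list may matter here. It assumes, without having checked the loader source, that the targets were previously stacked into a NumPy array of scalars; the names below are illustrative only.

```python
import numpy as np

n_cands = 3  # hypothetical number of lesions found in the first patient

# Scalar targets stacked into a NumPy array: one slot holds one number,
# so writing a list of n_cands labels into it raises an error.
as_array = np.array([0, 0])
try:
    as_array[0] = [as_array[0]] * n_cands
    array_ok = True
except (ValueError, TypeError):
    array_ok = False

# The same targets kept as a plain Python list: any element can be
# replaced by a per-lesion list of labels.
as_list = [0, 0]
as_list[0] = [as_list[0]] * n_cands

print(array_ok)
print(as_list)
```

Under that assumption, the failing array assignment corresponds to the "setting an array element with a sequence" error quoted above, while the list version accepts the per-lesion labels.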
Hello!
I am also working with only one class dataset. I have been able to solve this issue for the training set, but not when making inference.
I would be extremely grateful if someone could share their dataset loader for inference.
Hi,
It seems that ConvertSegToBoundingBoxCoordinates creates a single bounding box when multiple ROIs are on the same slice (or in the same volume). The same problem occurs whether self.dim = 2 or 3 in the configs file.
For now I load 3D arrays via np.load using the data_loader of the LIDC dataset.
E.g. for one slice from the prediction example in plots:
Is there anything that can be done to avoid this?
Best regards
Paul
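One thing that may be worth checking (a guess, not a confirmed diagnosis): if a binary mask is treated as a single ROI with id 1, the resulting box spans all lesions at once, whereas labelling connected components first yields one box per lesion. The sketch below shows the difference with `scipy.ndimage`; the toy mask, the helper name `boxes`, and the (y1, x1, y2, x2) convention are assumptions for illustration.

```python
import numpy as np
from scipy.ndimage import label, find_objects

# Toy binary slice (assumption): two separate lesions, both labelled 1.
seg = np.zeros((8, 8), dtype=np.uint8)
seg[1:3, 1:3] = 1
seg[5:7, 4:8] = 1

def boxes(label_map):
    """Return one (y1, x1, y2, x2) box per non-zero label id."""
    return [(sl[0].start, sl[1].start, sl[0].stop, sl[1].stop)
            for sl in find_objects(label_map) if sl is not None]

# Binary mask used directly: the only id is 1, so one box covers everything.
single = boxes(seg.astype(int))

# Connected components labelled first: one box per lesion.
per_roi = boxes(label(seg)[0])

print(single)
print(per_roi)
```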