Error while following the training example

pjsjongsung commented 8 months ago

First of all, thank you for the great work!

I am trying out the training tutorial with the labels provided in this repo, but it throws the error below:


Traceback (most recent call last):
  File "/N/slate/jp109/SynthSeg-master/eva_synthseg.py", line 81, in <module>
    training(path_training_label_maps,
  File "/N/slate/jp109/SynthSeg-master/SynthSeg/training.py", line 243, in training
    brain_generator = BrainGenerator(labels_dir=labels_dir,
  File "/N/slate/jp109/SynthSeg-master/SynthSeg/brain_generator.py", line 263, in __init__
    self.labels_to_image_model, self.model_output_shape = self._build_labels_to_image_model()
  File "/N/slate/jp109/SynthSeg-master/SynthSeg/brain_generator.py", line 273, in _build_labels_to_image_model
    lab_to_im_model = labels_to_image_model(labels_shape=self.labels_shape,
  File "/N/slate/jp109/SynthSeg-master/SynthSeg/labels_to_image_model.py", line 192, in labels_to_image_model
    image = layers.IntensityAugmentation(clip=300, normalise=True, gamma_std=.5, separate_channels=True)(image)
  File "/N/soft/sles15/deeplearning/Python-3.10.10/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/tmp/__autograph_generated_fileb6ong8zg.py", line 43, in tf__call
    ag__.if_stmt((ag__.ld(self).noise_std > 0) | (ag__.ld(self).gamma_std > 0) | ag__.ld(self).contrast_inversion, if_body_1, else_body_1, get_state_1, set_state_1, ('sample_shape',), 1)
  File "/tmp/__autograph_generated_fileb6ong8zg.py", line 21, in if_body_1
    sample_shape = ag__.converted_call(ag__.ld(tf).concat, ([ag__.ld(batchsize), ag__.converted_call(ag__.ld(tf).ones, ([ag__.ld(self).n_dims],), dict(dtype='int32'), fscope)], 0), None, fscope)
ValueError: Exception encountered when calling layer "intensity_augmentation" (type IntensityAugmentation).

in user code:

    File "/N/slate/jp109/SynthSeg-master/ext/lab2im/layers.py", line 1183, in call  *
        sample_shape = tf.concat([batchsize, tf.ones([self.n_dims], dtype='int32')], 0)

    ValueError: Fill dimensions must be >= 0 for '{{node intensity_augmentation/ones}} = Fill[T=DT_INT32, index_type=DT_INT32](intensity_augmentation/ones/Const, intensity_augmentation/ones/Const_1)' with input shapes: [1], [] and with input tensors computed as partial shapes: input[0] = [?].

Call arguments received by layer "intensity_augmentation" (type IntensityAugmentation):
  • inputs=['tf.Tensor(shape=(None, 128, 128, 128, 1), dtype=float32)']
  • kwargs={'training': 'None'}

I did change the model from a UNet to something else, but the BrainGenerator class is instantiated before the model is declared, so I do not think that is the problem.

I also changed the output_shape to 128, so could this be the problem?

I am using Python 3.10 with TensorFlow 2.12.0, in case this is a version issue.
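
For anyone comparing environments, here is a minimal sketch for collecting the relevant version numbers in one go. It uses only the standard library; the function name `report_versions` and the default package list are illustrative choices, not part of SynthSeg. Note that it looks up the distribution name `tensorflow`, so a differently named build (for example a GPU-specific package) would show up as not installed.

```python
import sys
from importlib import metadata


def report_versions(packages=("tensorflow", "keras", "numpy")):
    """Return a dict with the Python version and the installed version of each package.

    Packages that are not installed (under that exact distribution name)
    map to None instead of raising an error.
    """
    versions = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting this output into a report makes it easier to tell a version mismatch apart from a problem in the training script.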

BBillot commented 8 months ago

Hi, I agree that changing the model shouldn't affect the BrainGenerator class. For output_shape, did you only change the value in the training script? If so, that shouldn't be a problem either. I have recently received several issues related to newer TensorFlow releases, so that may be the cause. Can you have a look at issue #81 and let me know whether it applies to you?

pjsjongsung commented 8 months ago

The system is Linux, so I do not think this is related to the Mac and its TensorFlow issues (which I know are a pain). I do not have full control over our system, but it seems the lowest version of TensorFlow we can have is 2.8.0. I'll try this out and get back to you.

BBillot commented 8 months ago

Otherwise you can try the installation command I gave in the readme. Let me know how it goes :)

pjsjongsung commented 8 months ago

We cannot manually install a package that requires GPU usage on our system, so we cannot revert to Python 3.6 or 3.8 as suggested in the readme and install a different TensorFlow version. However, I can confirm that with TensorFlow 2.8.0 and Python 3.10, I got the same error as above.

BBillot commented 7 months ago

Ok, so I was able to replicate this error and fix it with tf 2.12. Unfortunately, switching to tf 2.12 brought more errors, and this time they are indeed due to library compatibility. I have no solution for this right now, except switching back to Python 3.8 and tf 2.2. You can do this by installing Miniconda and using the commands provided in the readme. I understand that this might be difficult in your particular case, but I am mentioning it in case other people are reading this.

Sorry,
Benjamin
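
For other readers, a small sketch of how one might check an environment against the combination mentioned above (Python 3.8 with tf 2.2) before starting a training run. The names `KNOWN_GOOD`, `parse_version` and `check_environment` are hypothetical helpers written for this illustration, not SynthSeg functions, and the table only encodes what is reported in this thread; other combinations listed in the readme may also work.

```python
def parse_version(text):
    """Turn a string such as '2.12.0' into (2, 12); only major and minor are kept."""
    parts = text.split(".")
    return int(parts[0]), int(parts[1])


# Assumption: the one combination reported as working in this thread.
KNOWN_GOOD = {"python": (3, 8), "tensorflow": (2, 2)}


def check_environment(python_version, tensorflow_version):
    """Return a list of mismatch messages; an empty list means the versions match."""
    found = {
        "python": parse_version(python_version),
        "tensorflow": parse_version(tensorflow_version),
    }
    problems = []
    for name, expected in KNOWN_GOOD.items():
        actual = found[name]
        if actual != expected:
            problems.append(
                f"{name} {actual[0]}.{actual[1]} differs from "
                f"the reported-working {expected[0]}.{expected[1]}"
            )
    return problems


if __name__ == "__main__":
    # The setup from the original report: Python 3.10.10 with TensorFlow 2.12.0.
    for message in check_environment("3.10.10", "2.12.0"):
        print(message)
```

Comparing only major and minor numbers is a deliberate simplification, since the thread does not say which patch releases matter.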

pjsjongsung commented 7 months ago

Thank you for checking this issue! Yes, unfortunately I cannot make it work at the moment, but good to know where the error is coming from.

BBillot / SynthSeg

Error while following the training example #83