ankilab / DeepD3

Apache License 2.0

Problems when training with 2D dataset #7

Closed dcupolillo closed 5 months ago

dcupolillo commented 5 months ago

Description

I am attempting to train a model on my own 2D dataset with DeepD3, but the training process hangs when I follow the instructions in Training DeepD3 model.ipynb, and it does not seem to engage the GPU as expected.

Images are signed int16 grayscale TIFF images; the dendrite and spine masks are binary TIFF images, as shown in the examples below:

Example images: 4_3_1 (raw image), 4_3_1_dendrite (dendrite mask), 4_3_1_spines (spine mask)
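
For context, here is a quick way to confirm the dtypes and shapes of the image and masks before building the training data. The filenames mirror the example names above and are purely illustrative:

```python
import tifffile

# Hypothetical filenames matching the example images above
image    = tifffile.imread("4_3_1.tif")           # raw image, expected dtype int16
dendrite = tifffile.imread("4_3_1_dendrite.tif")  # dendrite mask, expected binary
spines   = tifffile.imread("4_3_1_spines.tif")    # spine mask, expected binary

for name, arr in [("image", image), ("dendrite", dendrite), ("spines", spines)]:
    print(name, arr.shape, arr.dtype, arr.min(), arr.max())
```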

The creation of training data using the deepd3-training GUI and the generation of .d3data files proceed without issues. The generated .d3data file is displayed correctly, as shown in the screenshot below:

[Screenshot: generated .d3data file displayed in the GUI]

Steps to Reproduce issue

  1. Prepare a 2D dataset with images in signed int16 format. Dendrite and spine masks are provided as 2D binary TIFF images (generated with ImageJ/Fiji).
  2. Create training data (.d3data files) using the deepd3-training GUI: place the bounding box, enter the pixel size/resolution in microns, and set z step = 0.
  3. Arrange the training data into training.d3set and validation.d3set files.
  4. Follow the Training DeepD3 model.ipynb Jupyter notebook (run in Anaconda Spyder) up to the m.fit call, as sketched below.
  5. The training process hangs at this step: epochs never advance and there is no significant GPU utilization.
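
For reference, a condensed sketch of what the notebook does up to the hanging step. The module paths, class names, and keyword arguments are my assumptions based on the training notebook and may differ slightly in your installed version:

```python
# Sketch of the training setup; names and arguments are assumptions, not verified
from deepd3.training.stream import DataGeneratorStream  # assumed module path
from deepd3.model import DeepD3_Model                    # assumed module path

# Streams that sample training crops from the .d3set files
dg_training   = DataGeneratorStream("training.d3set", batch_size=32)
dg_validation = DataGeneratorStream("validation.d3set", batch_size=32, augment=False)

# Dual-decoder U-Net predicting dendrites and spines
m = DeepD3_Model(filters=32)
m.compile(optimizer="adam",
          loss=["mse", "binary_crossentropy"])  # one loss per output head (assumed)

# This is the call that hangs with the 2D dataset described above
m.fit(dg_training, validation_data=dg_validation, epochs=100)
```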


anki-xyz commented 5 months ago

Question: Is your code working with the d3set files that we provide? For reproducibility: can you provide me with some example d3set files?

dcupolillo commented 5 months ago

Yes, the code works with DeepD3_Training.d3set and DeepD3_Validation.d3set; within a few seconds the epochs start running.

Here you can find a training.d3set and a validation.d3set.

Thanks for the help!

anki-xyz commented 5 months ago

Dear Dario,

These were the issues:

Image size

Your data has shape (1, 125, 65), which is very small. As a result, the stream runs into an infinite loop when it tries to generate (1, 128, 128) crops.
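
To illustrate the failure mode (this is not the actual DeepD3 code): a random crop sampler needs the image to be at least as large as the requested crop. With a naive retry loop, a (1, 125, 65) stack can never yield a valid 128 x 128 crop, so sampling never returns, which is exactly what the hang looks like. An explicit size check surfaces the error instead:

```python
import numpy as np

def sample_crop(image, size=128):
    """Illustrative crop sampler: pick a random top-left corner for a size x size crop.
    If the image is smaller than `size` in any dimension, no valid corner exists,
    so we raise instead of retrying forever."""
    h, w = image.shape[-2:]
    if h < size or w < size:
        raise ValueError(f"Image {h}x{w} is smaller than the requested {size}x{size} crop")
    y = np.random.randint(0, h - size + 1)
    x = np.random.randint(0, w - size + 1)
    return image[..., y:y + size, x:x + size]

# A (1, 125, 65) stack cannot yield a 128x128 crop:
sample_crop(np.zeros((1, 125, 65), dtype=np.uint16))  # raises ValueError
```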

Training size

I reduced the training input to (1, 32, 32). That means you also need to adjust the U-Net and the DataGenerators accordingly. The bottleneck is now very small (1, 2, 2), which may cause problems; you may need to adjust this further (see the sketch below).
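
A minimal sketch of those adjustments; the keyword arguments for the crop size and model input shape are assumptions and may be named differently in your DeepD3 version:

```python
from deepd3.training.stream import DataGeneratorStream  # assumed module path
from deepd3.model import DeepD3_Model                    # assumed module path

crop = 32  # reduced from 128 because the raw data is only 125 x 65 pixels

dg_training   = DataGeneratorStream("training.d3set", batch_size=32,
                                    size=(crop, crop))                  # assumed keyword
dg_validation = DataGeneratorStream("validation.d3set", batch_size=32,
                                    size=(crop, crop), augment=False)   # assumed keyword

# The model input must match the generator output. With four 2x downsampling
# steps, a 32 x 32 input reaches a 2 x 2 bottleneck, which may be too coarse;
# consider fewer downsampling levels if training does not converge.
m = DeepD3_Model(filters=32, input_shape=(crop, crop, 1))               # assumed keyword
```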

Dtype

int16 as dtype is also problematic; please use uint16. I fixed this by shifting the data by its minimum.
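
The shift-by-minimum fix can also be applied to the TIFFs directly before they go into the training GUI; a minimal sketch with a hypothetical filename:

```python
import numpy as np
import tifffile

img = tifffile.imread("4_3_1.tif")                    # hypothetical filename; dtype int16

# Widen to int32 before shifting to avoid overflow, then map the minimum to 0
img_shifted = img.astype(np.int32) - int(img.min())
img_u16 = img_shifted.astype(np.uint16)

tifffile.imwrite("4_3_1_uint16.tif", img_u16)
```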

With these changes, I could get the DeepD3 neural net to run, and it seems to converge. The results are not tested. If you use a fixed size for training, please remember that inference needs a flexible input size (see the hints on the DeepD3 website). Training_DeepD3_model.zip