Open chnlyi opened 2 years ago
@chnlyi Thank you for your interest in the repo.
The code was prepared for the DSB2018 competition, where $TEST1 was the stage1 test set we used as additional training data in the 2nd stage. When training on your own data, you can leave it as it originally is in the repo.
The real test data goes in $IMAGES_DIR. These images will be used for style transfer learning, and the trained model will predict segmentation on them.
You don't need to use $ORIGINAL_DATA.
The train data was separated into U-Net ($TRAIN_UNET) and Mask R-CNN ($TRAIN_MASKRCNN) folders in case you want to train on different images. You can use the same data in both places.
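As a minimal sketch of "use the same data in both places", the snippet below copies one training set into two folders. Everything here is an assumption for illustration: the temporary working directory, the folder names assigned to TRAIN_UNET and TRAIN_MASKRCNN, and the dummy file names. It does not reflect the actual values or the image layout that run_workflow_trainOnly.sh expects, so check those in your copy of the script.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical locations: replace them with the values used in your workflow script.
WORK_DIR="$(mktemp -d)"                      # stand-in for a real working directory
MY_TRAIN="$WORK_DIR/my_train"                # your own annotated training data
TRAIN_UNET="$WORK_DIR/train_unet"            # assumed value of $TRAIN_UNET
TRAIN_MASKRCNN="$WORK_DIR/train_maskrcnn"    # assumed value of $TRAIN_MASKRCNN

# Create a dummy training set so the sketch runs on its own.
mkdir -p "$MY_TRAIN"
touch "$MY_TRAIN/image_001.png" "$MY_TRAIN/image_002.png"

# Copy the same training data into both training folders.
for target in "$TRAIN_UNET" "$TRAIN_MASKRCNN"; do
    mkdir -p "$target"
    cp -r "$MY_TRAIN"/. "$target"/
done

# Show what ended up in each folder.
ls "$TRAIN_UNET" "$TRAIN_MASKRCNN"
```

If you later want the two models to see different images, only the source of each copy needs to change.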
The convenience script start_training.sh is the suggested way to run the pipeline (it calls run_workflow_trainOnly.sh). There you can set the test image folder location in the variable IMAGES_DIR; these images will be copied to the workflow folder kaggle_workflow/outputs/images/ and used in the pipeline.
Let me know if this helps.
@spreka Thank you so much for the prompt response. Very helpful!
I understand that the real test data goes in $IMAGES_DIR and that this is for prediction.
What I am trying to do is use my own annotated images to train the models from scratch. I have my train, validation, and test splits.
@chnlyi The presegmentation result is also used in the pipeline for preparing masks in style transfer learning and prediction, hence $IMAGES_DIR is only for test images, e.g. unlabelled images from the experiment.
Should I copy my train into $TRAIN_UNET and $TRAIN_MASKRCNN? What about my validation data? (I experimented a few times, but it seems that I have to runGenerateValidationCustom.sh using my validation data and copy both train and validation data into $TRAIN_UNET and $TRAIN_MASKRCNN.)
Yes, copy the train images there. Validation goes in the $VALIDATION folder, which will be kaggle_workflow/outputs/validation by default. Indeed, you need to run runGenerateValidationCustom.sh /path/to/validationfolder. The $IMAGES_DIR variable in that script is only used to list the validation images; sorry for the confusing variable names.
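A hedged sketch of the validation step described above: it checks a validation folder and then calls runGenerateValidationCustom.sh with that folder as its single argument, as in the reply. The folder path (VALIDATION_SRC), the temporary default, and the assumption that the script sits in the current directory are all illustrative. Adjust them to wherever the script lives in your checkout.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical path: pass your own validation split as the first argument.
# Without an argument, an empty temporary folder is used so the sketch still runs.
VALIDATION_SRC="${1:-$(mktemp -d)/my_validation}"
mkdir -p "$VALIDATION_SRC"

# Count the entries so an empty folder is noticed before the script is run.
n_items=$(find "$VALIDATION_SRC" -mindepth 1 -maxdepth 1 | wc -l)
echo "validation entries found: $((n_items))"

# Invocation as given in the thread; only attempted if the script is present
# in the current directory (an assumption about where you run this from).
if [ -f ./runGenerateValidationCustom.sh ]; then
    bash ./runGenerateValidationCustom.sh "$VALIDATION_SRC"
else
    echo "runGenerateValidationCustom.sh not found in $(pwd); skipping"
fi
```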
I have read the code in "run_workflow_trainOnly.sh" and "start_training.sh".
I am confused about $IMAGES_DIR, $ORIGINAL_DATA, $TEST1, $TRAIN_UNET, $TRAIN_MASKRCNN.
After many trials and errors, I am able to run "start_training.sh" without error when I did the following:
However, I am not sure this is what I should be doing.
Why do I need $IMAGES_DIR?
I am using your original $CLUSTER_CONFIG. Is that right?