Closed athulnair02 closed 11 months ago
Hi @athulnair02, thanks for diving into this and fixing it.
I would like to keep `BACKGROUND_MASK` around, since it makes sure that, with a high probability (0.9, just copied the default from `config_training.yaml` into `TrainingBaselineWithContext`), we only sample blocks that actually do contain foreground, i.e. some labeled object. This allows us to also add some data blocks that contain only background, while only rarely sampling from these blocks during training.
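For illustration, here is a minimal NumPy sketch of this kind of foreground-biased sampling, using plain arrays rather than the actual `RandomLocationBounded` node; the function name and retry logic are assumptions, only the 0.9 probability comes from the discussion:

```python
import numpy as np

def sample_block(labels, block_shape, p_foreground=0.9, rng=None, max_tries=100):
    """Pick a random block from `labels`; with probability `p_foreground`,
    keep re-drawing until the block contains at least one labeled voxel."""
    rng = np.random.default_rng() if rng is None else rng
    need_foreground = rng.random() < p_foreground
    for _ in range(max_tries):
        offset = tuple(
            int(rng.integers(0, dim - size + 1))
            for dim, size in zip(labels.shape, block_shape)
        )
        block = labels[tuple(slice(o, o + s) for o, s in zip(offset, block_shape))]
        if block.any() or not need_foreground:
            return offset, block
    raise RuntimeError("no foreground-containing block found")
```

With `p_foreground=0.9`, roughly one in ten draws is allowed to be background-only, which matches the behavior described above.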
Is it enough to fix `AddMask`? `BACKGROUND_MASK` has the same size as `LABELS`, since it is copied, so it should not cause any additional problems/errors beyond the ones arising from too-small labels.
It would be great to have a minimal example with a single synthetic block, with adjustable size, to understand the failure case in detail.
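As a starting point, such a synthetic block of adjustable size could look something like this (plain NumPy, not a full gunpowder pipeline; the sphere radius and dtypes are arbitrary choices):

```python
import numpy as np

def make_synthetic_block(size, dtype=np.uint32):
    """One synthetic labels block of adjustable size: a single labeled
    sphere in the center, plus the masks derived from it."""
    labels = np.zeros((size, size, size), dtype=dtype)
    c = size // 2
    zz, yy, xx = np.ogrid[:size, :size, :size]
    sphere = (zz - c) ** 2 + (yy - c) ** 2 + (xx - c) ** 2 <= (size // 4) ** 2
    labels[sphere] = 1
    mask = np.ones_like(labels, dtype=np.uint8)      # everything valid for training
    background_mask = (labels > 0).astype(np.uint8)  # copied from labels, same size
    return labels, mask, background_mask
```

Blocks from e.g. `make_synthetic_block(110)` versus `make_synthetic_block(210)` could then be fed through the pipeline to bracket the failing size.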
Hi @bentaculum, upon further review, it seems like there is still an issue with `BACKGROUND_MASK`. I was previously running my tests with a `(180, 180, 210)` region, which works until around 7-11k iterations, so I mistakenly assumed the tests had passed as soon as the iterations began. I have now correctly started testing with a `(110, 110, 110)` region, which shows errors immediately, before a single training iteration can run. `BACKGROUND_MASK` has an issue in `RandomLocationBounded`, but unlike before, where I omitted the ArrayKey, I will try to find a solution that keeps it. As of now, it seems like `MASK` and `BACKGROUND_MASK` are the same upstream of `RandomLocationBounded` but not downstream.
To conclude the discussion: it seems like `AddMask` was not the core issue that introduced the problem, but rather the lack of a node in the pipeline to pad the array `BACKGROUND_MASK`. In the pipeline as it was originally, after `PadDownstreamOfRandomLocation` the request going upstream to prepare the nodes for the batch has the same region for the arrays `LABELS`, `MASK`, and `METRIC_MASK`. `RAW` is meant to be different as a result of augmentations, but `BACKGROUND_MASK`, which should match the rest, does not. This presents an issue in `RandomLocationBounded`, as the node is not able to satisfy batch requests involving `BACKGROUND_MASK`: it is unable to find a location that covers all requested ROIs.
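To make that failure mode concrete, here is a hypothetical 1-D model of the coverage check; the ROI numbers are made up for illustration and are not taken from the actual pipeline:

```python
# A random location is only valid if every requested ROI, shifted by one
# common offset, fits inside the corresponding upstream (provided) ROI.

def valid_offsets(requests, providers):
    """requests/providers: dicts key -> (begin, end) along one axis.
    Returns the (lo, hi) interval of shifts valid for all keys, or None."""
    lo, hi = float("-inf"), float("inf")
    for key, (req_begin, req_end) in requests.items():
        prov_begin, prov_end = providers[key]
        lo = max(lo, prov_begin - req_begin)  # request start stays inside provider
        hi = min(hi, prov_end - req_end)      # request end stays inside provider
    return (lo, hi) if lo <= hi else None

# LABELS (like MASK and METRIC_MASK) was padded, so its provided ROI is
# large; BACKGROUND_MASK was not, so it only provides the raw region.
requests = {"LABELS": (0, 204), "BACKGROUND_MASK": (0, 204)}
providers = {"LABELS": (-47, 251), "BACKGROUND_MASK": (0, 110)}
print(valid_offsets(requests, providers))  # None: no offset satisfies BACKGROUND_MASK
```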
After adding `PadDownstreamOfRandomLocation` for `BACKGROUND_MASK`, the array's region is the same as `LABELS`, `MASK`, and `METRIC_MASK`, allowing a location to be found that covers all requested ROIs. Now the pipeline works for regions smaller than `(204, 204, 204)`.
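Conceptually, the pad node's effect on `BACKGROUND_MASK` is like zero-padding the array out to the region the other arrays cover; a toy NumPy version, where the sizes and fill value are illustrative assumptions rather than the node's actual parameters:

```python
import numpy as np

target_region = (204, 204, 204)  # region the other arrays cover (illustrative)
background_mask = np.zeros((110, 110, 110), dtype=np.uint8)  # unpadded array

# Zero-pad symmetrically until the mask spans the same region as
# LABELS / MASK / METRIC_MASK, which is what the pad node arranges.
pad = [
    ((t - s) // 2, (t - s) - (t - s) // 2)
    for s, t in zip(background_mask.shape, target_region)
]
padded = np.pad(background_mask, pad, mode="constant", constant_values=0)
assert padded.shape == target_region
```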
Thanks for fixing this, and for the nicely documented PR!
In a training data configuration file, if the shape of a training region is smaller than `[204, 204, 204]`, there are a few errors within the gunpowder nodes of the pipeline before training even begins. These errors pertain to how the pipeline is built/set up. Some of the errors seen:

- `RandomLocationBounded` fails for `BACKGROUND_MASK`, since it is unable to find a location that covers all requested ROIs.
- `LABELS` is not in the batch to be processed in `AddMask`.
- The expected dtype in `MergeLabels` is `uint8`, but the batch to be processed is `uint32`.
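The dtype mismatch can be reproduced in isolation with a toy merge step (plain NumPy; the variable names are made up, only the `uint32`-vs-`uint8` clash mirrors the error above):

```python
import numpy as np

# Two binary organelle volumes to be merged into a single labels array.
a = np.zeros((4, 4, 4), dtype=np.uint8)
b = np.ones((4, 4, 4), dtype=np.uint8)

# A merge step that allocates its output as uint32 by default (as the
# gunpowder MergeLabels node does, per this issue) yields a dtype that
# downstream nodes expecting uint8 will reject.
merged = np.zeros(a.shape, dtype=np.uint32)
merged[a > 0] = 1
merged[b > 0] = 2
assert merged.dtype != np.uint8  # the mismatch reported above

# Changing the node's default dtype resolves it; a cast shows the
# intended result.
merged_u8 = merged.astype(np.uint8)
assert merged_u8.dtype == np.uint8
```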
However, in the private repository `fiborganellesegmentation` (`fos`), there are a few differences with `incasem` that could explain this behavior:

- `fos` does not use the ArrayKey `BACKGROUND_MASK` in its pipeline at all, and it is commented out in multiple sections of `incasem.pipeline.training_baseline_with_context.py`.
- In `add_mask.py` in `fos`, a `prepare` function exists, unlike in `incasem`; its absence fails to inform `AddMask`'s upstream provider that it requires the array `LABELS` as a dependency.
- In both `fos` and `incasem`, the gunpowder `MergeLabels` node's default dtype is `uint32`, but the pipeline expects `uint8`, so the final change made was to change the default dtype.

These three main issues are what prevented the `incasem` pipeline from processing ROIs smaller than `(204, 204, 204)` the way the clathrin-coated pit and nuclear pore models were trained in `fos`.
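For reference, the dependency-declaration pattern that the missing `prepare` function provides can be sketched like this, with plain dicts standing in for gunpowder's `BatchRequest` (the class is a simplified, hypothetical stand-in, not the actual `AddMask` implementation):

```python
import numpy as np

class AddMaskSketch:
    """Simplified stand-in for an AddMask-style gunpowder BatchFilter."""

    def __init__(self, labels_key, mask_key):
        self.labels_key = labels_key
        self.mask_key = mask_key

    def prepare(self, request):
        # Declare LABELS as a dependency over the same ROI for which the
        # mask was requested; without this, the upstream provider never
        # learns that LABELS must be in the batch.
        return {self.labels_key: request[self.mask_key]}

    def process(self, batch, request):
        labels = batch[self.labels_key]
        batch[self.mask_key] = (labels > 0).astype(np.uint8)
        return batch
```

In real gunpowder, `prepare` would return a `BatchRequest` of dependencies from a `BatchFilter` subclass; the point is only that without a `prepare`, `LABELS` is left out of the upstream request.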