In order to get a good baseline model, a few tweaks need to be implemented:
Improve optimization by adding a grokking-style training scheme (cosine-decaying learning rate scheduler, weight decay) to model_builder.py.
Add loss_function.py, which contains a loss_function class that wraps tf.keras.losses and has a class method that lets us set the loss to zero for image regions whose training label is -1 (i.e. the marker is not specific enough for the cell type).
Relevant background
This just adds a few tweaks that are known to improve training and generalization. It should give us a good baseline against which to compare future approaches that target the noisy-labels problem.
Design overview
Weight decay is configured through the kernel_regularizer and bias_regularizer attributes of tf.keras.layers, so I'll write a utility function that takes in a model, iterates through its layers, and adds L2 weight decay to the kernels and biases of all layers that contain weights.
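A rough sketch of that utility (the function name add_weight_decay and the default factor are placeholders, not final names). One caveat: assigning kernel_regularizer to an already-built layer typically has no effect until the model is rebuilt, so this version attaches the L2 penalties through layer.add_loss instead.

```python
import tensorflow as tf

def add_weight_decay(model, factor=1e-4):
    """Attach L2 weight decay to every layer that holds a kernel and/or bias.

    Sketch only: function name and default factor are placeholders.
    """
    regularizer = tf.keras.regularizers.L2(factor)
    for layer in model.layers:
        if hasattr(layer, "kernel") and layer.kernel is not None:
            # Deferred loss: re-evaluated each step and collected in model.losses.
            layer.add_loss(lambda layer=layer: regularizer(layer.kernel))
        if hasattr(layer, "bias") and layer.bias is not None:
            layer.add_loss(lambda layer=layer: regularizer(layer.bias))
    return model
```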
LR scheduler: at the moment we use the one from deepcell.utils.train_utils.rate_scheduler. If deepcell has a cosine decay scheduler we'll use that one; if not, I'll replace it with tf.keras.optimizers.schedules.CosineDecay in prep_model.py (l. 64).
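If the fallback is needed, the replacement would look roughly like the following; the initial learning rate, step count, and alpha (the minimal LR as a fraction of the initial LR) are placeholder values, not final settings.

```python
import tensorflow as tf

# Placeholder values; the real numbers come from the training config.
lr_schedule = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=1e-3,  # starting learning rate
    decay_steps=10_000,          # total number of training steps to decay over
    alpha=0.01,                  # floor: final LR = alpha * initial_learning_rate
)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
```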
I'll replace ModelBuilder.prep_loss with a separate class that wraps tf.keras.losses and zeros out the loss in regions where the label is -1. It should also allow other loss functions from tf.keras.losses, such as focal loss, to be used.
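A rough mockup of the wrapper, assuming sparse integer labels of shape (batch, H, W) with -1 marking unlabeled regions; the class name, constructor signature, and the fact that masking is folded into __call__ rather than a separate class method are assumptions, not the final API.

```python
import tensorflow as tf

class LossFunction:
    """Wraps a loss from tf.keras.losses and zeros it out where the label is -1.

    Sketch only: name, signature, and sparse-label assumption are placeholders.
    """

    def __init__(self, loss_name="SparseCategoricalCrossentropy", **loss_kwargs):
        # Resolve the requested loss by name and keep per-pixel values
        # (reduction=NONE) so they can be masked before averaging.
        loss_cls = getattr(tf.keras.losses, loss_name)
        self._loss = loss_cls(reduction=tf.keras.losses.Reduction.NONE, **loss_kwargs)

    def __call__(self, y_true, y_pred):
        # y_true: integer labels of shape (batch, H, W); -1 marks regions whose
        # marker is not specific enough for the cell type.
        mask = tf.cast(tf.not_equal(y_true, -1), y_pred.dtype)  # 0 where label == -1
        safe_labels = tf.maximum(y_true, 0)                     # keep -1 out of the loss op
        per_pixel = self._loss(safe_labels, y_pred) * mask      # zero loss on unlabeled pixels
        # Average only over the labeled pixels.
        return tf.reduce_sum(per_pixel) / tf.maximum(tf.reduce_sum(mask), 1.0)
```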
Required inputs
Weight decay requires a factor to control its strength (e.g. 1e-4).
The LR scheduler requires a minimal LR and the number of training steps.
Loss functions use different arguments and take **kwargs so that they are all accessible via this wrapper (see the wiring sketch below).
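A hypothetical end-to-end wiring of these inputs, reusing the placeholder names from the sketches above; all values are illustrative only.

```python
import tensorflow as tf

# `model` is an existing tf.keras.Model (e.g. the one produced by ModelBuilder);
# names and values below are illustrative placeholders.
model = add_weight_decay(model, factor=1e-4)               # weight decay strength
lr_schedule = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=1e-3,
    decay_steps=10_000,                                    # total training steps
    alpha=0.01,                                            # minimal LR as a fraction of the initial LR
)
loss = LossFunction("SparseCategoricalCrossentropy", from_logits=True)  # extra kwargs forwarded
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr_schedule), loss=loss)
```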
Output files
This just improves training but no direct output is created.
Timeline
Give a rough estimate for how long you think the project will take. In general, it's better to be too conservative rather than too optimistic.
[X] A couple days
[ ] A week
[ ] Multiple weeks. For large projects, make sure to agree on a plan that isn't just a single monster PR at the end.
Estimated date when a fully implemented version will be ready for review: tomorrow
Estimated date when the finalized project will be merged in: tomorrow