Looks good! A couple of questions:

1. For loss masking, are you saying we'd take the bottom 5% CE as clean examples, or the bottom 95% CE as clean examples?
2. What is the difference between confidence and CE?
3. Is there a schedule for how the thresholds for confidence and CE loss change over time?
4. I think for the first version, having only two classes (positive and negative) makes sense. However, we should check what the distribution of confidence/loss looks like across different markers. Given that some markers are harder than others, and some markers have more errors than others, there will definitely be differences in the balance of the cleaned training dataset. Based on how skewed this is, we can decide whether we need to make any adjustments to how clean examples are sampled, for example minimums per channel (not just per class), balancing, etc.
It looks like doing the loss selection at the cell level instead of the pixel level will slow down training considerably. I need to run more tests here and will probably implement both versions, pixel-level and cell-level selection.
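For context, the cell-level variant can be prototyped by averaging the per-pixel CE loss over each cell instance via a segment reduction. A minimal sketch, assuming an integer cell-ID image where 0 marks background; the function name and shapes are hypothetical:

```python
import tensorflow as tf

def cell_level_loss(pixel_loss, cell_ids, num_cells):
    """Average the per-pixel CE loss over each cell instance.

    pixel_loss: float tensor of shape (H, W) with per-pixel CE values.
    cell_ids:   int tensor of shape (H, W); 0 = background, 1..num_cells = cells.
    num_cells:  number of cell instances in the image.
    """
    flat_loss = tf.reshape(pixel_loss, [-1])
    flat_ids = tf.reshape(cell_ids, [-1])
    # Mean loss per segment id; index 0 collects the background pixels.
    per_cell = tf.math.unsorted_segment_mean(flat_loss, flat_ids, num_cells + 1)
    return per_cell[1:]  # drop the background entry

# A per-cell keep/drop decision can then be broadcast back to pixel space,
# e.g. keep_pixels = tf.gather(tf.concat([[False], keep_cells], 0), cell_ids)
```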
Relevant background
We assume that 5-15% of the marker positive/negative labels in our dataset are wrong. To cope with this, we will adapt the clean-sample selection procedure from ProMix, moving it from the image level to the cell level.
Design overview
We adapt the methods presented as ProMix Naive to the binary pixel-wise classification case as follows:
1. Class-wise small-loss selection
2. Matched high-confidence selection

   The loss and confidence scores used in the two calculations above are based on un-augmented training data.

3. Consistency regularization
4. Mixup (sketched below)
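To make the last item concrete, here is a minimal TensorFlow sketch of mixup for dense binary labels. The function name, tensor shapes, the alpha value, and the lam >= 0.5 clamp are illustrative assumptions, not part of the design:

```python
import tensorflow as tf

def mixup(images, labels, alpha=0.75):
    """Mixup for pixel-wise binary labels.

    images: float tensor of shape (B, H, W, C).
    labels: float tensor of shape (B, H, W, 1) with values in {0, 1}.
    alpha:  Beta distribution parameter (illustrative value).
    """
    batch_size = tf.shape(images)[0]
    # Sample lam ~ Beta(alpha, alpha) via two Gamma draws.
    g1 = tf.random.gamma([batch_size], alpha)
    g2 = tf.random.gamma([batch_size], alpha)
    lam = g1 / (g1 + g2)
    # Keep lam >= 0.5 so each mixed sample stays dominated by its own image,
    # a common choice in noisy-label mixup variants.
    lam = tf.maximum(lam, 1.0 - lam)
    lam = tf.reshape(lam, [-1, 1, 1, 1])
    # Mix each sample with a randomly chosen partner from the same batch.
    idx = tf.random.shuffle(tf.range(batch_size))
    mixed_images = lam * images + (1.0 - lam) * tf.gather(images, idx)
    mixed_labels = lam * labels + (1.0 - lam) * tf.gather(labels, idx)
    return mixed_images, mixed_labels
```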
Code mockup
For 1. and 2. we'll sub-class `ModelBuilder` and change these class functions:

- `prep_data`: exclude mapping the data augmentation onto the train dataset
- `prep_model`: exclude `model.compile` from here
- `train`: write a custom training loop that looks like the one sketched below

and add these class functions:

- `class_wise_loss_selection(y_pred, y_gt)`: calculates the k-th percentile of the GT positive and negative CE loss per class, stores them via an exponential moving average in a class attribute, and returns a loss mask based on them
- `matched_high_confidence_selection(y_pred, y_gt)`: returns a loss mask for all predictions whose confidence is below a certain threshold tau

Code inside the training loop should be put into a static class function decorated with `@tf.function` in order to use static-graph mode during training. We probably need to change the augmentation library we're using; this could be a good time to switch to one of the tf-based augmentation libraries.
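Below is a rough sketch of what the sub-class and custom train step could look like. It assumes sigmoid outputs of shape (B, H, W, 1), that `ModelBuilder` exposes `self.model` and `self.optimizer`, and placeholder values for the percentile, EMA factor, and tau; it combines the two selection masks with a union, and the high-confidence mask is written here to keep matched predictions with confidence at least tau:

```python
import tensorflow as tf

def percentile(x, q):
    """k-th percentile of a 1-D tensor (nearest rank); assumes x is non-empty."""
    x = tf.sort(tf.reshape(x, [-1]))
    n = tf.cast(tf.size(x), tf.float32)
    idx = tf.cast(tf.round(q / 100.0 * (n - 1.0)), tf.int32)
    return x[idx]

class ProMixBuilder(ModelBuilder):  # ModelBuilder from the existing code base
    def __init__(self, *args, quantile=50.0, ema=0.99, tau=0.9, **kwargs):
        super().__init__(*args, **kwargs)
        self.quantile = quantile  # percentile for small-loss selection (placeholder)
        self.ema = ema            # smoothing factor for the per-class thresholds
        self.tau = tau            # confidence threshold (placeholder)
        # EMA'd CE-loss thresholds for the negative and positive class.
        self.loss_thresh = tf.Variable([0.0, 0.0], trainable=False)

    def class_wise_loss_selection(self, ce, y_gt):
        """Mask keeping pixels whose CE loss is below the per-class percentile.

        Assumes both classes are present in the batch.
        """
        new_thresh = tf.stack([
            percentile(tf.boolean_mask(ce, tf.equal(y_gt, cls)), self.quantile)
            for cls in (0.0, 1.0)
        ])
        self.loss_thresh.assign(
            self.ema * self.loss_thresh + (1.0 - self.ema) * new_thresh)
        thresh = tf.gather(self.loss_thresh, tf.cast(y_gt, tf.int32))
        return ce <= thresh

    def matched_high_confidence_selection(self, y_pred, y_gt):
        """Mask keeping pixels whose prediction matches the label with conf >= tau."""
        pred_cls = tf.cast(y_pred >= 0.5, y_gt.dtype)
        conf = tf.where(y_pred >= 0.5, y_pred, 1.0 - y_pred)
        return tf.logical_and(tf.equal(pred_cls, y_gt), conf >= self.tau)

    @tf.function
    def train_step(self, x, y):
        """One static-graph update; y has shape (B, H, W, 1) with values in {0, 1}."""
        with tf.GradientTape() as tape:
            y_pred = self.model(x, training=True)  # sigmoid outputs, (B, H, W, 1)
            eps = 1e-7
            ce = -(y * tf.math.log(y_pred + eps)
                   + (1.0 - y) * tf.math.log(1.0 - y_pred + eps))
            # Union of the two selection criteria defines the clean-pixel mask.
            mask = tf.cast(tf.logical_or(
                self.class_wise_loss_selection(ce, y),
                self.matched_high_confidence_selection(y_pred, y),
            ), ce.dtype)
            loss = tf.reduce_sum(ce * mask) / (tf.reduce_sum(mask) + eps)
        grads = tape.gradient(loss, self.model.trainable_variables)
        self.optimizer.apply_gradients(zip(grads, self.model.trainable_variables))
        return loss
```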
Required inputs

Provide a description of the required inputs for this project, including column names for dfs, dimensions for image data, prompts for user input, directory structure for loading data, etc.
Output files
Provide a description of the outputs for this project. If any plots will be generated, provide (simple) sketches demonstrating the plot type and axes labels.
Timeline

Give a rough estimate for how long you think the project will take. In general, it's better to be too conservative rather than too optimistic.
Estimated date when a fully implemented version will be ready for review:
Estimated date when the finalized project will be merged in: