angelolab / Nimbus


Add batch-effects normalization (BEN) training and inference scheme #63

Closed. JLrumberger closed this issue 1 month ago.

JLrumberger commented 1 year ago

Instructions

Implement the batch-effects normalization (BEN) training and inference scheme from Lin & Lu (2022). This involves three steps (a minimal sketch follows the list):

  1. Batch-wise Training: During training, each training batch exclusively contains samples drawn from a single experimental batch (as opposed to randomly from the entire dataset).
  2. Batch-wise Inference: During inference, predictions are obtained by feeding data from an entire experimental batch to the model at once (as opposed to feeding each data point one-by-one).
  3. Unfrozen Batch Normalization: During inference, the BN layers should not be frozen (i.e., instead of using the moving statistics accumulated during training, the BN statistics should be recomputed from the inference batch).
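
A minimal sketch of what these three steps could look like with tf.keras; the `batches` mapping (experimental-batch id to arrays) and the function names are illustrative assumptions, not the Nimbus API:

```python
import tensorflow as tf

def ben_train(model, batches, epochs=1):
    # Step 1: each gradient step sees samples from exactly one experimental batch.
    for _ in range(epochs):
        for x, y in batches.values():
            model.train_on_batch(x, y)

def ben_predict(model, x_batch):
    # Steps 2 + 3: feed the whole experimental batch at once and run the model
    # with training=True so BN layers normalize with the batch's own statistics.
    # Caveat: training=True also switches other layers (e.g. dropout) to train
    # mode, so a real implementation should toggle only the BN layers.
    return model(x_batch, training=True).numpy()
```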

Relevant background

We sometimes observe inconsistencies in the predictions, especially on data the model was not trained on (below is an example from Inna and Hadeesha's DCIS dataset). This might be due to the domain gap or to normalization issues, so I'd like to try batch-effect correction.

[image: example of inconsistent predictions on the DCIS dataset]

Design overview

  1. I need to add a function make_pure_batches to the ModelBuilder class that ensures each batch only contains samples from a single FOV, dataset, and marker (see the first sketch after this list).
  2. During inference, the batchnorm layers need to be in train mode with the momentum term set to zero, so that each batch is normalized with its own statistics only. Thus I need to subclass tf.keras.layers.BatchNormalization so that the call method always behaves as in train mode. In addition, I'd like to add a function that copies all variables from an instance of tf.keras.layers.BatchNormalization, to make it easy to replace the default BN layers with the new one (see the second sketch after this list).
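
A hedged sketch of the first point; the sample records and their keys (`dataset`, `fov`, `marker`) are assumptions about the data layout, not the actual ModelBuilder interface:

```python
from itertools import groupby

def make_pure_batches(samples, batch_size):
    """Yield batches whose samples share one (dataset, FOV, marker) combination."""
    key = lambda s: (s["dataset"], s["fov"], s["marker"])
    for _, group in groupby(sorted(samples, key=key), key=key):
        group = list(group)
        for i in range(0, len(group), batch_size):
            yield group[i : i + batch_size]
```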
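And a sketch of the second point, assuming the BN layers come from an already-built model; `InferenceBatchNormalization` and `copy_bn_layer` are hypothetical names:

```python
import tensorflow as tf

class InferenceBatchNormalization(tf.keras.layers.BatchNormalization):
    """BN layer whose call always behaves as in train mode."""

    def call(self, inputs, training=None):
        # Ignore the caller's training flag: always normalize with the current
        # batch's statistics. With momentum=0.0 the stored moving statistics
        # also track the current batch only.
        return super().call(inputs, training=True)

def copy_bn_layer(bn):
    """Build an InferenceBatchNormalization from an existing BN layer's variables."""
    config = bn.get_config()
    config["momentum"] = 0.0
    new_bn = InferenceBatchNormalization.from_config(config)
    new_bn.build(bn.input_shape)  # create variables before copying them over
    new_bn.set_weights(bn.get_weights())
    return new_bn
```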

Timeline

Give a rough estimate for how long you think the project will take. In general, it's better to be too conservative than too optimistic.

Estimated date when a fully implemented version will be ready for review:

Estimated date when the finalized project will be merged in:

ngreenwald commented 1 year ago

Sounds good to me. It's nice that it doesn't require any knowledge of what the batch is at inference time, just that the whole batch is processed. This is pretty easy, since it's how people will give us the data anyway.

What is the best way to test if this makes a difference? Just run the full dataset, then look at the per-marker, per-dataset F1 scores and check if there's a difference?
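
A quick way to run that comparison might be a groupby over the prediction table; the column names and binary labels here are hypothetical:

```python
import pandas as pd
from sklearn.metrics import f1_score

def per_group_f1(df):
    # F1 for every (dataset, marker) combination; assumes binary y_true/y_pred.
    return (
        df.groupby(["dataset", "marker"])
          .apply(lambda g: f1_score(g["y_true"], g["y_pred"]))
          .rename("f1")
    )

# Compare the same table produced with and without BEN:
# delta = per_group_f1(preds_ben) - per_group_f1(preds_baseline)
```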

JLrumberger commented 1 year ago

I think the F1 scores, together with the qualitative appearance of the predictions that previously showed inconsistencies, will be enough to determine whether it made a difference.