broadinstitute / CellBender

CellBender is a software package for eliminating technical artifacts from high-throughput single-cell RNA sequencing (scRNA-seq) data.
https://cellbender.rtfd.io
BSD 3-Clause "New" or "Revised" License

Should I keep decreasing the learning rate? #360

Open txemaheredia opened 1 month ago

txemaheredia commented 1 month ago

Hi,

I ran CellBender on my dataset using these params:

cellbender remove-background \
                 --cuda \
                 --input ${sample}/raw_feature_bc_matrix.h5 \
                 --output ${sample}/cellbender_output.h5 \
                 --expected-cells 10000 \
                 --total-droplets-included 30000 \
                 --exclude-feature-types "Antibody Capture" \
                 --fpr 0.01 \
                 --epochs 150

The HTML report showed this ELBO plot:

[ELBO plot from the first run: cb_elbo_1]

And it issued this warning:

Automated assessment --------

  • WARNING: The training ELBO deviates quite a bit from the max value during the second half of training.
  • We typically expect to see the training ELBO increase almost monotonically. This curve seems to have a concerted period of motion in the wrong direction near epoch 56. If this is early in training, this is probably okay.
  • We hope to see the test ELBO follow the training ELBO, increasing almost monotonically (though there will be deviations, and that is expected). There may be a large gap, and that is okay. However, this curve ends with a low test ELBO compared to the max test ELBO value during training. The output could be suboptimal.

Summary:

This is unusual behavior, and a reduced --learning-rate is indicated. Re-run with half the current learning rate and compare the results.

I followed the suggestion and halved the learning rate, re-running cellbender with --learning-rate 0.00005. The resulting ELBO plot looked like this:

[ELBO plot from the second run: cb_elbo_2]

Automated assessment --------

  • The training ELBO deviates quite a bit from the max value at the last epoch.
  • We typically expect to see the training ELBO increase almost monotonically. This curve seems to have a concerted period of motion in the wrong direction near epoch 76. If this is early in training, this is probably okay.
  • We hope to see the test ELBO follow the training ELBO, increasing almost monotonically (though there will be deviations, and that is expected). There may be a large gap, and that is okay. However, this curve ends with a low test ELBO compared to the max test ELBO value during training. The output could be suboptimal.

Summary:

This is slightly unusual behavior, and a reduced --learning-rate might be indicated. Consider re-running with half the current learning rate to compare the results.

Should I keep halving the learning rate? Or would I be better off keeping the default learning rate and just increasing --epochs?
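
For concreteness, here is a quick sketch of the values in play. This assumes 1e-4 is CellBender's default learning rate (which is what my first run would have used, since I didn't pass --learning-rate):

```shell
# Assumption: CellBender's default learning rate is 1e-4.
# Each suggested re-run halves the previous rate.
DEFAULT_LR=0.0001
HALF_LR=$(awk -v lr="$DEFAULT_LR" 'BEGIN { printf "%g", lr / 2 }')     # the 0.00005 run above
QUARTER_LR=$(awk -v lr="$DEFAULT_LR" 'BEGIN { printf "%g", lr / 4 }')  # a third halving would use this
echo "already tried: --learning-rate ${HALF_LR}"
echo "next halving:  --learning-rate ${QUARTER_LR}"
```

So the next re-run, if I keep following the automated suggestion, would be --learning-rate 0.000025.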