vanvalenlab / deepcell-tf

Deep Learning Library for Single Cell Analysis
https://deepcell.readthedocs.io

mesmer regression #600

Open E5ten opened 2 years ago

E5ten commented 2 years ago

Describe the bug Last Tuesday, using the file Composite.tif in the attached archive (I had to attach an archive because GitHub does not support .tif uploads), I used deepcell.org/predict to generate the masks Composite_feature_[01]_old.tif, which I consider accurate. Starting a few days later, the same/equivalent input has produced the malformed masks in Composite_feature_[01]_new.tif, which are missing many cells. I'm not sure whether this is the right place to report issues with the web version of Mesmer, but I would guess that the code the website uses has been updated with recent deepcell-tf changes, so I reported it here. If another repo is more relevant, I can close this and report it there.

To Reproduce Steps to reproduce the behavior:

  1. Input Composite.tif into deepcell.org/predict
  2. Get an inaccurate mask as output

Expected behavior I'd expect masks similar to the files Composite_feature_[01]_old.tif to be generated, rather than masks such as Composite_feature_[01]_new.tif.

mesmer_results.zip

ngreenwald commented 2 years ago

Thanks for letting us know. Yes, we recently updated the model, which is what caused the change. While we sort it out, you can revert to the previous version by using the deepcell-tf repo directly and checking out the commit before the Mesmer model was updated.

ngreenwald commented 2 years ago

Hey @E5ten, I looked into this a bit more. Based on a visual inspection, the data you are running through the model has a quite different signal-to-noise ratio from most of the training data. The recent update to Mesmer/TissueNet, which included additional cleanup of the training data, produced a model that is less likely to mistake low-level background for true cells. However, your signal is much closer to the background level than the data we trained on, so the model under-predicted true cells in your dataset.

The easiest fix for you would likely be to just use the old weights instead of the new ones. You won't be able to use the web interface for that, but both the Jupyter notebooks in deepcell-tf and the dedicated applications repo have access to all of the past models.
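As a rough sketch of the notebook route: Mesmer expects a 4D batch of two-channel images, and the `Mesmer` application from `deepcell.applications` runs the prediction. The snippet below shows the input preparation with synthetic data; the actual `Mesmer` calls are commented out since they require deepcell-tf installed (checked out at the pre-update commit if you want the old weights), and the exact mechanism for pinning a past model version is an assumption here, not confirmed by this thread.

```python
import numpy as np

# Mesmer expects a 4D batch of shape (batch, x, y, channels), with
# channel 0 = nuclear signal and channel 1 = membrane/cytoplasm signal.
# Synthetic stand-ins for the two channels of Composite.tif:
nuclear = np.random.rand(512, 512)
membrane = np.random.rand(512, 512)

# Stack channels last, then add a leading batch dimension.
batch = np.stack([nuclear, membrane], axis=-1)[np.newaxis, ...]
assert batch.shape == (1, 512, 512, 2)

# With deepcell-tf installed (check out the commit before the Mesmer
# model update to get the old weights -- the hash is not given here,
# so find it in the repo history):
#
# from deepcell.applications import Mesmer
# app = Mesmer()
# mask = app.predict(batch, image_mpp=0.5)  # image_mpp: microns per pixel
```

The key point is the channel ordering and the batch dimension; masks come back with the same spatial shape as the input.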

The long-term solution would be to annotate images from your dataset and add them to the training data, so the model can better balance the competing demands during training. If you've produced a large dataset that you'd be interested in sharing with us to make that happen, let me know, but that would be a longer timeline in either case.