broadinstitute / wot

A software package for analyzing snapshots of developmental processes
https://broadinstitute.github.io/wot/
BSD 3-Clause "New" or "Revised" License

Overflow warnings #90

Open ShouWenWang opened 3 years ago

ShouWenWang commented 3 years ago

Hi, I often get a warning message while using the duality_gap method in WOT: 'Overflow encountered in duality gap computation, please report this incident'. Do you have any idea why this occurs and how to resolve it? Thanks!

Specifically, this occurs for the function optimal_transport_duality_gap, located at https://github.com/broadinstitute/wot/blob/master/wot/ot/optimal_transport.py

Alexgr97 commented 3 years ago

Hello,

I've been having this problem with a particular dataset too. Can I ask what reasons you think might be causing it in your data? And have you resolved it yet?

ShouWenWang commented 3 years ago

I suspect that some distances are very large or very small. I have not solved this problem yet.
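To illustrate why very large distances could matter, here is a minimal sketch of a generic entropic optimal transport kernel, exp(-cost / epsilon). It is not taken from the WOT source; the function name `underflow_fraction` and the toy cost matrix are made up for illustration. It counts how many kernel entries underflow to exactly zero in float64 for a given epsilon.

```python
import numpy as np


def underflow_fraction(cost, epsilon):
    """Fraction of kernel entries exp(-cost / epsilon) that are exactly 0.

    Hypothetical helper for illustration; not part of WOT.
    """
    kernel = np.exp(-np.asarray(cost, dtype=np.float64) / epsilon)
    return float(np.mean(kernel == 0.0))


# Toy cost matrix: the last row plays the role of a cell that is very far
# from everything else.
cost = np.array([
    [0.5, 1.0, 2.0],
    [1.0, 0.5, 2.0],
    [900.0, 950.0, 1000.0],
])

for eps in (0.05, 1.0, 10.0):
    print(f"epsilon={eps}: zero fraction = {underflow_fraction(cost, eps):.2f}")
```

In this toy example the far-away row vanishes entirely from the kernel at the smaller epsilon values, while a larger epsilon keeps every entry positive. A row of zeros is the kind of input that can later produce a division by zero.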

Alexgr97 commented 3 years ago

My growth rates were derived from actual experimental data. Whereas their minimal growth value was around 0.24, some of mine were considerably lower, around 0.1 or less. I found that adjusting the number of iterations, batch sizes, epsilon and lambdas resolved it for me, but I only did this because I already had fair confidence in my growth rates. Hope this helps!

Alexgr97 commented 3 years ago

*I should add that I explored what was happening: in calculating the duality gap, I was ending up with new iterates too close to zero, which made certain values shoot to inf, and those then turned everything in subsequent calculations into NaNs. Their source is well documented, so it's easy code to instrument to find out if this is happening for you too.
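One way to instrument this kind of computation is sketched below. It is a generic Sinkhorn-style scaling update, not WOT's actual code, and the names `check_finite` and `scaling_step` are hypothetical. The idea is to fail at the first division by zero or non-finite value instead of letting inf and NaN propagate silently.

```python
import numpy as np


def check_finite(name, arr):
    """Raise as soon as an array contains inf or NaN, naming the array."""
    arr = np.asarray(arr)
    bad = ~np.isfinite(arr)
    if bad.any():
        raise FloatingPointError(f"{name}: {int(bad.sum())} non-finite value(s)")
    return arr


def scaling_step(p, K, b):
    """One generic row update a = p / (K @ b), with checks (illustrative only)."""
    denom = check_finite("K @ b", K @ b)
    # Turn silent inf/NaN creation into exceptions inside this block.
    with np.errstate(divide="raise", invalid="raise", over="raise"):
        a = p / denom
    return check_finite("a", a)


# A kernel whose second row has underflowed to zero triggers the check.
K_bad = np.array([[1.0, 1.0], [0.0, 0.0]])
try:
    scaling_step(np.ones(2), K_bad, np.ones(2))
except FloatingPointError as err:
    print("caught:", err)
```

Wrapping each intermediate array of the real computation in a check like this should show which quantity goes non-finite first, and at which iteration.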

Iwo-K commented 2 years ago

We've also encountered this problem in our data. The only solution we have found so far is to exclude 'outlier' cells. We noticed a small subset of cells with extremely large distances to the rest of the dataset; after excluding them, the model ran fine.
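A minimal sketch of such an outlier filter follows. The criterion used here (each cell's median Euclidean distance to all other cells, thresholded at a number of median absolute deviations) is an assumption for illustration, not necessarily what was done above, and `flag_outlier_cells` and `n_mads` are hypothetical names. The dense pairwise computation only suits modest cell counts.

```python
import numpy as np


def flag_outlier_cells(X, n_mads=5.0):
    """Return a boolean mask marking cells unusually far from the rest.

    X is a (cells x features) array, e.g. PCA coordinates.
    """
    X = np.asarray(X, dtype=np.float64)
    # Dense pairwise Euclidean distances; O(n^2) memory.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.nan)  # ignore each cell's distance to itself
    per_cell = np.nanmedian(dist, axis=1)
    center = np.median(per_cell)
    mad = np.median(np.abs(per_cell - center))
    if mad == 0:
        return per_cell > center
    # 1.4826 scales the MAD to match a standard deviation for normal data.
    return per_cell > center + n_mads * 1.4826 * mad


rng = np.random.default_rng(0)
cells = rng.normal(size=(50, 10))
far_away = rng.normal(size=(2, 10)) + 1000.0  # two artificial outliers
X = np.vstack([cells, far_away])

mask = flag_outlier_cells(X)
X_filtered = X[~mask]
print("flagged cells:", np.flatnonzero(mask))
```

The filtered matrix would then be passed on in place of the original, after checking that the flagged cells are not a real but rare population.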