Closed ktmeaton closed 3 weeks ago
Hey @ktmeaton!
Thanks so much for bringing this up, and sorry for the delay in getting back to you. You caught us in a period of lots of travel and multiple papers wrapping up, and I haven't been able to carve out the time to properly respond.
We have definitely noticed errors like the ones you've mentioned (in part why we added the error message suggesting users try the `--depthcutoff` option), but have stuck with ECOS primarily for the sake of continuity and interpretability (for the most part, the answers it returns are in line with expectations). It's very interesting that Clarabel is able to solve the problem under all of the tested conditions. I'll test it out over the next few days with some example files on our end (and will send an example test file as well!). I'm curious how it breaks "ties" when multiple lineages have the same effective barcodes given the available coverage. If it functions more or less the same, but just suffers less numerical instability and converges more reliably, I'm fully in favor of a like-for-like swap.
Will follow up soon! Josh
Thanks again for bringing this up @ktmeaton. I went ahead and did a bunch of testing, including using the data from #237. As you mention, in many of the cases where ECOS fails, Clarabel manages to converge or provide a near-solution (i.e., below the tolerance). The results are identical in most cases, and the two solvers behave in the same way when they are unable to distinguish between lineages (without using `--depthcutoff`), providing equivalent estimated lineage prevalences for these lineages. I also did some testing of other solvers, including OSQP, which also seemed to outperform ECOS but was a bit slower.
I've just made a new Freyja release that includes Clarabel as the default solver, and includes an option to try the other two. :)
Thank you for the update! I just finished testing the `v1.5.1` release, and it seems to fix all these errors. Thanks so much!
As we update our freyja dataset to barcodes released in April and May, we are observing an increasing frequency of errors during `demix`, even when using the `--depthcutoff` parameter. These errors don't appear when running the same samples on earlier datasets (e.g., March). I'm wondering if the new JN sublineages are taking us into strange edge-case territory for certain samples/parameters? Perhaps this is also related to issue https://github.com/andersen-lab/Freyja/issues/237?
The errors are of the variety `Reached NAN dead end` and `RAN OUT OF ITERATIONS`:
I've found an assortment of related ECOS issues that seem to concern numerical instability:
And when re-installing freyja for this issue, I saw the new CVXPY warning of ECOS's deprecation:
The CVXPY maintainers also cite numerical instability in their ECOS deprecation post, and recommend Clarabel as ECOS's replacement. In my initial testing, it seems that Clarabel does not raise these errors and is a pretty simple swap:
Questions
To Reproduce
I'm on the hunt for a publicly available sample that has the same error to share. If I can find one in the CDC's wastewater dataset, I'll upload the variants and depths.
I'm using `freyja=1.5.0` from conda, which now has `cvxpy=1.5.1`. For comparative barcodes, I'm using `04_02_2024-00-49` (works) and `04_24_2024-00-49` (errors).
I've tested a range of `--depthcutoff` values: (0, 1, 10, 30, 100, 200, 300, 400, 500, 1000). Which exact cutoff causes the error with ECOS seems to be sample-specific. Interestingly, with Clarabel all of these cutoffs work (no errors so far).
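A sweep like this could be scripted roughly as follows. This is a sketch, not the script actually used: the input and output file names are hypothetical, and the helper `demix_commands` is invented for illustration. It only builds the `freyja demix` command lines, one per cutoff, so they can be inspected before anything is run.

```python
import shlex

# Depth cutoffs listed in the report above.
CUTOFFS = [0, 1, 10, 30, 100, 200, 300, 400, 500, 1000]


def demix_commands(variants, depths, cutoffs=CUTOFFS):
    """Return one `freyja demix` argument list per depth cutoff."""
    commands = []
    for cutoff in cutoffs:
        # Hypothetical output naming scheme, one file per cutoff.
        output = f"demix_depthcutoff_{cutoff}.tsv"
        commands.append([
            "freyja", "demix", variants, depths,
            "--output", output,
            "--depthcutoff", str(cutoff),
        ])
    return commands


# Hypothetical input file names.
commands = demix_commands("sample.variants.tsv", "sample.depths.tsv")
for argv in commands:
    print(shlex.join(argv))
```

Each argument list can then be passed to `subprocess.run`, recording the return code and stderr per cutoff to see which values trigger the solver error for a given sample.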