Parameter tuning for the hybrid solver

Marius1311 commented 1 year ago

Hi @mattjones315, I'm struggling a bit with tuning hyperparameters for the hybrid solver in Cassiopeia, and I was wondering whether you would have time at some point to discuss this via VC. For example, I've found on my data that small changes to lca_cutoff can cause very large differences in the number of subproblems being solved, and I'm not sure how to interpret this. Thanks in advance!

mattjones315 commented 1 year ago

Hi @Marius1311,

Thanks for posting this question. I'd be happy to set up a VC to discuss this, but perhaps we can try to arrive at a conclusion here for posterity.

Can you specify whether you see non-monotonic behavior as you change lca_cutoff, i.e., the number of subproblems going up when the cutoff goes up? If so, I would expect that you've encountered a bug.
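One quick way to check this is to record the number of subproblems at each lca_cutoff you tried and test whether the counts ever rise as the cutoff rises. The snippet below is only a sketch: the helper names and the example counts are hypothetical, and it assumes you have already noted the subproblem counts yourself (it does not call Cassiopeia).

```python
def non_monotonic_pairs(counts_by_cutoff):
    """Return adjacent (smaller_cutoff, larger_cutoff) pairs where the
    subproblem count goes UP as lca_cutoff goes up.

    counts_by_cutoff: dict mapping lca_cutoff -> observed number of subproblems.
    Assumption: a larger cutoff should give the same number or fewer subproblems.
    """
    cutoffs = sorted(counts_by_cutoff)
    return [
        (lo, hi)
        for lo, hi in zip(cutoffs, cutoffs[1:])
        if counts_by_cutoff[hi] > counts_by_cutoff[lo]
    ]


def is_monotone_in_cutoff(counts_by_cutoff):
    """True if the count never increases as lca_cutoff increases."""
    return not non_monotonic_pairs(counts_by_cutoff)


# Hypothetical numbers, for illustration only.
expected_pattern = {8: 120, 12: 35, 16: 9, 20: 3}
suspicious_pattern = {8: 120, 12: 35, 16: 50, 20: 3}
```

An empty list from `non_monotonic_pairs` means the counts look monotone; any pairs it returns are the cutoff ranges worth reporting here.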

Otherwise, it's possible you would see large jumps in the number of subproblems for decreasing lca_cutoff, depending on the nuances of the dataset. For example, there might be a threshold effect where most lineages converge before a certain point (let's say t0); thus, with an lca_cutoff < t0 you could see a massive jump in the number of subproblems compared with an lca_cutoff > t0.
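To make the threshold effect concrete, here is a toy model in plain Python. It is not Cassiopeia code: the `Node` class, the `build` helper, the distances, and the fan-outs are all made up for illustration, and the stopping rule (stop splitting once a group's LCA distance is at or below the cutoff) is an assumption about how the cutoff is applied.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A group of cells in a hypothetical top-down split tree."""
    lca_distance: int                      # toy "distance to the group's LCA"
    children: List["Node"] = field(default_factory=list)


def build(distances, fanouts, level=0):
    """Build a regular tree: nodes at depth i get distances[i] and fanouts[i] children."""
    node = Node(distances[level])
    if level < len(fanouts):
        node.children = [
            build(distances, fanouts, level + 1) for _ in range(fanouts[level])
        ]
    return node


def count_subproblems(node, lca_cutoff):
    """Count groups handed to the bottom solver under the assumed stopping rule."""
    if node.lca_distance <= lca_cutoff or not node.children:
        return 1
    return sum(count_subproblems(child, lca_cutoff) for child in node.children)


# Many groups sit just below a distance of 12, which plays the role of t0.
toy = build(distances=[30, 12, 11, 3], fanouts=[2, 8, 4])
counts = {cutoff: count_subproblems(toy, cutoff) for cutoff in (20, 12, 11, 10)}
print(counts)  # {20: 2, 12: 2, 11: 16, 10: 64}
```

In this toy tree, lowering the cutoff from 20 to 12 changes nothing, while lowering it from 12 to 11 multiplies the number of subproblems by eight, even though the counts stay monotone throughout.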

Personally, I prefer the use of the lca_cutoff parameter as it more faithfully represents the difficulty of a subproblem, and I normally use an lca_cutoff between 12 and 20.

I hope this is helpful, and please let me know if you have further questions.

mattjones315 commented 10 months ago

Closing this due to inactivity; please feel free to re-open if you have any other questions. I hope your analysis is going well!

YosefLab / Cassiopeia

Parameter tuning for the hybrid solver #224