theislab / cellrank

CellRank: dynamics from multi-view single-cell data
https://cellrank.org
BSD 3-Clause "New" or "Revised" License
347 stars 46 forks source link

Inconsistency in CytoTRACE tutorial #783

Closed WeilerP closed 2 years ago

WeilerP commented 2 years ago

Description

The Schur decomposition in the CytoTRACE tutorial here highlights an eigengap after 5 eigenvalues.

image

The following text, however, talks about an eigengap after 3 eigenvalues which is supposedly shown in the figure. Am I missing something or do we need to update this @Marius1311, @michalk8?

michalk8 commented 2 years ago

This might've happened when the notebook was automatically regenerated (the better case) or we changed the defaults and this caused the issue.

Marius1311 commented 2 years ago

I used to get the plot below:

Screenshot 2021-12-07 at 18 37 50
Marius1311 commented 2 years ago

The random walks also used to look slighly different - which change could have caused this @michalk8?

Marius1311 commented 2 years ago

In line 10, we fix most of the relevant parameters: ctk.compute_transition_matrix(threshold_scheme="soft", nu=0.5)

Marius1311 commented 2 years ago

This cannot yet be caused by https://github.com/theislab/cellrank/pull/778, right?

michalk8 commented 2 years ago

This cannot yet be caused by #778, right?

No, it cannot.

Marius1311 commented 2 years ago

@WeilerP volunteered to check what caused this change. Thanks @WeilerP!

WeilerP commented 2 years ago

@michalk8, the proposed eigengap switches from 3 to 5 when applying this commit, i.e. after merging #583. I presume it is due to the changes to Pseudotimekernel.compute_transition_matrix. Specifying b=20 indentifies the gap after 4 eigenvalues. What was the reason for removing percentile and density_normalize as function arguments?

michalk8 commented 2 years ago

@michalk8, the proposed eigengap switches from 3 to 5 when applying this commit, i.e. after merging #583. I presume it is due to the changes to Pseudotimekernel.compute_transition_matrix. Specifying b=20 indentifies the gap after 4 eigenvalues. What was the reason for removing percentile and density_normalize as function arguments?

We've had the internal finetuning of these on Zebrafish + other dataset (also linked in the PR, should be visible for you) and it did perform slightly worse.

WeilerP commented 2 years ago

Yes, but why not simply use default values (density_normalize=False, percentile=None) to allow for more flexibility?

Marius1311 commented 2 years ago

@WeilerP, would you mind checking whether this is due to density_normalize, percentile, or both of them? I recall percentile just didn't make much sense to me intuitively, that's why we removed it (also, it didn't give better performance, as @michalk8 stated above).

hyjforesight commented 2 years ago

Hello CellRank, This is also what I'm curious about. If the dash line is at Index '4', how many n_states shall we set for g_fwd.compute_macrostates(n_states=?), 4 or 5? Thanks! Best YJ

Marius1311 commented 2 years ago

In the example below, the eigengap would be inferred aver 5 eigenvalues, so you should compute 5 macrostates, as a starting point. Keep in mind the eigengap statistic is a heuristic though, so take this more as a starting point to your analysis.

image

WeilerP commented 2 years ago

@Marius1311, @michalk8, is there a reason why we start counting the eigenvalues at 0? I always found this rather confusing, TBH.

Marius1311 commented 2 years ago

Not that I recall - happy to change this so that we start counting with 1.