Open cbravo93 opened 3 years ago
Another example where they differ:
[Images: VSN; Seurat (colored by VSN clusters)]
I am not sure where the distinction between Mol A/B comes from; Seurat also agrees with pycisTopic.
Hi @cbravo93, thanks for the comprehensive report.
This feature will be available in the next release, v0.27.0, which will be released soon.
Is your feature request related to a problem? Please describe.
For the same data set, with the same filters and cells, I get much cleaner results with Seurat than with VSN. I think this is mostly due to the selection of variable features. I am not using anything fancy (e.g. sctransform), but Seurat uses vst by default (I am using the top 3000 features), while VSN still uses mean_dispersion. VST is implemented in scanpy (https://github.com/theislab/scanpy/issues/993). Here (https://github.com/vib-singlecell-nf/vsn-pipelines/blob/master/src/scanpy/bin/feature_selection/sc_find_variable_genes.py) I can see a method parameter, but nothing is implemented apart from mean_disp.
I attach UMAPs for comparison; I can add annotations if that would make it clearer. For VSN, the number of PCs was determined with pcacv to be 17; for Seurat I just used 30 for a quick check. Reducing Seurat to 17 does not change the results much, and neither does increasing VSN to 30.
Describe the solution you'd like
Would it be possible to add other methods? I think what I am looking for is flavor='seurat_v3' (https://scanpy.readthedocs.io/en/stable/generated/scanpy.pp.highly_variable_genes.html).
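To make the request concrete, here is a rough sketch of how a `method` option could be mapped onto scanpy's `flavor` argument. It is not the actual VSN code: the helper names `hvg_kwargs` and `find_variable_genes` and the method labels are assumptions made for illustration. Only `scanpy.pp.highly_variable_genes` and its `flavor` and `n_top_genes` parameters come from scanpy. Note that `flavor='seurat_v3'` expects raw counts, requires `n_top_genes`, and needs the `scikit-misc` package installed.

```python
def hvg_kwargs(method="mean_disp", n_top_genes=None):
    """Translate a (hypothetical) `method` option into keyword arguments
    for scanpy.pp.highly_variable_genes."""
    if method == "mean_disp":
        # Dispersion-based selection: scanpy's default 'seurat' flavor,
        # which expects log-normalized data.
        kwargs = {"flavor": "seurat"}
        if n_top_genes is not None:
            kwargs["n_top_genes"] = n_top_genes
        return kwargs
    if method == "seurat_v3":
        # VST-style selection: expects raw counts and a fixed number of genes.
        if n_top_genes is None:
            raise ValueError("method 'seurat_v3' requires n_top_genes")
        return {"flavor": "seurat_v3", "n_top_genes": n_top_genes}
    raise ValueError(f"unknown method: {method!r}")


def find_variable_genes(adata, method="mean_disp", n_top_genes=None):
    """Annotate highly variable genes in `adata.var` using the chosen method."""
    import scanpy as sc  # imported lazily so hvg_kwargs works without scanpy

    sc.pp.highly_variable_genes(adata, **hvg_kwargs(method, n_top_genes))
    return adata
```

With something like this, `find_variable_genes(adata, method="seurat_v3", n_top_genes=3000)` would correspond to the Seurat setting described above, provided `adata.X` still holds raw counts at that point in the pipeline.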
Describe alternatives you've considered
I can just run Seurat instead, but I really like VSN (although now I am unsure whether this could be happening in other data sets too).