Open luglilab opened 2 years ago
hi Simone,
Thanks for your message. Have you seen this tutorial for a mass cytometry dataset? I think it would help you: https://pyvia.readthedocs.io/en/latest/mESC_timeseries.html. The tutorial uses the time-series information in addition to the surface marker expression, but you can simply ignore the time-series input labels.
Can I ask what the dimensions of your data are before PCA etc. (n_cells x n_markers)? Depending on the dimensionality, you may or may not opt for PCA (e.g. the top 30 PCs) before running Via. If you have a fairly concise set of meaningful proteins/surface markers, you might be better off avoiding PCA. Typically a knn of around 20-30 is good for most datasets unless you have a very low cell count. If you have a look at the tutorials for other types of data, you can probably use them as a starting point for parameters and then tune depending on the outcome. I have been meaning to make a Parameter Tuning Tutorial too, it's on my ToDo :) The parameters which have the most impact are
@luglilab Just wanted to ask if you were able to use the Readthedocs tutorial? Cheers, Shobi
Dear @ShobiStassen,
I'm taking a look at the tutorial you linked in the message above, ignoring the time series.
Before PCA, the matrix usually has between 10,000 and 1 million rows and fewer than 30 columns.
As you suggested, I switched from PARC to pyVIA. If you could take a look, the method "runvia" in https://github.com/luglilab/Cytophenograph/blob/master/PhenoFunctions_v5.py is where I put the execution and the parameters. KNN and resolution should be set by the user, while the others are fixed.
I'm now running some tests on high-dimensional cytometry datasets of different sizes (small, medium, big) to understand whether tuning the parameters could improve the results.
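For anyone comparing runs on small, medium and big datasets, the preprocessing advice in this thread (skip PCA when the marker panel is concise, otherwise keep roughly the top 30 PCs, and start with a knn of 20-30 unless the cell count is very low) could be sketched roughly as below. This is only an illustration under stated assumptions: `prepare_for_via` is a hypothetical helper, not part of pyVIA; the 30-marker cut-off and the rule that caps knn below the number of cells are guesses made for the example; and the PCA is done with plain NumPy rather than whatever the tutorials use.

```python
import numpy as np


def prepare_for_via(X, max_markers_without_pca=30, n_pcs=30, knn=30):
    """Hypothetical helper (NOT part of pyVIA).

    Returns the matrix to feed to the trajectory step and a knn value.
    The thresholds are illustrative assumptions, not library defaults.
    """
    X = np.asarray(X, dtype=float)
    n_cells, n_markers = X.shape

    if n_markers > max_markers_without_pca:
        # Many features: reduce with PCA, computed via SVD of the centered data.
        X_centered = X - X.mean(axis=0)
        _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
        k = min(n_pcs, Vt.shape[0])
        X_out = X_centered @ Vt[:k].T
    else:
        # Concise marker panel: use the marker expression directly.
        X_out = X

    # Assumed rule: knn can never exceed the number of other cells.
    knn_out = max(1, min(knn, n_cells - 1))
    return X_out, knn_out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cyto = rng.normal(size=(1000, 25))  # stand-in for a (n_cells x n_markers) panel
    X_ready, knn_ready = prepare_for_via(cyto)
    print(X_ready.shape, knn_ready)
```

The returned matrix and knn value would then be passed to the Via run in place of hard-coded settings, which makes it easier to repeat the same test across datasets of different sizes.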
Hi,
I'd like to use pyVIA with flow cytometry data. Could you add a tutorial for this kind of data?
Do you have any suggestions about which parameters to set?
Thank you in advance.
Simone