ShobiStassen / PARC

MIT License

Python igraph #18

Closed luglilab closed 2 years ago

luglilab commented 2 years ago

Hi,

Following the installation tutorial, I tried to run this command:

pip install python-igraph, leidenalg==0.7.0, hnswlib, umap-learn

but I noticed that python-igraph is now installed with "pip install igraph".

Could you specify which Python version to use for a correct installation of PARC?

Best regards

ShobiStassen commented 2 years ago

Hi! Thanks for mentioning this. I updated the setup.py file to reflect the change to the pip install igraph command, and PARC works smoothly. pip install python-igraph will work until Sept 2022. Let me know if you have any other questions. S
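
To see which of the two PyPI names is present in a given environment, the installed distribution metadata can be queried. This is a minimal sketch using only the standard library; the helper name `installed_igraph_distribution` is made up for illustration and is not part of PARC or igraph.

```python
from importlib import metadata


def installed_igraph_distribution(candidates=("igraph", "python-igraph")):
    """Return (name, version) of the first installed candidate, or None.

    The candidate names are the two PyPI distribution names mentioned
    in this thread; the function itself is a hypothetical helper.
    """
    for name in candidates:
        try:
            return name, metadata.version(name)
        except metadata.PackageNotFoundError:
            continue  # this distribution name is not installed
    return None


found = installed_igraph_distribution()
if found is None:
    print("igraph is not installed; try: pip install igraph")
else:
    print(f"{found[0]} {found[1]} is installed")
```

If the function returns `None`, neither name is installed in the active environment.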

luglilab commented 2 years ago

Hi Shobi,

Thank you for the update. I have another question for you: what is the best method to visualise the graph generated by PARC?

Best regards,

Simone

ShobiStassen commented 2 years ago

Hi Simone, I would almost suggest that you consider running your data with pyVIA (the VIA package, also on my GitHub), which has lots of graph-based visualizations. Even though pyVIA is for trajectory inference, you can always set an arbitrary start cell and minimize the arrow sizes. If your dataset is very big, then viewing it as a clustergraph (as done in pyVIA), where each node in the VIA clustergraph is a PARC cluster, will be easier to interpret than plotting a massive single-cell graph with potentially millions of edges.

Plotting a single-cell kNN graph could be very slow depending on your sample size. How many cells (n_samples) are you dealing with? I did have some old functions that plotted the underlying PARC graph, but typically for <10,000 cells. I can dig these out if you want.
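
The clustergraph idea can be illustrated by collapsing cell-level edges into weighted cluster-level edges. This is only a sketch under stated assumptions: it is not how pyVIA builds its clustergraph, the function name `cluster_graph` is hypothetical, and the edge list and labels are toy values standing in for a kNN graph and the cluster labels from PARC.

```python
from collections import Counter


def cluster_graph(edges, labels):
    """Collapse cell-level edges into weighted cluster-level edges.

    edges  : iterable of (i, j) cell index pairs from a kNN graph
    labels : labels[i] is the cluster of cell i
    Returns a Counter mapping (cluster_a, cluster_b), with a < b,
    to the number of cell-level edges joining the two clusters.
    """
    weights = Counter()
    for i, j in edges:
        a, b = labels[i], labels[j]
        if a != b:  # edges inside one cluster do not appear in the cluster graph
            weights[(min(a, b), max(a, b))] += 1
    return weights


# Toy data (made up): 6 cells in 3 clusters
labels = [0, 0, 1, 1, 2, 2]
edges = [(0, 1), (1, 2), (0, 3), (2, 3), (3, 4), (4, 5)]
for (a, b), w in sorted(cluster_graph(edges, labels).items()):
    print(f"cluster {a} -- cluster {b}: {w} edge(s)")
```

The resulting graph has one node per cluster, so it stays small enough to plot even when the cell-level graph has millions of edges.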

luglilab commented 2 years ago

Dear Shobi,

Thank you. I did not know about pyVIA; I will take a look and read the manual.

We mainly work with data from high-dimensional cytometry. I do not have a lot of cells (I have generated a small test dataset), but ideally our goal is to settle on and understand the best setup for our type of data.

PARC performed very well, which is why we included it in our pipeline. But, as you mentioned, parameter tuning has a large impact on the analysis results, so I thought it would be useful to look at the generated graph and compare graphs generated with different parameters.
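
One simple way to put a number on such a comparison is the Jaccard similarity of the two edge sets. The sketch below is an illustration only: PARC's own graph construction is not reproduced, and a brute-force kNN search on made-up points is swapped in as a stand-in. The function names `knn_edges` and `jaccard` are hypothetical.

```python
import math


def knn_edges(points, k):
    """Brute-force kNN graph as a set of undirected edges (i, j) with i < j."""
    edges = set()
    for i, p in enumerate(points):
        # Distances from point i to every other point, nearest first
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def jaccard(a, b):
    """Jaccard similarity of two edge sets (1.0 means identical graphs)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Toy 1-D data (made up): two well-separated groups of points
points = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,)]
g1 = knn_edges(points, k=1)
g2 = knn_edges(points, k=2)
print(len(g1), len(g2), jaccard(g1, g2))
```

A value near 1.0 means the two parameter settings produce almost the same graph, while a low value means the connectivity changed substantially.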