alevax / pyviper

Porting of Protein Activity and Pathway Inference to single cell and Python.
MIT License

Aligning cells in clusters Stouffer function #62

Open alexanderlewis99 opened 2 months ago

alexanderlewis99 commented 2 months ago

From Luca: I noticed a behavior in pyviper.pp.stouffer when the input is a DataFrame. It is totally fine, but it could confuse the user. When the input is a pd.DataFrame, the cluster list could be a named vector, e.g. a pandas Series coming from another object. If the clusters vector is not ordered like the cells in the DataFrame, a mismatch might occur. If the user is careful, this is no problem. But if they assume that the function will align each cell to its own cluster (as with the AnnData input), they might run into issues if they haven't previously aligned the two.

One solution could be to handle the case where the clusters vector is a named vector by reindexing the clusters to match the matrix, with something like: sorted_clusters = df["clusters"].reindex(counts.index)
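To make the reindexing idea concrete, here is a minimal sketch using only pandas. The data and the helper name align_clusters are hypothetical and are not part of pyviper. It assumes the clusters arrive as a pd.Series indexed by cell name, and it raises an error rather than silently producing NaN labels when a cell has no cluster.

```python
import pandas as pd

# Hypothetical data: rows are cells, columns are features.
counts = pd.DataFrame(
    {"geneA": [1.0, 2.0, 3.0], "geneB": [4.0, 5.0, 6.0]},
    index=["cell1", "cell2", "cell3"],
)

# Cluster labels stored in a different cell order than counts.
clusters = pd.Series({"cell3": "c2", "cell1": "c1", "cell2": "c1"})


def align_clusters(clusters, index):
    """Reorder a labelled cluster Series to match `index` (hypothetical helper)."""
    aligned = clusters.reindex(index)
    # reindex fills cells absent from `clusters` with NaN; treat that as an error.
    missing = aligned[aligned.isna()].index
    if len(missing) > 0:
        raise ValueError(f"No cluster label for cells: {list(missing)}")
    return aligned


sorted_clusters = align_clusters(clusters, counts.index)
```

Used positionally, the original clusters Series would assign "c2" to cell1. After reindexing, each label follows its own cell regardless of the order in which the Series was built.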

The second option would simply be to warn the user in the documentation (for the DataFrame case).

The second one will probably be enough, but I think it should be stated explicitly, because users who see the automatic alignment with AnnData input might assume the same for the DataFrame case. Let me know your thoughts.