burtonrj / CytoPy

A data-centric flow/mass cytometry automated analysis framework
https://cytopy.readthedocs.io/en/latest/

Unable to install because of KDEpy version conflict #33

Closed AlexanderWMacFarlaneIV closed 2 years ago

AlexanderWMacFarlaneIV commented 2 years ago

pip install cytopy gives this error:

    The conflict is caused by:
        cytopy 2.0.1 depends on KDEpy==1.0.10
        cytopy 2.0 depends on KDEpy==1.0.10

pip install kdepy==1.0.10 gives this error:

    ERROR: Could not find a version that satisfies the requirement kdepy==1.0.10 (from versions: 0.1, 0.2, 0.3, 0.4, 0.5, 0.5.1, 0.5.2, 0.5.3, 0.5.4, 0.5.5, 0.5.6, 0.6, 0.6.9, 0.6.10, 0.6.11, 1.0.2, 1.0.11, 1.1.0)
    ERROR: No matching distribution found for kdepy==1.0.10

Installing KDEpy 1.0.11 or 1.1.0 does not work.
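A quick way to see why the pin cannot be satisfied is to compare it against the version list in the error above. The sketch below is only an illustration: the list is copied from that error message, and `parse` and `releases_newer_than` are hypothetical helper names, not part of pip or KDEpy.

```python
# Versions copied from the pip error message above (what pip could find for this interpreter).
AVAILABLE = (
    "0.1, 0.2, 0.3, 0.4, 0.5, 0.5.1, 0.5.2, 0.5.3, 0.5.4, 0.5.5, 0.5.6, "
    "0.6, 0.6.9, 0.6.10, 0.6.11, 1.0.2, 1.0.11, 1.1.0"
).split(", ")

PINNED = "1.0.10"  # the exact version cytopy 2.0 and 2.0.1 require


def parse(version):
    """Turn '1.0.10' into (1, 0, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def releases_newer_than(pin, available):
    """Return the available releases that are newer than the pin, oldest first."""
    return sorted((v for v in available if parse(v) > parse(pin)), key=parse)


pin_is_available = PINNED in AVAILABLE
newer = releases_newer_than(PINNED, AVAILABLE)
print(pin_is_available, newer)
```

The exact pin is missing from the list, so pip has nothing to install for `KDEpy==1.0.10`, while two newer releases are visible.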

jfgonsalves commented 2 years ago

You could try just installing the latest version of KDEpy and then reinstalling CytoPy while telling pip to ignore dependencies. It still works (for now) but you may get some warnings about CytoPy calling deprecated functions.

I do this on my system, mostly because I want to use ARM64 optimised OSX packages from Conda. Nothing has broken yet but the risk is that things may not work/output may be incorrect (the latter probably unlikely).


AlexanderWMacFarlaneIV commented 2 years ago

how do I tell pip to ignore dependencies?

burtonrj commented 2 years ago

Hi guys, thanks for helping each other out on this. I realise that CytoPy has fallen behind on dependencies. I'm currently in the final stages of my PhD and working around the clock to finish my thesis. Once it is submitted, CytoPy will have my full attention; Oct 2022 is the current timeline.

For now, follow @jfgonsalves's advice. Install KDEpy, then install CytoPy with

pip install cytopy --no-dependencies
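
The two steps can also be scripted. This is a minimal sketch, assuming you want the newest KDEpy that pip can find for your interpreter; `workaround_commands` and `apply_workaround` are hypothetical helper names, and `--no-deps` is the spelling of the flag used in pip's documentation.

```python
import subprocess
import sys


def workaround_commands(python=sys.executable):
    """Build the two pip invocations: KDEpy first, then CytoPy without its dependencies."""
    pip = [python, "-m", "pip", "install"]
    return [
        pip + ["KDEpy"],
        pip + ["cytopy", "--no-deps"],
    ]


def apply_workaround():
    """Run the commands in order, stopping at the first failure."""
    for command in workaround_commands():
        subprocess.run(command, check=True)


# Print the commands for review; call apply_workaround() to actually run them.
for command in workaround_commands():
    print(" ".join(command))
```

Because pip skips dependency handling for that second install, any other packages CytoPy needs have to be installed separately.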

AlexanderWMacFarlaneIV commented 2 years ago

That worked.

Thanks for the quick replies.

I am trying to get FlowSOM clustering to work in Python for the purpose of automatic gating and identification of peripheral blood lymphocyte populations.

Maybe it could be integrated into CytoPy if I am able to get it working and you decide to take this up again.

jfgonsalves commented 2 years ago

Ross has actually implemented FlowSOM already and it works a treat! Take a look at the tutorial notebooks – it’s done quite elegantly!


AlexanderWMacFarlaneIV commented 2 years ago

That's what brought me here.

I meant that maybe automatic gating and identification of lymphocyte populations could be integrated into CytoPy if I can get it working.

burtonrj commented 2 years ago

If you're just interested in FlowSOM, I have a new package called CytoCluster that is much simpler to use, but it requires that your data is already clean and ready to cluster.

AlexanderWMacFarlaneIV commented 2 years ago

I can show you some code but drag and drop doesn't work in this window with .py files

burtonrj commented 2 years ago

v3.0 of CytoPy is going to decouple many of the tools so that it's not such a behemoth. CytoPy itself will depend on some smaller packages that can be used in isolation by those who don't need automated gating and a complex MongoDB database to track metadata.

Some of this work I've already started:

CytoTools [https://github.com/burtonrj/CytoTools] - reading files, transformations, dimension reduction, peak alignment and more
CytoPlots [https://github.com/burtonrj/CytoPlots] - classic FACS plots in Python and some more general tools for stuff like scatterplots to be used with UMAP, tSNE etc.
CytoCluster [https://github.com/burtonrj/CytoCluster] - as long as you have your results in a Pandas DataFrame you can apply all your popular clustering algorithms plus a novel ensemble clustering algorithm currently under peer review (pre-print here: https://doi.org/10.1101/2022.06.30.496829)

So if CytoPy v2.0 is being a pain, or feels a bit suffocating, or you just want more freedom, try using a combination of these packages.

Apologies that the documentation is currently a bit short for the three packages above. As I said, I'm very short on time at the moment and haven't quite given it all the attention I would like.

AlexanderWMacFarlaneIV commented 2 years ago

This reads in an fcs file, picks out 3 channels with CD3, CD19, and CD56, then identifies and plots T, B, and NK cells with KMeans clustering. You should be able to save it as .py and have it run.

import matplotlib.pyplot as plt
import numpy as np
import flowkit as fk
from sklearn.cluster import KMeans

# Get fcs file from specified location and call it inputArray
fcs_path = r'C:\Eclipse\ClusterTest\SourceFolder\GU114_28_T1_Treg_Lymphocytes.fcs'
rawData = fk.Sample(fcs_path)
inputArray = rawData._raw_events  # private attribute of the Sample object
inputArray = inputArray[np.all(inputArray >= -500, axis=1)]  # get rid of extreme negative outliers

# Extract parameters of interest from fcs file
inputData_CD3 = inputArray[:, 12]  # Index = 0
inputData_CD56 = inputArray[:, 6]  # Index = 1
inputData_CD19 = inputArray[:, 16]  # Index = 2
inputData_CD4 = inputArray[:, 9]  # Index = 3
inputData_CD8 = inputArray[:, 14]  # Index = 4
inputData_CD27 = inputArray[:, 8]  # Index = 5
inputData_CD127 = inputArray[:, 3]  # Index = 6
inputData_CD25 = inputArray[:, 11]  # Index = 7
inputData_CD45 = inputArray[:, 10]  # Index = 8
inputData_FSCA = inputArray[:, 0]  # Index = 9
inputData_SSCA = inputArray[:, 2]  # Index = 10
inputData_CD11b = inputArray[:, 5]  # Index = 11
inputData_KIR3DL1 = inputArray[:, 4]  # Index = 12
inputData_KIR3DL1_S1 = inputArray[:, 13]  # Index = 13

# Column order follows the Index comments above (CD56 is column 1)
inputArray = np.transpose(np.vstack((
    inputData_CD3, inputData_CD56, inputData_CD19, inputData_CD4, inputData_CD8,
    inputData_CD27, inputData_CD127, inputData_CD25, inputData_CD45, inputData_FSCA,
    inputData_SSCA, inputData_CD11b, inputData_KIR3DL1, inputData_KIR3DL1_S1)))
gatingArray_CD3CD56CD19 = np.transpose(np.vstack((inputData_CD3, inputData_CD56, inputData_CD19)))

# Translate data so the origin is at [1,1,1]
gatingArray_CD3CD56CD19[:, 0] -= gatingArray_CD3CD56CD19[:, 0].min() - 1
gatingArray_CD3CD56CD19[:, 1] -= gatingArray_CD3CD56CD19[:, 1].min() - 1
gatingArray_CD3CD56CD19[:, 2] -= gatingArray_CD3CD56CD19[:, 2].min() - 1

# Construct 3D array of log scaled data
gatingArray_CD3CD56CD19_log = np.log10(gatingArray_CD3CD56CD19)
data_CD3_log = gatingArray_CD3CD56CD19_log[:, 0]
data_CD56_log = gatingArray_CD3CD56CD19_log[:, 1]
data_CD19_log = gatingArray_CD3CD56CD19_log[:, 2]

# Perform clustering and extract labels
clusterResults = KMeans(n_clusters=4).fit(gatingArray_CD3CD56CD19_log)
labels = clusterResults.labels_
num_clusters = np.max(labels) + 1
samples_by_cluster = dict()
for i in range(num_clusters):
    samples_by_cluster[i] = np.flatnonzero(labels == i)

# Get index numbers of cluster members
c0_TBNK0ther = samples_by_cluster[0]
c1_TBNK0ther = samples_by_cluster[1]
c2_TBNK0ther = samples_by_cluster[2]
c3_TBNK0ther = samples_by_cluster[3]

# Extract cluster members from the 3D array
set0_TBNK0ther = gatingArray_CD3CD56CD19_log[c0_TBNK0ther, :]
set1_TBNK0ther = gatingArray_CD3CD56CD19_log[c1_TBNK0ther, :]
set2_TBNK0ther = gatingArray_CD3CD56CD19_log[c2_TBNK0ther, :]
set3_TBNK0ther = gatingArray_CD3CD56CD19_log[c3_TBNK0ther, :]

# Identify T cell cluster (highest mean CD3)
itentify_T = np.transpose(np.array([[0, 1, 2, 3], [np.mean(set0_TBNK0ther[:, 0]), np.mean(set1_TBNK0ther[:, 0]), np.mean(set2_TBNK0ther[:, 0]), np.mean(set3_TBNK0ther[:, 0])]]))
clusterNumberOfTcellsOnTop = np.flip(itentify_T[itentify_T[:, 1].argsort()], 0).astype(int)
clusterID_Tcells = clusterNumberOfTcellsOnTop[0, 0]
clusterOfTCellsIdentity = samples_by_cluster[clusterID_Tcells]
clusterOfTCells = gatingArray_CD3CD56CD19_log[clusterOfTCellsIdentity, :]

# Axes for plots
clusterOfTCells_CD3 = clusterOfTCells[:, 0]
clusterOfTCells_CD56 = clusterOfTCells[:, 1]
clusterOfTCells_CD19 = clusterOfTCells[:, 2]

# Identify B cell cluster (highest mean CD19)
itentify_B = np.transpose(np.array([[0, 1, 2, 3], [np.mean(set0_TBNK0ther[:, 2]), np.mean(set1_TBNK0ther[:, 2]), np.mean(set2_TBNK0ther[:, 2]), np.mean(set3_TBNK0ther[:, 2])]]))
clusterNumberOfBcellsOnTop = np.flip(itentify_B[itentify_B[:, 1].argsort()], 0).astype(int)
clusterID_Bcells = clusterNumberOfBcellsOnTop[0, 0]
clusterOfBCellsIdentity = samples_by_cluster[clusterID_Bcells]
clusterOfBCells = gatingArray_CD3CD56CD19_log[clusterOfBCellsIdentity, :]

# Axes for plots
clusterOfBCells_CD3 = clusterOfBCells[:, 0]
clusterOfBCells_CD56 = clusterOfBCells[:, 1]
clusterOfBCells_CD19 = clusterOfBCells[:, 2]

# Identify NK cell cluster (highest mean CD56)
itentify_NK = np.transpose(np.array([[0, 1, 2, 3], [np.mean(set0_TBNK0ther[:, 1]), np.mean(set1_TBNK0ther[:, 1]), np.mean(set2_TBNK0ther[:, 1]), np.mean(set3_TBNK0ther[:, 1])]]))
clusterNumberOfNKcellsOnTop = np.flip(itentify_NK[itentify_NK[:, 1].argsort()], 0).astype(int)
clusterID_NKcells = clusterNumberOfNKcellsOnTop[0, 0]
clusterOfNKCellsIdentity = samples_by_cluster[clusterID_NKcells]
clusterOfNKCells = gatingArray_CD3CD56CD19_log[clusterOfNKCellsIdentity, :]

# Remove CD56+ T contamination (need to remove from all sub-parameters of NK (here KIR3DL1))
clusterOfNKCells = clusterOfNKCells[np.where(clusterOfNKCells[:, 0] <= 3.5)]

# Axes for plots
clusterOfNKCells_CD3 = clusterOfNKCells[:, 0]
clusterOfNKCells_CD56 = clusterOfNKCells[:, 1]
clusterOfNKCells_CD19 = clusterOfNKCells[:, 2]

# Process of elimination: whatever cluster is left is "Other"
fourClusters = np.arange(4)
threeClusters = fourClusters[np.where(fourClusters != clusterID_Tcells)]
twoClusters = threeClusters[np.where(threeClusters != clusterID_NKcells)]
clusterID_Other = twoClusters[np.where(twoClusters != clusterID_Bcells)][0]
clusterOfOtherIdentity = samples_by_cluster[clusterID_Other]
clusterOfOther = gatingArray_CD3CD56CD19_log[clusterOfOtherIdentity, :]
clusterOfOther_CD3 = clusterOfOther[:, 0]
clusterOfOther_CD56 = clusterOfOther[:, 1]
clusterOfOther_CD19 = clusterOfOther[:, 2]

# 3D scatter plot of the four populations
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(clusterOfTCells_CD19, clusterOfTCells_CD56, clusterOfTCells_CD3, s=1, color="blue")
ax.scatter(clusterOfBCells_CD19, clusterOfBCells_CD56, clusterOfBCells_CD3, s=1, color="orange")
ax.scatter(clusterOfNKCells_CD19, clusterOfNKCells_CD56, clusterOfNKCells_CD3, s=1, color="red")
ax.scatter(clusterOfOther_CD19, clusterOfOther_CD56, clusterOfOther_CD3, s=1, color="green")
ax.set_xlabel('CD19')
ax.set_ylabel('CD56')
ax.set_zlabel('CD3')
ax.set_xlim(1, 6)
ax.set_ylim(1, 6)
ax.set_zlim(1, 6)
plt.show()
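The identification step in the script (pick the cluster with the highest mean for a given marker) can be written more compactly. This is a minimal sketch on made-up numbers, assuming the columns are ordered CD3, CD56, CD19 as in the script's log-scaled gating array; `cluster_with_highest_mean` is a hypothetical helper, not part of scikit-learn or FlowKit.

```python
import numpy as np


def cluster_with_highest_mean(data, labels, column):
    """Return the cluster label whose members have the highest mean in `column`."""
    cluster_ids = np.unique(labels)
    means = np.array([data[labels == c, column].mean() for c in cluster_ids])
    return int(cluster_ids[np.argmax(means)])


# Toy log-scaled events: columns are CD3, CD56, CD19 (values invented for illustration).
events = np.array([
    [5.0, 1.0, 1.0],
    [5.2, 1.2, 1.1],
    [1.0, 4.8, 1.0],
    [1.1, 5.0, 1.2],
    [1.0, 1.0, 5.1],
    [1.2, 1.1, 4.9],
])
labels = np.array([2, 2, 0, 0, 1, 1])  # stand-in for the labels_ array from KMeans

t_cells = cluster_with_highest_mean(events, labels, column=0)   # CD3 high
nk_cells = cluster_with_highest_mean(events, labels, column=1)  # CD56 high
b_cells = cluster_with_highest_mean(events, labels, column=2)   # CD19 high
print(t_cells, nk_cells, b_cells)
```

On real data the same cluster could win for two markers (for example CD56+ T cells), so the result is worth checking before treating it as a gate.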

AlexanderWMacFarlaneIV commented 2 years ago

Except that you don't have the fcs file...

I renamed it to .txt so I can upload it, but it is an fcs file. Uploading now.

AlexanderWMacFarlaneIV commented 2 years ago

GU114_28_T1_Treg_Lymphocytes.txt

AlexanderWMacFarlaneIV commented 2 years ago

Thanks for the help, guys, but this rabbit hole keeps going deeper. I can't use CytoPy without MongoDB, and the install of CytoCluster is broken by a need for kahypar 1.1.3 when pip says only 1.0 exists.

Hatchin's implementation of FlowSOM, which CytoPy references, is broken because something like as_matrix is no longer supported in pandas.
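For reference, `DataFrame.as_matrix()` was removed in pandas 1.0, and `DataFrame.to_numpy()` is its replacement. Below is a minimal sketch of a shim, assuming the failing call is a plain `as_matrix()` on a DataFrame (this has not been checked against the FlowSOM source); `frame_to_array` is a hypothetical helper name and the column names are invented.

```python
import pandas as pd


def frame_to_array(frame):
    """Return the DataFrame's values as a NumPy array on both old and new pandas."""
    if hasattr(frame, "to_numpy"):  # available from pandas 0.24 onwards
        return frame.to_numpy()
    return frame.as_matrix()        # only present on old pandas releases


df = pd.DataFrame({"CD3": [1.0, 2.0], "CD19": [3.0, 4.0]})
array = frame_to_array(df)
print(array.shape)
```

Patching the call site in a local copy of the package is the other option, at the cost of maintaining that copy.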

Unfortunately getting FlowSOM to work in Python is more complicated than I have time to deal with. I guess I will do it in R, or maybe call the R function from Python.
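If you go the route of calling R from Python, one low-tech option is to shell out to `Rscript`. This is a minimal sketch, assuming R is installed and `Rscript` is on the PATH; the script name `run_flowsom.R`, its two arguments, and the helper names are all hypothetical, so the R script itself would still have to be written.

```python
import subprocess


def rscript_command(script, fcs_path, out_csv):
    """Build the command line: Rscript <script> <input fcs> <output csv>."""
    return ["Rscript", script, fcs_path, out_csv]


def run_flowsom_in_r(fcs_path, out_csv, script="run_flowsom.R"):
    """Run the (hypothetical) R script and raise if it exits with an error."""
    subprocess.run(rscript_command(script, fcs_path, out_csv), check=True)


# Show the command that would be executed; nothing is run here.
print(rscript_command("run_flowsom.R", "sample.fcs", "clusters.csv"))
```

Writing the cluster labels to a CSV keeps the two languages decoupled, since the Python side only has to read a file back in.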

Please let me know if you ever get back to this and it becomes functional.

Thank you for taking the time to talk to me.

Alex

AlexanderWMacFarlaneIV commented 2 years ago

This is the answer to the original question:

KDEpy 1.0.10 is compiled for Python versions 3.6, 3.7, and 3.8. I am on 3.9.
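That diagnosis can be turned into a quick pre-install check. This is a minimal sketch, assuming the supported versions are exactly the ones listed in the comment above (they are not read from KDEpy's package metadata); `has_prebuilt_kdepy_1_0_10` is a hypothetical helper name.

```python
import sys

# Assumption: Python versions taken from the comment above, not from KDEpy's metadata.
KDEPY_1_0_10_PYTHONS = {(3, 6), (3, 7), (3, 8)}


def has_prebuilt_kdepy_1_0_10(version_info=sys.version_info):
    """True if this Python (major, minor) is one KDEpy 1.0.10 was built for."""
    return tuple(version_info[:2]) in KDEPY_1_0_10_PYTHONS


print(has_prebuilt_kdepy_1_0_10((3, 8)), has_prebuilt_kdepy_1_0_10((3, 9)))
```

Called with no argument, the function checks the interpreter it is running under, which is the one pip would install into when invoked as `python -m pip`.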