Closed taopeng1100 closed 3 years ago
The tutorial (https://github.com/veghp/pyVDJ/tree/master/tutorials) has a section about retrieving CDR3 sequences: Public, private and condition-specific clonotypes
meta = 'donor' # here you specify the category for clustering
adata = pyvdj.stats(adata, meta)
cdr3 = adata.uns['pyvdj']['stats'][meta]['cdr3']
Details of the stats() function are here: https://github.com/veghp/pyVDJ/blob/760c80fb7c7ecad18cb78b0a66d42193e8609667/pyvdj/stats.py#L40. Then you can query the clusters, for example set(cdr3['c1']). Alternatively, you can just take the original csv file with the VDJ data and filter for the cells of interest. This requires no pyVDJ.
For gene usage, the scirpy package is probably a better choice. Another option is Immunarch (https://immunarch.com/index.html). For example, geneUsage() (https://immunarch.com/reference/geneUsage.html).
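The csv route mentioned above can be sketched without pyVDJ. This is only a minimal illustration, assuming a 10x-style contig annotation table with barcode, chain, v_gene, j_gene and cdr3 columns. The column names, the inline toy data and the helper name filter_contigs are assumptions and do not come from this thread.

```python
import csv
import io
from collections import Counter

# Toy stand-in for the VDJ csv file (assumed 10x-style column names).
VDJ_CSV = """barcode,chain,v_gene,j_gene,cdr3
AAAC-1,TRA,TRAV1-2,TRAJ33,CAVMDSNYQLIW
AAAC-1,TRB,TRBV6-1,TRBJ2-1,CASSEGQGYEQFF
CCCG-1,TRB,TRBV6-1,TRBJ1-1,CASSLGTEAFF
GGGT-1,TRB,TRBV20-1,TRBJ2-7,CSARDRGYEQYF
"""


def filter_contigs(handle, barcodes):
    """Return the contig rows whose barcode is in `barcodes`."""
    wanted = set(barcodes)
    return [row for row in csv.DictReader(handle) if row["barcode"] in wanted]


# Barcodes of the cells of interest, e.g. exported from one cluster.
cells_of_interest = ["AAAC-1", "CCCG-1"]

rows = filter_contigs(io.StringIO(VDJ_CSV), cells_of_interest)
cdr3_seqs = sorted({row["cdr3"] for row in rows})  # unique CDR3 sequences
v_usage = Counter(row["v_gene"] for row in rows)   # simple V gene usage count

print(cdr3_seqs)
print(v_usage.most_common())
```

With a real file, pass an open file handle instead of the io.StringIO object. Counting v_gene per contig is a crude usage summary; dedicated packages handle chains and clonotypes more carefully.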
meta = 'adata.obs[["leiden"]=="4"]' # here you specify the category for clustering
adata = pyvdj.stats(adata, meta)
cdr3 = adata.uns['pyvdj']['stats'][meta]['cdr3']
I got the error:
KeyError Traceback (most recent call last)
Tao
I believe it should be just meta = 'leiden'; in my example, donor was the name of the category (column). Then you should be able to list the CDR3 sequences in cluster 4 with cdr3['4'].
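To illustrate why meta has to be a column name rather than a filter expression, here is a toy grouping function. It is not pyVDJ's implementation: the function name group_cdr3_by and the data are hypothetical, and it only mimics the idea that the result is keyed by the values of the chosen column.

```python
from collections import defaultdict

# Hypothetical per-cell records: one metadata column plus a CDR3 sequence.
cells = [
    {"leiden": "4", "cdr3": "CASSLGTEAFF"},
    {"leiden": "4", "cdr3": "CASSEGQGYEQFF"},
    {"leiden": "7", "cdr3": "CSARDRGYEQYF"},
]


def group_cdr3_by(records, meta):
    """Group CDR3 sequences by the values found in column `meta`."""
    groups = defaultdict(list)
    for record in records:
        # Raises KeyError if `meta` is not an existing column name.
        groups[record[meta]].append(record["cdr3"])
    return dict(groups)


# A column name works, and the result is indexed by cluster label.
cdr3 = group_cdr3_by(cells, "leiden")
print(set(cdr3["4"]))

# A filter expression passed as a string is not a column name.
try:
    group_cdr3_by(cells, 'adata.obs[["leiden"]=="4"]')
except KeyError as err:
    print("KeyError:", err)
```

The cluster label is then used as the key into the result, not as the value of meta.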
I think I did not explain well. I have one sample. After UMAP clustering analysis, the single cells fall into 12 clusters (leiden clusters). I would like to see which clonotypes are in cluster 4, with their VDJ gene usage and CDR3 sequences.
Tao
I think you will need to get the list of cells in cluster 4, then filter the VDJ dataframe using these cells. Then you can have a look at the data/columns. I hope something like this helps (not tried):
cluster4_cells = adata.obs.loc[adata.obs["leiden"].isin(["4"])]['vdj_obs'].tolist()
# where vdj_obs is the column name you generated when loading in the data
vdjdf = adata.uns['pyvdj']['df']
vdj_cluster4 = vdjdf.loc[vdjdf['barcode_meta'].isin(cluster4_cells)]
vdj_cluster4['cdr3']
However, it is probably better to use another up-to-date package. I'll archive this project because it is not maintained anymore.
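The filtering steps above can be tried on toy data before running them on a real object. The sketch below uses plain pandas DataFrames as stand-ins for adata.obs and adata.uns['pyvdj']['df']. The barcodes, sequences and the v_gene column are made up for illustration, and the actual columns in the pyVDJ dataframe may differ.

```python
import pandas as pd

# Toy stand-in for adata.obs: cluster label and the VDJ barcode of each cell.
obs = pd.DataFrame(
    {"leiden": ["4", "4", "7"], "vdj_obs": ["AAAC-1", "CCCG-1", "GGGT-1"]},
    index=["cell1", "cell2", "cell3"],
)

# Toy stand-in for adata.uns['pyvdj']['df'] (v_gene is an assumed column).
vdjdf = pd.DataFrame(
    {
        "barcode_meta": ["AAAC-1", "AAAC-1", "CCCG-1", "GGGT-1"],
        "v_gene": ["TRAV1-2", "TRBV6-1", "TRBV6-1", "TRBV20-1"],
        "cdr3": ["CAVMDSNYQLIW", "CASSEGQGYEQFF", "CASSLGTEAFF", "CSARDRGYEQYF"],
    }
)

# Step 1: barcodes of the cells assigned to cluster "4".
cluster4_cells = obs.loc[obs["leiden"] == "4", "vdj_obs"].tolist()

# Step 2: keep only the VDJ rows that belong to those cells.
vdj_cluster4 = vdjdf.loc[vdjdf["barcode_meta"].isin(cluster4_cells)]

print(vdj_cluster4["cdr3"].tolist())
print(vdj_cluster4["v_gene"].value_counts())
```

Note that the cluster label is compared as the string "4", since leiden labels are usually stored as strings rather than integers.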
Appreciate your help!
Can you help me determine the VDJ gene usage and CDR3 sequences for cells in specific leiden clusters?
Thx!
Tao