Open seth-ament opened 3 years ago
I need to run a script which looks through the recently-uploaded files and adds primary analyses so that they show up in the workbench. Yes, this isn't ideal and it should happen automatically upon upload - it just keeps being one of the things that isn't prioritized. Are these new datasets for which this hasn't happened yet? And also, yes, during upload the user should be able to specify which column provides cluster information. For now, the script auto-recognizes these names:
['cluster', 'cell_type', 'cluster_label', 'subclass_label']
thanks for that info joshua, another question: if there is more than 1 of those col names, which is used? the 1st? when we find time to do this, it would be great if we can find a solution that can work on column names that we specify and without having to add column/sample meta data and re-upload, in order to avoid recreating all the profiles and curated views that we have invested time in. carlo
Thanks, Joshua!
Carlo has curated several single-cell datasets in the CarloTEMP profile that seem not to have been connected yet to the workbench. Carlo, could you give Joshua a list of the datasets?
sure!
- Human Fetal Cortex scRNAseq (Nowakowski et al. 2017 Science) - using column "majortype4"
- Molecular identity of human outer radial glia during cortical development (Pollen et al 2015) - using column " Cell Class"
- Temporal patterning of apical progenitors and their daughter neurons in the developing neocortex (Telley et al 2019) - using column "cell.age.ch1"
- Single-cell RNAseq from mid-gestation human fetal cortex (Polioudakis et al 2019): subset - donor.372 (excitatory neurogenesis only) - using column "Cluster"
- Single-cell RNAseq from mid-gestation human fetal cortex (Polioudakis et al 2019): subset - donor.371 (excitatory neurogenesis only) - using column "Cluster"
- Single-cell RNAseq from mid-gestation human fetal cortex (Polioudakis et al 2019): subset - donor.370 (excitatory neurogenesis only) - using column "Cluster"
- Single-cell RNAseq from mid-gestation human fetal cortex (Polioudakis et al 2019): subset - donor.368 (excitatory neurogenesis only) - using column "Cluster"
- Single-cell RNAseq of 3872 cells across 22 regions in human fetal brain tissue at 22/23GW (Fan et al 2018) - using column "Group.1"
- Single-cell RNAseq of 2310 cells in human fetal cortex 8-26GW. (Zhong et al 2018) - using column "CellType02"
- Single-cell RNAseq of 12448 cells from human fetal cortex GW7-28 (Fan et al 2020) - using column "type"
thanks! carlo
The 1st in that list is the one which is taken, yes. And agreed about making it configurable.
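To make the "first match wins" rule and the requested per-dataset override concrete, here is a minimal sketch. It is not the actual script: the function name `pick_cluster_column` and the `override` parameter are hypothetical, and the case-insensitive comparison is a guess based on a column named "Cluster" reportedly being picked up later in this thread.

```python
from typing import Iterable, Optional, Sequence

# Names quoted earlier in the thread, in priority order.
AUTO_CLUSTER_COLUMNS = ("cluster", "cell_type", "cluster_label", "subclass_label")


def pick_cluster_column(
    obs_columns: Iterable[str],
    override: Optional[str] = None,
    candidates: Sequence[str] = AUTO_CLUSTER_COLUMNS,
) -> Optional[str]:
    """Return the observation column to use for cluster labels.

    Hypothetical helper: if `override` is given it is used as-is,
    otherwise the first name in `candidates` that is present wins.
    """
    columns = list(obs_columns)

    # A curator-specified column takes precedence over auto-recognition.
    if override is not None:
        if override not in columns:
            raise KeyError(f"column {override!r} not found in observations")
        return override

    # Assumption: names are compared case-insensitively.
    by_lower = {}
    for col in columns:
        by_lower.setdefault(col.lower(), col)

    # Walk the candidates in order so the earliest listed name wins.
    for name in candidates:
        match = by_lower.get(name.lower())
        if match is not None:
            return match
    return None
```

With something like this, a dataset whose clusters live in "majortype4" could be handled by passing `override="majortype4"` instead of renaming the column and re-uploading.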
I have run this script which looks at ALL uploaded datasets which are missing primary analysis and adds them. You can check out any existing datasets now.
Awesome, thanks! did you do this with the automatic column name recognition, or the specific ones i sent? i'll go look at some now, carlo
Sorry, I had only done it on the auto one first, but can redo it so those are recognized too this afternoon
thanks so much Joshua!
when exploring one of the datasets that had "Cluster" as the column name (and therefore worked when u ran the script), i can see the primary analysis is there, and the tSNE plot works, but when i ask for marker genes, i get:
- Error reporting marker genes
this was for the "Single-cell RNAseq from mid-gestation human fetal cortex (Polioudakis et al 2019): subset - donor.368 (excitatory neurogenesis only)" dataset
i understand if you want to drop this for now as u finish data set manager / gEAR manuscript issues
Trying to wrap up the tasks I had ongoing. One thing, Shaun identified that the marker gene step is case-sensitive for the gene names you search. He plans to have a fix in the coming days, but could you try to be sure to enter the gene names with the proper case for your organism and see if that's the problem atm?
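One way a case-insensitive gene lookup could work is sketched below. This is not Shaun's fix, only an illustration under assumptions: the helper `resolve_gene_symbols` is hypothetical, and it assumes the dataset's gene symbols are available as a plain list of strings.

```python
from typing import Iterable, List, Tuple


def resolve_gene_symbols(
    requested: Iterable[str], available: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Map user-typed gene names onto the dataset's own spelling.

    Hypothetical helper: returns (found, missing), where `found` holds
    the dataset's casing for each match and `missing` holds the inputs
    that matched nothing.
    """
    # Index the dataset's symbols by a case-folded key; keep the first
    # spelling seen if two symbols differ only by case.
    lookup = {}
    for symbol in available:
        lookup.setdefault(symbol.casefold(), symbol)

    found: List[str] = []
    missing: List[str] = []
    for name in requested:
        match = lookup.get(name.strip().casefold())
        if match is None:
            missing.append(name)
        else:
            found.append(match)
    return found, missing
```

Returning the unmatched names separately lets the interface say which genes were not found instead of failing the whole request.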
thnx joshua
to be clear - the " Error reporting marker genes " error came when i asked the workbench to get marker genes from clusters, not when i ask for a specific gene name
hey joshua, when u get the data set manager and gEAR manuscript off your plate: can u run the "script which looks at ALL uploaded datasets which are missing primary analysis and adds them" again? i have put up some new data since you sent the note above. thnx, carlo
I just ran it again.
Thanks!
For many of the scRNA-seq datasets that Carlo and I have uploaded, the UMAP/tSNE displays well on the front page, but the clusters from the primary analysis are not available in the Single-Cell Workbench. I think this is an issue about how we are labeling columns in the observations tab. Could you remind us how to label everything so that the clusters are available in the workbench?
(Eventually, it would be nice for one to be able to specify the columns for the primary analysis while curating the dataset. But that's an enhancement that can wait.)