nemoarchive / analytics

Repository for the NeMO Analytics project.
MIT License

Incorrect gene symbols in POU3F2oeNPC #75

Closed by jorvis 5 years ago

jorvis commented 5 years ago

Carlo - Are you responsible for the dataset here?

/local/projects-t3/NEMO/incoming/brain/other/grant/development/Sament/POU3F2oeNPC

It failed our upload processing because none of the genes could be looked up. Looking into the gene symbol (index) column, they are all like this:

15176.RN28S1
41553.RN28S1
6185.A_33_P3396434
53475.RPS2
6444.EEF1A1
22702.ZNF865
21636.RPL13
57926.PQLC2
26165.RPS27
42003.ACTB
51067.UBC

I assume the gene symbols are the parts after the period? How did this format happen?
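For illustration only, here is a minimal sketch of how the part after the period could be pulled out of IDs like the ones above. The function name `split_prefixed_id` is hypothetical and not part of the upload scripts. It assumes the prefix is always a run of digits followed by a single period. Note that an entry such as `6185.A_33_P3396434` would yield something that looks more like a probe identifier than a gene symbol, so this split alone may not give clean symbols.

```python
from typing import List, Tuple


def split_prefixed_id(row_id: str) -> Tuple[str, str]:
    """Split '15176.RN28S1' into ('15176', 'RN28S1').

    Assumption: the prefix is purely numeric and ends at the first period.
    IDs that do not match that shape are returned unchanged with an empty prefix.
    """
    prefix, sep, rest = row_id.partition(".")
    if not sep or not prefix.isdigit():
        return "", row_id
    return prefix, rest


# Sample IDs copied from the index column quoted above.
ids: List[str] = ["15176.RN28S1", "6185.A_33_P3396434", "53475.RPS2"]
symbols: List[str] = [split_prefixed_id(i)[1] for i in ids]
```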

carlocolantuoni commented 5 years ago

Yes, that's all right. The data come from Seth's group; I moved it in. The proper gene symbols are in the gene/row metadata file, in the "gene_symbol" column. Can you use those?


jorvis commented 5 years ago

I could modify the data to work, but I brought it up so that we can hopefully address the issue upstream and our automated scripts will work on these files. Currently the spec is that the files will have a first/index column of either Ensembl IDs or gene symbols. Anything else, and the scripts that process them in an automated fashion will break.
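A pre-upload check along these lines might catch the problem before the automated scripts do. This is only a sketch, not the actual processing code: the function `find_unrecognized`, the `known_symbols` set, and the Ensembl ID pattern (`ENS`, optional species letters, `G`, 11 digits, optional version suffix) are all assumptions made for illustration.

```python
import re
from typing import Iterable, List, Set

# Assumed shape of an Ensembl gene ID, e.g. ENSG00000075624 or ENSMUSG00000029580.2
ENSEMBL_GENE_ID = re.compile(r"^ENS[A-Z]*G\d{11}(\.\d+)?$")


def find_unrecognized(index_values: Iterable[str], known_symbols: Set[str]) -> List[str]:
    """Return index values that are neither an Ensembl gene ID nor a known symbol.

    `known_symbols` is a hypothetical reference set supplied by the caller;
    how the real pipeline looks up gene symbols is not described in this thread.
    """
    unrecognized = []
    for value in index_values:
        if ENSEMBL_GENE_ID.match(value) or value in known_symbols:
            continue
        unrecognized.append(value)
    return unrecognized


known = {"ACTB", "RPS2"}
bad = find_unrecognized(["ACTB", "ENSG00000075624", "42003.ACTB"], known)
```

An empty result would mean every index value matched one of the two accepted forms; anything returned would need fixing upstream before upload.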

carlocolantuoni commented 5 years ago

ok - i will reformat and reupload


carlocolantuoni commented 5 years ago

I re-uploaded it with gene symbols as rownames to: /autofs/encrypted/NEMO/incoming/brain/other/grant/development/Sament/POU3F2oeNPC

Also, I can't find "ARKctxDevo400sc" under the datasets for the NemoCurator. How can I find it?

jorvis commented 5 years ago

Go to the user menu in the upper right and open Dataset Manager. You'll need to have or create a profile you want to add the dataset to, then search for the dataset by entering "ARKctxDevo400sc" in the search box under the heading "Find other public datasets". Then add it to your profile so it will show up on the front page when you have that profile selected.

[Screenshot from 2019-10-07 09-19-25]