uec / Issue.Tracker

Automatically exported from code.google.com/p/usc-epigenome-center

Submit example Level II and III data to TCGA #7

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago

The guidelines for this have been recently revised several times, so we need to 
reformulate levels.  Steps:
1) Define levels and file formats (ben)
2) Create sample MAGE-tab (ben)
3) Submit to Ari for approval (ben)
4) (Concurrently) Zack to formulate a plan to generate these automatically using 
the pipeline.

----

Hi Ben,

Please see the RNASeq specification 
https://wiki.nci.nih.gov/display/TCGA/RNASeq+Data+Format+Specification

Level 1 - BAM
Level 2 - WIG and ?Variants?
Level 3 - Quantification-DNA Methylation

Will there be variant data?

Example MAGE-TAB
https://tcga-data.nci.nih.gov/tcgafiles/ftp_auth/distro_ftpusers/anonymous/tumor/coad/cgcc/unc.edu/RNASeq/unc.edu_COAD.IlluminaGA_RNASeq.mage-tab.1.0.0/
The only thing that doesn't jibe with the spec is that WIG files are listed as 
level 3 in this example. WIG is now level 2.
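
As a rough illustration of the level assignment above, here is a minimal sketch that maps a data file to its level by extension. The function and table names are hypothetical, and only the two formats named explicitly above (BAM and WIG) are covered; the Level 3 quantification format is not specified here, so it is deliberately left out.

```python
from pathlib import Path

# Hypothetical lookup table, taken from the level list above:
# BAM is Level 1 and WIG is Level 2 under the revised spec.
LEVEL_BY_EXTENSION = {
    ".bam": 1,
    ".wig": 2,
}


def data_level(filename):
    """Return the data level for a file, or None if the format is not listed."""
    return LEVEL_BY_EXTENSION.get(Path(filename).suffix.lower())
```

A check like this could flag a MAGE-TAB that still lists a WIG file at level 3, as the linked example does.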

Do create some example/text MAGE-TAB docs to share with us.

I've started a DNASeq-based Methylation space in the Member wiki at 
https://wiki.nci.nih.gov/x/DwRhAg
to capture all the specifications. PLEASE feel free to add to, modify, or 
comment on anything in this space. If you do not have access to this wiki 
space, please contact NCICB Support http://ncicb.nci.nih.gov/NCICB/support

We really need to get some examples of your level 3 files and any other files 
if they differ from your current ones. Could you place some examples in your 
submission other directory and let us know when the transfer has completed? 
Or, even better, if the files are not very large attach them to this wiki page 
https://wiki.nci.nih.gov/x/HARhAg

The most recent standalone validator is at https://wiki.nci.nih.gov/x/kA1LAQ
It validates WIG files right now if you use the flags "-noremote -centerType 
GSC". You will have to wait on submitting WIG files until the DCC can modify 
our production site to accept WIG files from non-GSCs.

We also need the following info:
1. Vendor Name
2. Platform Name
3. Suggested platform code
4. Web page URL that links to the vendor’s site that describes the new 
platform.

Original issue reported on code.google.com by benb...@gmail.com on 30 Mar 2011 at 7:55

GoogleCodeExporter commented 8 years ago
I am working on trying to finalize a MAGE-TAB format for WGBS.  I am basing it 
on a standard proposed by Prachi at the DCC.  I think we will leave out the VCF 
file initially, since it's protected.

Please let me know if you have any suggestions.  I would like to submit a 
sample version, along with a set of data files (bed6+2) to Anna as soon as 
possible.  Then we should work on automatically generating it from the same 
code as the SRA XML.  To avoid duplication, we may want to consider leaving 
mapper command line parameters out since they are already in the SRA XML.

Are we currently generating a global "coverage" wig for these (basically the 
"raw" findpeaks output)?  That seems to be something TCGA analysts would like, 
so we should generate it if we are not already.  In general, this should be in 
all pipelines, not just ChIP-seq.
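
To make the "coverage" wig idea concrete, here is a minimal sketch that counts per-base read depth and writes it out as variableStep WIG lines. This is not how findpeaks produces its output; the function name and the input format (a list of BED-style, 0-based half-open `(chrom, start, end)` tuples) are assumptions for illustration only.

```python
from collections import defaultdict


def coverage_wig_lines(reads, track_name="coverage"):
    """Build variableStep WIG lines from (chrom, start, end) read intervals.

    Input coordinates are assumed to be 0-based, half-open (BED-style).
    WIG positions are 1-based, so each position is shifted by one on output.
    """
    # depth[chrom][pos] = number of reads covering that 0-based position
    depth = defaultdict(lambda: defaultdict(int))
    for chrom, start, end in reads:
        for pos in range(start, end):
            depth[chrom][pos] += 1

    lines = [f'track type=wiggle_0 name="{track_name}"']
    for chrom in sorted(depth):
        lines.append(f"variableStep chrom={chrom}")
        for pos in sorted(depth[chrom]):
            lines.append(f"{pos + 1} {depth[chrom][pos]}")
    return lines
```

Counting base by base keeps the sketch short but would be far too slow and memory-hungry for a whole genome; a real pipeline step would more likely work from sorted alignments or an existing coverage tool.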

Original comment by benb...@gmail.com on 2 Aug 2012 at 1:25

GoogleCodeExporter commented 8 years ago

Original comment by benb...@gmail.com on 16 Aug 2012 at 2:48

GoogleCodeExporter commented 8 years ago
sample submission document for TCGA-AA-3518 sent to group for review

Original comment by zack...@gmail.com on 23 Aug 2012 at 9:55

GoogleCodeExporter commented 8 years ago
Please get Bis-SNP info from Yaping and coordinate with Moiz to send to Anna 
(cc'ing Prachi since this is based on her proposed sequencing spec).

In general, as many of our pipeline components as possible should log their 
command line arguments.  Maybe wrapper scripts should dump these to stdout, 
where the run scraper can pick them up and load them into the QC database?
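
A minimal sketch of such a wrapper, assuming a Python script and a made-up `CMDLINE` tag that a scraper could search for in stdout; neither the tag nor the function names come from our existing pipeline code.

```python
import shlex
import subprocess
import sys


def format_command_log(argv, tag="CMDLINE"):
    """Return one tab-separated line recording the exact command line.

    The tag is a hypothetical marker chosen for this sketch.
    shlex.join (Python 3.8+) quotes arguments so the line can be re-parsed.
    """
    return f"{tag}\t{shlex.join(argv)}"


def run_logged(argv):
    """Print the command line to stdout, then run it and return its exit code."""
    print(format_command_log(argv), flush=True)
    return subprocess.call(argv)


def main(argv=None):
    """Entry point: treat all arguments as the command to wrap."""
    argv = sys.argv[1:] if argv is None else argv
    if not argv:
        print("usage: log_wrapper.py COMMAND [ARGS...]", file=sys.stderr)
        return 2
    return run_logged(argv)
```

Saved as a script that ends with `sys.exit(main())`, this could be put in front of any pipeline step, so the logged line appears in the same stdout stream as the step's own output.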

Original comment by benb...@gmail.com on 23 Aug 2012 at 10:26

GoogleCodeExporter commented 8 years ago
Update: 
I've given the COAD submission to Prachi for review. There are still details 
regarding the official spec that she is working out with SWG. Anna is no 
longer with TCGA, so all sequencing submission issues should go to Prachi.

As of now, the ball is out of our court. We will need to wait for them to 
finalize the spec (hopefully taking into account our recommendations) and 
get back to us with the final submission procedures.  

Original comment by zack...@gmail.com on 7 Sep 2012 at 12:45

GoogleCodeExporter commented 8 years ago
I think maybe we only sent them the MAGE-TAB file, not the level 2 and level 3 
data.  I will send it to Todd Pihl.

Original comment by benb...@gmail.com on 8 Nov 2012 at 10:05

GoogleCodeExporter commented 8 years ago

Original comment by zack...@gmail.com on 12 Feb 2014 at 11:33

GoogleCodeExporter commented 8 years ago
submitted all levels

Original comment by zack...@gmail.com on 23 Jul 2014 at 6:03