uec / Issue.Tracker

Automatically exported from code.google.com/p/usc-epigenome-center

Submit Level I BAM data to SRA (using our TCGA dbGaP key) #6

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
I have gotten permission to get a key from Martin Shumway, but have not yet 
received it.

On Fri, Mar 18, 2011 at 2:16 PM, Shumway, Martin (NIH/NLM/NCBI) [E] 
<shumwaym@ncbi.nlm.nih.gov> wrote:
> Hi Ben,
> We will create the necessary accounts for you.  Please look for a JIRA email 
> with this subject in it for further instructions.
> Thanks, Martin

Once we get a key, the next steps are:
1) Determine file format (Ben)
2) Write a workflow to generate BAMs with correctly formatted headers and 
packaging. For the initial colon data, we may have to do a one-off. (Zack)

Original issue reported on code.google.com by benb...@gmail.com on 30 Mar 2011 at 7:29

GoogleCodeExporter commented 8 years ago
Got word back from NCBI.  See the attached text below.  Now there are two concrete next steps:

1) Gather Center information (Ben)
2) Generate Aspera key (Zack?)

----

Stine, Adam commented on TR-7121:
---------------------------------

Hi Ben,

I was waiting for a response from you, but due to a feature of our ticket 
tracker, you never actually were sent the email.  I apologize for the confusion.

In order to set up the transfer path for dbGaP data transfer we will need to do 
the following things:
1. Establish a Center account.
2. Establish an Aspera transfer account with the necessary keys.

To create a new Center, please provide the following information:
1. suggested center abbreviation (8 char max)
2. center name (full)
3. center URL
4. center mailing address (including country and postcode)
5. phone number (main phone for center or lab)
6. contact person (someone likely to remain at the location for an extended 
time)
7. contact email (ideally a service account monitored by several people)
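
As a rough aid for gathering the items above, here is a minimal Python sketch that checks a draft of the seven fields before replying. The field names and the example values are hypothetical; only the 8-character limit on the abbreviation comes from the list itself.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class CenterInfo:
    """The seven requested items; the field names are ours, not NCBI's."""
    abbreviation: str
    name: str
    url: str
    mailing_address: str
    phone: str
    contact_person: str
    contact_email: str


def check_center_info(info: CenterInfo) -> List[str]:
    """Return a list of problems; an empty list means the draft looks complete."""
    problems = []
    for f in fields(info):
        if not getattr(info, f.name).strip():
            problems.append(f"{f.name} is empty")
    if len(info.abbreviation) > 8:
        problems.append("abbreviation is longer than 8 characters")
    return problems


# Placeholder values for illustration only.
draft = CenterInfo(
    abbreviation="EXAMPLE",
    name="Example Sequencing Center",
    url="http://example.org",
    mailing_address="1 Example St, Example City, 00000, USA",
    phone="+1 555 0100",
    contact_person="A. Person",
    contact_email="sra-contact@example.org",
)
print(check_center_info(draft))
```

The check only catches blank fields and an over-long abbreviation; it does not try to validate the URL, address, or email format.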

To create the aspera account please see the following guide:
http://www.ncbi.nlm.nih.gov/books/NBK47532/#SRA_Submission_Guid.5_Submitting_Data

Section 5.3.3 covers Aspera and provides the method for generating the key pair 
needed for the aspera accounts. Please pay close attention to the key formats.

Please let me know if you have any questions.

Original comment by benb...@gmail.com on 30 Mar 2011 at 7:49

GoogleCodeExporter commented 8 years ago
I've generated the PuTTY keys using Windows PuTTY as specified by the doc. The key 
files are on my desktop and have been sent to Ben via email.

Original comment by zack...@gmail.com on 30 Mar 2011 at 7:50

GoogleCodeExporter commented 8 years ago
---------- Forwarded message ----------
From: Shumway, Martin (NIH/NLM/NCBI) [E] <shumwaym@ncbi.nlm.nih.gov>
Date: Mon, Apr 4, 2011 at 2:02 PM
Subject: RE: (TR-7121) protected sra access for new TCGA center
USC-JHU Genome Characterization Center
To: NLM/NCBI List tkt-trace <tkt-trace@ncbi.nlm.nih.gov>, Ben Berman
<benbfly@gmail.com>
Cc: "Sofia, Heidi (NIH/NHGRI) [E]" <heidi.sofia@nih.gov>

Hi Ben,

Here are some instructions for new submitters:

The documentation on analysis submissions is:
http://www.ncbi.nlm.nih.gov/books/NBK49167/

A specific protocol for TCGA submissions has been written:
http://www.ncbi.nlm.nih.gov/books/NBK51672/

Here is an example integrated submission including analysis,
experiment, and runs:
ftp://ftp.ncbi.nih.gov/sra/examples/SRA029111/

Best regards,
Martin Shumway
SRA curator

Original comment by benb...@gmail.com on 11 Apr 2011 at 10:46

GoogleCodeExporter commented 8 years ago
---------- Forwarded message ----------
From: Stine, Adam <stineaj@ncbi.nlm.nih.gov>
Date: Mon, Apr 4, 2011 at 1:51 PM
Subject: (TR-7121) protected sra access for new TCGA center USC-JHU Genome 
Characterization Center
To: Ben Berman <benbfly@gmail.com>, "Stine, Adam" <stineaj@ncbi.nlm.nih.gov>, 
tkt-trace@ncbi.nlm.nih.gov

   [ http://jira.be-md.ncbi.nlm.nih.gov/browse/TR-7121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1231320#action_1231320 ]

Stine, Adam commented on TR-7121:
---------------------------------

Hi Ben,

Your aspera account has been created.  Your account's name is : asp-usc-jhu

An example transfer would look like:

       ascp -i <key file> -Q -l 200m <file(s) to transfer> asp-usc-jhu@gap-upload.ncbi.nlm.nih.gov:<directory>
-where  <directory> is either 'test' or 'protected'
Do not set the -T option for protected transfers.
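
For scripting uploads, a minimal Python sketch along these lines could assemble the same command. It is only an illustration: the function name, the key file name, and the BAM file name are hypothetical, and the flags are limited to the ones shown in the example transfer (-i, -Q, -l). The destination directory is restricted to 'test' or 'protected', and -T is never added.

```python
import shlex
from typing import List

UPLOAD_HOST = "gap-upload.ncbi.nlm.nih.gov"
ALLOWED_DIRS = ("test", "protected")


def build_ascp_command(key_file: str, files: List[str], account: str,
                       directory: str, rate: str = "200m") -> List[str]:
    """Assemble the ascp argument list in the shape shown in the email."""
    if directory not in ALLOWED_DIRS:
        raise ValueError(f"directory must be one of {ALLOWED_DIRS}")
    if not files:
        raise ValueError("no files to transfer")
    # -T is deliberately never added, per the note about protected transfers.
    return ["ascp", "-i", key_file, "-Q", "-l", rate, *files,
            f"{account}@{UPLOAD_HOST}:{directory}"]


# Hypothetical key and file names; start with the 'test' directory.
cmd = build_ascp_command("usc_key.ppk", ["sample1.bam"], "asp-usc-jhu", "test")
print(shlex.join(cmd))
```

Returning a list instead of a single string means the result can be passed straight to subprocess.run without shell quoting problems; shlex.join is used only to print it for review.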

To view what is currently in your account directories, establish a secure 
connection to the SRA using putty.exe with your private key.
For example:
       putty.exe -i <key file>  asp-****@gap-upload.ncbi.nlm.nih.gov
'ssh' can also be used, but this will require an OpenSSH formatted version of 
your private key.  Puttygen can be used to convert keys.
Once connected, you may use the "ls" command to view the directory.

You will not be able to change directories (e.g., use of the "cd" command is 
disabled). Valid ls commands include:
       ls -l test
#lists the content of the 'test' subdir in long format
       ls -l incoming
#lists the content of incoming subdir in long format
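
For anyone checking the account from a non-Windows machine, the key conversion and listing steps might be scripted roughly as below. This is a sketch under assumptions: it presumes the command-line puttygen that ships with PuTTY on Unix, and it presumes the server accepts a single remote 'ls' command over ssh, which the email does not confirm. The function names and key file names are hypothetical.

```python
import shlex
from typing import List

UPLOAD_HOST = "gap-upload.ncbi.nlm.nih.gov"


def build_key_conversion(ppk_file: str, openssh_file: str) -> List[str]:
    """puttygen (Unix command-line build) call that exports an OpenSSH private key."""
    return ["puttygen", ppk_file, "-O", "private-openssh", "-o", openssh_file]


def build_listing(openssh_key: str, account: str, subdir: str) -> List[str]:
    """ssh call that runs a single 'ls -l <subdir>' on the upload host."""
    # 'cd' is disabled on the server, so only plain subdirectory names make sense.
    if not subdir or "/" in subdir or subdir.startswith("-"):
        raise ValueError("subdir must be a plain directory name")
    return ["ssh", "-i", openssh_key, f"{account}@{UPLOAD_HOST}", "ls", "-l", subdir]


# Hypothetical key file names.
convert_cmd = build_key_conversion("usc_key.ppk", "usc_key_openssh")
list_cmd = build_listing("usc_key_openssh", "asp-usc-jhu", "test")
print(shlex.join(convert_cmd))
print(shlex.join(list_cmd))
```

If the server only allows interactive sessions, drop the trailing 'ls -l <subdir>' arguments and type the command after connecting.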

For file transfer tips please read
http://www.ncbi.nlm.nih.gov/books/NBK47527/

Please let me know if you have any questions.
Adam Stine

Original comment by benb...@gmail.com on 11 Apr 2011 at 10:51

GoogleCodeExporter commented 8 years ago
---------- Forwarded message ----------
From: Kahn, Ari (NIH/NCI) [C] <arik@mail.nih.gov>
Date: Tue, Mar 29, 2011 at 10:04 AM
Subject: Re: Protected SRA access for TCGA GCC
To: Ben Berman <benbfly@gmail.com>
Cc: List TCGA-DCC-BINF-L <TCGA-DCC-BINF-L@list.nih.gov>, Zack Ramjan 
<ramjan@usc.edu>, PeterLaird <plaird@usc.edu>

Hi Ben,

Please see the RNASeq specification 
https://wiki.nci.nih.gov/display/TCGA/RNASeq+Data+Format+Specification

Level 1 - BAM
Level 2 - WIG and ?Variants?
Level 3 - Quantification-DNA Methylation

Will there be variant data?

Example MAGE-TAB
https://tcga-data.nci.nih.gov/tcgafiles/ftp_auth/distro_ftpusers/anonymous/tumor/coad/cgcc/unc.edu/RNASeq/unc.edu_COAD.IlluminaGA_RNASeq.mage-tab.1.0.0/
The only thing that doesn't jibe with the spec is that WIG files are listed as 
level 3 in this example. WIG is now level 2.

Do create some example/text MAGE-TAB docs to share with us.

I've started a DNASeq based Methylation space in the Member wiki at 
https://wiki.nci.nih.gov/x/DwRhAg
to capture all the specifications. PLEASE feel free to add to, modify, or 
comment on anything in this space. If you do not have access to this wiki 
space, please contact NCICB Support http://ncicb.nci.nih.gov/NCICB/support

We really need to get some examples of your level 3 files and any other files 
if they differ from your current ones. Could you place some examples in your 
submission other directory and let us know when the transfer has completed? 
Or, even better, if the files are not very large attach them to this wiki page 
https://wiki.nci.nih.gov/x/HARhAg

The most recent standalone validator is at https://wiki.nci.nih.gov/x/kA1LAQ
It validates WIG files right now if you use the flags "-noremote -centerType 
GSC". You will have to wait on submitting WIG files until the DCC can modify 
our production site to accept WIG files from non-GSCs.

We also need the following info:
1. Vendor Name
2. Platform Name
3. Suggested platform code
4. Web page URL that links to the vendor’s site that describes the new 
platform.

Ari

Original comment by benb...@gmail.com on 11 Apr 2011 at 10:55

GoogleCodeExporter commented 8 years ago
CGHub is online; merging into ticket 229.

Original comment by zack...@gmail.com on 8 Nov 2012 at 10:50