rbloom5 / ImmuneRep


Data cleaning #4

Closed rbloom5 closed 9 years ago

rbloom5 commented 9 years ago

Added a file, QCpipeline, that has all the functions we need to download .sra files and check their quality.

getdata: I modified this a bit from robby's. I simplified the downloading: fastq-dump automatically retrieves the .sra file from the FTP server, so you don't have to download it yourself. This also avoids having to store the .sra file, since fastq-dump just gives you the fastq.
I also took out the sys.argv handling and just made it a normal function. Let me know if others prefer the sys.argv format, but I figured this was easier.
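
For reference, here is a minimal sketch of what a function-style getdata could look like. It assumes fastq-dump from the SRA Toolkit is on the PATH and is given the accession directly. The helper name build_fastq_dump_cmd and the exact signature are illustrative, not necessarily what is in QCpipeline.

```python
import subprocess
from pathlib import Path


def build_fastq_dump_cmd(srr, outdir):
    """Return the fastq-dump command line for one SRR accession (hypothetical helper)."""
    # --outdir tells fastq-dump where to write the resulting .fastq file.
    return ["fastq-dump", "--outdir", str(outdir), srr]


def getdata(srr, outdir):
    """Fetch one run by accession and return the expected .fastq path.

    Assumes fastq-dump is installed; it is handed the accession, so no
    separate .sra download step is written here.
    """
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if fastq-dump exits non-zero.
    subprocess.run(build_fastq_dump_cmd(srr, outdir), check=True)
    return outdir / f"{srr}.fastq"
```

Called as getdata("SRR000001", "data"), where the accession is only a placeholder, this would leave data/SRR000001.fastq on disk if the download succeeds.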

QCreport: generates a QC report from the fastq file.
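
A rough sketch of the QC step, assuming the report comes from the FastQC command-line tool (fastqc) and that it is on the PATH. The helper build_fastqc_cmd and the report_dir parameter are made up for illustration and may not match the real QCreport.

```python
import subprocess
from pathlib import Path


def build_fastqc_cmd(fastq_path, report_dir):
    """Return the fastqc command line for one fastq file (hypothetical helper)."""
    # --outdir sends the HTML and zip reports to report_dir instead of
    # the directory that holds the fastq file.
    return ["fastqc", "--outdir", str(report_dir), str(fastq_path)]


def QCreport(fastq_path, report_dir):
    """Run FastQC on one fastq file and return the expected HTML report path."""
    fastq_path = Path(fastq_path)
    report_dir = Path(report_dir)
    # FastQC expects the output directory to exist already.
    report_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_fastqc_cmd(fastq_path, report_dir), check=True)
    # FastQC names its report <input name without .fastq>_fastqc.html.
    return report_dir / f"{fastq_path.stem}_fastqc.html"
```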

getdata_QC combines both into one script: just pass in the SRR numbers and the directory where you want to store them, and it will download each one and check its QC.
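
The combined step is essentially a loop over accessions. The sketch below takes the download and report steps as arguments (fetch and report, both hypothetical parameters) so it can be read on its own. The real script presumably calls getdata and QCreport directly.

```python
def getdata_QC(srr_numbers, outdir, fetch, report):
    """Download each SRR accession, then run QC on the resulting fastq.

    fetch(srr, outdir) should return the fastq path, and
    report(fastq, outdir) should return the report path. Both are
    stand-ins for getdata and QCreport.
    """
    reports = {}
    for srr in srr_numbers:
        fastq = fetch(srr, outdir)
        # Map each accession to its QC report for later inspection.
        reports[srr] = report(fastq, outdir)
    return reports
```

Usage would be something like getdata_QC(["SRR000001", "SRR000002"], "data", getdata, QCreport), with placeholder accessions.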

Let me know if you disagree with any of the changes before accepting the pull request and I can modify accordingly.