Closed jfear closed 6 years ago
Lee provides this data processing information as part of his GEO entry:
We performed base calling using CASAVA version 1.8.2 (Illumina, San Diego, CA).
We mapped RNA-Seq reads to the Drosophila genome assembly (Release 6) with ERCC spike-in sequences (Jiang et al., 2011. PMID: 21816910) using TopHat 2.0.10 (Trapnell et al., 2009. PMID: 19289445). We used -g 1 and -G parameters. We removed measurements for those problematic controls in our submission. A gene model is required for this setting (-G). We used FlyBase (version 6.06). Only major chromosome arms (chr2L, chr2R, chr3L, chr3R, chr4, chrX, chrY, and chrM) were used for mapping.
Read counts were obtained using HTSeq (Anders et al., 2014. PMID: 25260700) with default parameters. We calculated normalized, gene-level expression using Cufflinks 2.2.1 (Trapnell et al., 2010. PMID: 20436464), which gave us Fragments per Kilobase per Million mapped reads (FPKM) values. -G, -b, and -u parameters were used.
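For reference while reproducing this, here is a rough sketch of how the three command lines described above might be assembled. It only uses the flags named in the GEO entry (`-g 1` and `-G` for TopHat; `-G`, `-b`, `-u` for Cufflinks; no options for htseq-count) plus `-o` for output directories. Every file name and path is a hypothetical placeholder, not something taken from Lee's submission, and the SAM input for htseq-count assumes TopHat's `accepted_hits.bam` has been converted beforehand.

```python
import shlex

# Hypothetical placeholders; none of these paths come from the GEO entry.
BOWTIE_INDEX = "dmel_r6_ercc"        # Bowtie2 index base: major arms + ERCC
GENOME_FASTA = "dmel_r6_ercc.fa"     # FASTA passed to Cufflinks -b
GTF = "flybase_r6.06.gtf"            # FlyBase 6.06 gene models
READS = "sample.fastq.gz"
TOPHAT_OUT = "tophat_out"
CUFFLINKS_OUT = "cufflinks_out"


def tophat_cmd(index, reads, gtf, out_dir):
    """TopHat with -g 1 (max one alignment per read) and -G (gene models)."""
    return ["tophat", "-g", "1", "-G", gtf, "-o", out_dir, index, reads]


def htseq_count_cmd(sam, gtf):
    """htseq-count with default parameters: alignments, then annotation."""
    return ["htseq-count", sam, gtf]


def cufflinks_cmd(bam, gtf, genome_fasta, out_dir):
    """Cufflinks with -G (quantify against the GTF only),
    -b (fragment bias correction, needs the genome FASTA),
    and -u (multi-read correction)."""
    return ["cufflinks", "-G", gtf, "-b", genome_fasta, "-u",
            "-o", out_dir, bam]


if __name__ == "__main__":
    bam = f"{TOPHAT_OUT}/accepted_hits.bam"
    sam = f"{TOPHAT_OUT}/accepted_hits.sam"  # assumed to be converted from bam
    for cmd in (
        tophat_cmd(BOWTIE_INDEX, READS, GTF, TOPHAT_OUT),
        htseq_count_cmd(sam, GTF),
        cufflinks_cmd(bam, GTF, GENOME_FASTA, CUFFLINKS_OUT),
    ):
        # Print rather than execute, so the commands can be reviewed first.
        print(shlex.join(cmd))
```

The commands are printed instead of run so they can be checked against the installed tool versions before anything is launched.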
Genome_build: BDGP release 6 (obtained from FlyBase, http://www.flybase.org)
Supplementary_files_format_and_content: Tab-delimited text files contain transcript abundance information generated by HTSeq and Cufflinks. HTSeq results contain read counts, and Cufflinks results contain FPKM values at the gene level.
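To make the FPKM unit concrete, the textbook definition can be sketched as below. This is only the basic arithmetic: Cufflinks estimates abundances with a statistical model (and here with `-b` and `-u` corrections), so the values in the GEO files should not be expected to match this simple calculation. The function name and the example numbers are made up for illustration.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of gene per million mapped fragments.

    fragments / (gene_length_bp / 1e3) / (total_mapped_fragments / 1e6)
    simplifies to fragments * 1e9 / (gene_length_bp * total_mapped_fragments).
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


if __name__ == "__main__":
    # Illustrative numbers only: 500 fragments on a 2 kb gene
    # in a library of 20 million mapped fragments.
    print(fpkm(500, 2000, 20_000_000))  # 12.5
```

Since the HTSeq files hold raw counts and the Cufflinks files hold FPKM, the raw counts are the ones that count-based differential expression tools generally expect as input.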
Story
Need to perform a complete differential expression analysis.
Questions and Tasks
Definition of done