I think there are way too many irrelevant files on the Download File page, and
it should be organized a little better or cleaned up. Probably the simplest way
to organize it would be to add new "Advanced Files" headings that appear at the
very bottom of the page. Below are my suggestions for an organization; Charlie
may have modifications later. The last one (008) could be omitted altogether
for normal users.
I also added some descriptions which could be used for tool tips.
Note: I wasn't very careful with my regular expressions, so please double-check them.
---
001. Unaligned fastq files
s_*_sequence.txt : It would be nice if we added the LIMS id to these filenames
as well.
002. Primary BAM Alignment Files
*%GENOME%.fa.mdups.bam[.bai]? : Main BAM file we want people to use
003. Base quality recalibrated BAM files
*realign.mdups.recal.ba[mi] : Recalibrated by Bis-SNP
004. Visualization tracks
*.BinDepths.metric.wig.bw : Read coverage in all genomic bins of "winsize" base
pairs (UCSC bigwig file)
*.BinDepths.metric.wig : Read coverage in all genomic bins of "winsize" base
pairs (UCSC wig file)
*CG.ct_coverage.tdf : Bis-SNP coverage track
*CG.tdf : Bis-SNP methylation track
005. Methylation tracks (Bisulfite-seq)
*CG.6plus2.bed : Bis-SNP methylation calls
*cpg.raw.sort.vcf : Bis-SNP methylation calls plus SNPs
006. Advanced Files, Alternate BAM Alignments
*NC_001416.fa.*bam* : Lambda control alignments (QC only)
007. Advanced Files, QC
nmerCount_* : Overrepresented oligomers
*_adapterTrim.csv
*%GENOME%*.mdups.bam.flagstat.metric.txt : Raw output from samtools flagstat
*.CollectAlignmentSummaryMetrics.metric.txt : Raw output from Picard
*.CollectGcBiasMetrics.metric.txt : Raw output from Picard
*.CollectInsertSizeMetrics.metric.txt : Raw output from Picard
008. Intermediate internal pipeline files (not for general use)
*%GENOME%.fa.bam[.bai]? : BAM file from before the mdups step; shouldn't be
needed
*_Gerald_mononucleotide.csv : By-product of QC metric computation
ContamCheck.* : By-product of QC metric computation
*_qcmetrics.csv : By-product of QC metric computation
*MethLevelAverages.metric.txt : By-product of QC metric computation
*%GENOME%*.bam.flagstat.metric.txt : Based on the pre-mdups BAM; not useful
*.NC_001416*flagstat.metric.txt : Lambda flagstat, by-product of QC metric
computation
*InvertedReadPairDups.metric.txt : By-product of QC metric computation
*ReadLength.metric.txt : By-product of QC metric computation
*.ApplicationStackMetrics.metric.txt : By-product of QC metric computation
*.CPGvsRandomCov.metric.txt : By-product of QC metric computation
*.MeanQualityByCycle.metric.txt : By-product of QC metric computation
*.QualityScoreDistribution.metric.txt : By-product of QC metric computation
*DownsampleDups* : By-product of QC metric computation
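---
Since I asked for the patterns to be double-checked, here is a rough Python sketch of one way to do that: translate each pattern into a real regular expression and list every category a filename falls into, so overlaps show up. This is only an illustration under assumptions. The regexes are my own hand translation of a few of the patterns above, "hg19" is a hypothetical stand-in for %GENOME%, and the sample filenames are made up.

```python
import re

# Hypothetical stand-in for the %GENOME% placeholder used in the patterns above.
GENOME = re.escape("hg19")

# (category, regex) pairs: a hand translation of a subset of the proposed
# patterns. These are assumptions, not the patterns the site actually uses.
RULES = [
    ("001 Unaligned fastq files", r"s_.*_sequence\.txt$"),
    ("002 Primary BAM Alignment Files", rf".*{GENOME}\.fa\.mdups\.bam(\.bai)?$"),
    ("003 Base quality recalibrated BAM files", r".*realign\.mdups\.recal\.ba[mi]$"),
    ("006 Advanced Files, Alternate BAM Alignments", r".*NC_001416\.fa\..*bam.*"),
    ("007 Advanced Files, QC", rf".*{GENOME}.*\.mdups\.bam\.flagstat\.metric\.txt$"),
    ("008 Intermediate internal pipeline files", rf".*{GENOME}\.fa\.bam(\.bai)?$"),
    ("008 Intermediate internal pipeline files", rf".*{GENOME}.*\.bam\.flagstat\.metric\.txt$"),
    ("008 Intermediate internal pipeline files", r".*\.NC_001416.*flagstat\.metric\.txt$"),
]


def matching_categories(filename):
    """Return every category whose pattern matches, in rule order, no duplicates."""
    hits = []
    for category, pattern in RULES:
        if re.match(pattern, filename) and category not in hits:
            hits.append(category)
    return hits


if __name__ == "__main__":
    # Made-up filenames, only for illustration.
    for name in [
        "s_1_sequence.txt",
        "sample.hg19.fa.mdups.bam.bai",
        "sample.hg19.fa.mdups.bam.flagstat.metric.txt",
        "sample.NC_001416.fa.mdups.bam.flagstat.metric.txt",
    ]:
        print(name, "->", matching_categories(name))
```

With these translations, the mdups flagstat file lands in both 007 and 008, and the Lambda flagstat file lands in both 006 and 008, so whoever implements the page would need either tighter patterns or a first-match-wins rule with the specific patterns listed first.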
Original issue reported on code.google.com by benb...@gmail.com on 17 Jan 2013 at 6:00