Closed by rcurrie 6 years ago
Cool. Has it worked?
On Tue, Jan 9, 2018 at 1:33 PM, Rob Currie notifications@github.com wrote:
@hbeale I've added basic bam to fastq conversion:
docker run --rm \
    -v /mnt/samples:/samples \
    quay.io/ucsc_cgl/samtools:1.5--98b58ba05641ee98fa98414ed28b53ac3048bc09 \
    fastq -1 /samples/{0}.R1.fq.gz -2 /samples/{0}.R2.fq.gz /samples/{1}
(Same method as used in cgl-rnaseq)
Treeshop will make the conversion, copy the resulting fastqs back to derived under archive for posterity, and then proceed with rnaseq etc.
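The {0} and {1} placeholders in the command above look like Python str.format fields. Here is a minimal sketch of how the same command might be assembled as an argument list, assuming {0} is a sample id and {1} is the BAM file name under /mnt/samples. The function name and both parameter names are hypothetical, and Treeshop's actual code may differ:

```python
# Image tag copied verbatim from the command in the thread.
IMAGE = "quay.io/ucsc_cgl/samtools:1.5--98b58ba05641ee98fa98414ed28b53ac3048bc09"


def bam_to_fastq_argv(sample_id, bam_name):
    """Build the docker/samtools argument list for one sample.

    ASSUMPTION: sample_id fills the {0} slots and bam_name fills {1};
    the thread does not say what the placeholders actually hold.
    """
    return [
        "docker", "run", "--rm",
        "-v", "/mnt/samples:/samples",
        IMAGE,
        "fastq",
        "-1", "/samples/{0}.R1.fq.gz".format(sample_id),
        "-2", "/samples/{0}.R2.fq.gz".format(sample_id),
        "/samples/{0}".format(bam_name),
    ]


if __name__ == "__main__":
    # Print the command for a hypothetical sample; pass the list to
    # subprocess.run(argv, check=True) to actually execute it.
    print(" ".join(bam_to_fastq_argv("TEST", "TEST.bam")))
```

Building a list rather than one long string avoids shell quoting problems if a sample id ever contains spaces.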
Hmmm...output isn't matching, but maybe it's my fastq -> bam:
docker run -it --rm -v $(pwd)/samples:/data broadinstitute/picard FastqToSam \
    F1=/data/TEST_R1.fastq.gz F2=/data/TEST_R2.fastq.gz \
    O=/data/TEST.bam SM=TEST001 RG=rg0000
Converting this bam back to fastq via samtools, then running it through rnaseq and then umend, gives a readDist.txt that differs.
@hbeale is it reasonable to expect these two to be identical?
fastqs -> rnaseq sorted.bam output -> umend
fastqs -> picard bam -> samtools to fastq -> rnaseq sorted.bam -> umend
< Total Reads 3416
< Total Tags 4133
< Total Assigned Tags 3922
---
> Total Reads 1626
> Total Tags 2050
> Total Assigned Tags 1978
6,15c6,15
< CDS_Exons 37671772 2792 0.07
< 5'UTR_Exons 18392664 219 0.01
< 3'UTR_Exons 46333687 734 0.02
< Introns 1419121300 155 0.00
< TSS_up_1kb 26926674 2 0.00
< TSS_up_5kb 121398195 9 0.00
< TSS_up_10kb 221886368 18 0.00
< TES_down_1kb 28738628 0 0.00
< TES_down_5kb 125348902 2 0.00
< TES_down_10kb 224262488 4 0.00
---
> CDS_Exons 37671772 1230 0.03
> 5'UTR_Exons 18392664 58 0.00
> 3'UTR_Exons 46333687 486 0.01
> Introns 1419121300 167 0.00
> TSS_up_1kb 26926674 3 0.00
> TSS_up_5kb 121398195 3 0.00
> TSS_up_10kb 221886368 3 0.00
> TES_down_1kb 28738628 8 0.00
> TES_down_5kb 125348902 30 0.00
> TES_down_10kb 224262488 34 0.00
converted bam in the develop branch:
https://github.com/UCSC-Treehouse/pipelines/tree/develop/samples
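The totals in the diff above differ by roughly a factor of two, so it may be worth putting the counters side by side. This is a small sketch that pulls the "Total ..." counters out of two readDist.txt outputs and computes their ratios. It assumes the lines look like the ones in the diff, and the function names are made up for illustration:

```python
import re


def parse_totals(text):
    """Extract 'Total ...' counters from read-distribution text.

    ASSUMPTION: each counter sits on its own line, e.g. 'Total Reads 3416',
    optionally prefixed by a diff marker ('<' or '>').
    """
    totals = {}
    for line in text.splitlines():
        line = line.lstrip("<> ").strip()  # drop diff markers if present
        match = re.match(r"^(Total [A-Za-z ]+?)\s+(\d+)$", line)
        if match:
            totals[match.group(1)] = int(match.group(2))
    return totals


def ratios(before, after):
    """Return after/before for every counter present in both dicts."""
    return {k: after[k] / before[k] for k in before if k in after and before[k]}


if __name__ == "__main__":
    # Numbers copied from the diff in this thread.
    original = "< Total Reads 3416\n< Total Tags 4133\n< Total Assigned Tags 3922"
    roundtrip = "> Total Reads 1626\n> Total Tags 2050\n> Total Assigned Tags 1978"
    for name, value in ratios(parse_totals(original), parse_totals(roundtrip)).items():
        print(name, round(value, 2))
```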
When you say "samtools to fastq -> umend", does "fastq -> umend" represent the rna-seq pipeline (using STAR) followed by the bam-umend-qc process?
Yes, I'd expect them to be identical, but I don't know where to go if they're not. I'd approach it by comparing the outputs of these two approaches:
bam -> btfv9 -> groomed fq -> umend
bam -> samtools to fastq -> un-groomed fq -> umend
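One way to compare the groomed and un-groomed fastqs before running either through umend is to check that both contain the same set of read names, regardless of order. This is a rough sketch that assumes strict 4-line FASTQ records and gzip-compressed files. The helper names are hypothetical and nothing here comes from the Treehouse pipelines:

```python
import gzip


def read_names(lines):
    """Collect read names from an iterable of FASTQ lines.

    ASSUMPTION: strict 4-line records; the header is '@name [comment]'
    and mates may carry a trailing /1 or /2 suffix, which is dropped.
    """
    names = set()
    for i, line in enumerate(lines):
        if i % 4 == 0 and line.strip():
            name = line[1:].split()[0]
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            names.add(name)
    return names


def read_names_from_gz(path):
    """Same as read_names, but reads a gzip-compressed FASTQ file."""
    with gzip.open(path, "rt") as handle:
        return read_names(handle)


def compare_names(a, b):
    """Summarise how two read-name sets overlap."""
    return {"only_a": len(a - b), "only_b": len(b - a), "shared": len(a & b)}
```

If "only_a" or "only_b" is non-zero, reads were lost or renamed somewhere in the conversion, which would be worth ruling out before comparing umend output.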