Open Kaddea opened 2 months ago
Weird. We have processed lots of bams through this type of workflow and I've never seen anything like that. Happy to take a look though. Can you provide a tiny example bam with the steps needed to recreate the problem?
Thanks for your help!!
I've cropped one of the bam files and the corresponding vcf file (both from RNAseq reads) to reproduce the readcount output files.
The strange characters in the output files appear only from column 11 on, and it seems only at sites with varying deletions (2-5 bases).
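To make that observation easy to check on other files, here is a minimal sketch that reports which tab-separated columns contain control characters. The function name `find_control_chars` is made up for illustration and is not part of bam-readcount or the CWL helper. It assumes the readcount file is tab-delimited.

```python
import re

# Any ASCII control character except tab and newline, which are legitimate
# separators in a tab-delimited file. Line endings are stripped before matching.
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")


def find_control_chars(lines):
    """Return (line_number, column_number) pairs, both 1-based, for every
    tab-separated field that contains a control character."""
    hits = []
    for line_no, line in enumerate(lines, start=1):
        fields = line.rstrip("\r\n").split("\t")
        for col_no, field in enumerate(fields, start=1):
            if CONTROL_CHARS.search(field):
                hits.append((line_no, col_no))
    return hits
```

It can be fed an open file handle, e.g. `find_control_chars(open(path, encoding="latin-1"))`, where `path` is a placeholder for one of the snv/indel tsv files.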
The files (bam, vep-annotated vcf and the snv/indel tsv) can be downloaded from
https://kaddea.com/s/J76BAJsg4d5zytN (approx. 45 MB)
Sequence alignment and variant analysis based on Ensembl GRCh38, release 110.
Best, Mathias
Thank you. Can you also provide the exact commands that were used, along with software versions, etc.? We're just trying to reproduce it on our end.
read_count_pipeline.txt
Hmm, the attached file lists the steps for alignment, variant calling, annotation, and preparation for the read counts. I've omitted the mandatory parameters (input/output, etc.). Hope it helps. By the way, truncating the read-count output files to the first 10 columns lets the vcf annotation proceed, but I'm not sure the resulting files are valid. Mathias
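For reference, the truncation workaround can be sketched as follows. This is only an illustration of the idea, with a made-up helper name `truncate_fields`, and it assumes a tab-delimited file. In the sample line posted in this thread, column 11 is the deletion entry itself, so cutting at column 10 would also drop the read count for that deletion.

```python
def truncate_fields(line, keep=10):
    """Keep only the first `keep` tab-separated fields of one readcount line.

    In the sample from this thread, fields 1-4 are chromosome, position,
    reference base and depth, and fields 5-10 are the per-base entries
    (=, A, C, G, T, N). Anything from field 11 on is discarded.
    """
    fields = line.rstrip("\r\n").split("\t")
    return "\t".join(fields[:keep]) + "\n"
```

Lines with 10 or fewer fields pass through unchanged apart from line-ending normalization.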
Hi,
I'm using the bam_readcount wrapper "mgibio/bam_readcount_helper-cwl". The output files (snv or indel) contain control characters that the vcf_readcount_annotator cannot process.
Which substitution for the control characters is suitable for further processing?
Variation (vcf):
20 405939 . TTTC T . weak_evidence AS_FilterStatus=weak_evidence;AS_SB_TABLE=0,0|0,0;DP=1;ECNT=1;GERMQ=23;MBQ=0,32;MFRL=0,204;MMQ=60,60;MPOS=43;POPAF=7.3;TLOD=4.21;CSQ=-|upstream_gene_variant|MODIFIER|RBCK1|ENSG00000125826|Transcript|ENST00000356286.10|protein_coding|||||||||||2357|1||HGNC|HGNC:15864|1||| GT:AD:AF:DP:F1R2:F2R1:FAD:SB 0/1:0,1:0.667:1:0,1:0,0:0,1:0,0,1,0
bam_readcount output (indel):
20 405940 N 1 =:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 A:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 C:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 G:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 T:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 N:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 -^@^@^@:1:255.00:0.00:0.00:1:0:0.88:0.03:0.00:1:0.42:101.00:0.42
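A note on the last field: `^@` is the caret notation for a NUL byte (0x00). The three NUL bytes line up with the three bases removed by the deletion in the vcf record (TTTC to T), so one possible substitution is to fill in the deleted bases from the vcf. The sketch below does that under the assumption that the field was meant to read `-TTC`. Neither that assumption nor the behaviour of vcf_readcount_annotator on the repaired file has been confirmed here, and both helper names are made up for illustration.

```python
def deleted_bases(ref, alt):
    """Bases removed by a simple VCF deletion, e.g. TTTC -> T gives 'TTC'.

    Only handles the left-anchored form where ALT is a prefix of REF.
    """
    if not (len(ref) > len(alt) and ref.startswith(alt)):
        raise ValueError(f"not a simple deletion: {ref!r} -> {alt!r}")
    return ref[len(alt):]


def repair_deletion_key(field, bases):
    """Replace a '-' followed only by NUL bytes with '-' plus `bases`.

    The field is returned unchanged unless the key before the first ':'
    is '-' plus a run of NULs whose length equals len(bases).
    """
    key, sep, rest = field.partition(":")
    if not key.startswith("-") or set(key[1:]) != {"\x00"}:
        return field
    if len(key) - 1 != len(bases):
        return field
    return "-" + bases + sep + rest
```

The length check is deliberate: if the number of NUL bytes does not match the deletion taken from the vcf, the field is left alone rather than guessed at.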