Open ckuenne opened 3 years ago
Dear ckuenne,
Thank you for your feedback and detailed problem description!
NullPointerException

The SAM output that you provided suggests that STAR does not adhere to the recommended practices for the SAM format (https://samtools.github.io/hts-specs/SAMtags.pdf): the "NM" tag that JACUSA2 expects is not present in the SAM file. According to the STAR manual (https://github.com/alexdobin/STAR/blob/master/doc/STARmanual.pdf):

[...] The SAM attributes can be specified by the user using --outSAMattributes A1 A2 A3 ... option which accept a list of 2-character SAM attributes. The implemented attributes are: NH HI NM MD AS nM jM jI XS. By default, STAR outputs NH HI AS nM attributes. [...]

I am afraid you will need to rerun STAR with "--outSAMattributes NM NH [...and whatever you need...]".
STAR output has the nonstandard field "nM". From SAMtags.pdf:

[...] Note that tags starting with ‘X’, ‘Y’, or ‘Z’ and tags containing lowercase letters in either position are reserved for local use and will not be formally defined in any future version of this [...]
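To make the rule quoted from SAMtags.pdf concrete, here is a minimal Python sketch that flags which of STAR's attributes fall into the "local use" category. The function name `is_local_use_tag` is made up for illustration and is not part of STAR, samtools, or JACUSA2.

```python
def is_local_use_tag(tag: str) -> bool:
    """Return True if a two-character SAM tag is reserved for local use.

    Rule from SAMtags.pdf: tags starting with X, Y, or Z, and tags that
    contain a lowercase letter in either position, are reserved for local use.
    """
    if len(tag) != 2:
        raise ValueError(f"SAM tags have exactly two characters, got {tag!r}")
    return tag[0] in "XYZ" or any(ch.islower() for ch in tag)


# Attributes listed as implemented in the STAR manual excerpt above.
star_attributes = ["NH", "HI", "NM", "MD", "AS", "nM", "jM", "jI", "XS"]
local_use = [tag for tag in star_attributes if is_local_use_tag(tag)]
print(local_use)
```

Note that "NM" and "nM" differ only in case, so a tool that looks up "NM" will not find STAR's default "nM".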
JACUSA_to_TRIBE.pl

The person responsible for this Perl script is on vacation. As soon as he is back, I'll ask him to provide feedback. There are some workflows at https://dieterich-lab.github.io/JACUSA2helper/ (Articles). In general, we aim to add "useful" methods to JACUSA2helper.
Dear piechottam,
Thanks for the quick response! I have rerun STAR with "--outSAMattributes NH HI AS nM NM MD" and that seems to have solved the issue. JACUSA2 is currently running with -B A2G. I might be back with more feedback once that is done.
And maybe it would make sense to implement a small check for the BAM format in JACUSA. STAR is a pretty standard tool to use after all. And the default exception thrown is not really helpful to diagnose this.
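Something along these lines is what I have in mind, as a rough Python sketch rather than actual JACUSA code. It scans plain-text SAM records (e.g. from `samtools view -h`) and reports the first few mapped reads that lack the expected tags. Treating "NM" and "MD" as the required set is an assumption based on the rerun above; the function name and `max_reports` parameter are made up for illustration.

```python
from typing import Iterable, List, Sequence, Tuple

# Assumption: these are the tags the downstream tool needs.
REQUIRED_TAGS = ("NM", "MD")


def find_reads_missing_tags(
    sam_lines: Iterable[str],
    required: Sequence[str] = REQUIRED_TAGS,
    max_reports: int = 5,
) -> List[Tuple[str, List[str]]]:
    """Return (read name, missing tags) for the first few offending reads."""
    problems: List[Tuple[str, List[str]]] = []
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and empty lines
        fields = line.rstrip("\n").split("\t")
        if int(fields[1]) & 0x4:
            continue  # unmapped reads are not expected to carry these tags
        # Optional fields start at column 12 and look like TAG:TYPE:VALUE.
        present = {field.split(":", 1)[0] for field in fields[11:]}
        missing = [tag for tag in required if tag not in present]
        if missing:
            problems.append((fields[0], missing))
            if len(problems) >= max_reports:
                break
    return problems


# Two made-up records: STAR's default attributes vs. the rerun attributes.
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t99\tchr1\t100\t255\t10M\t=\t200\t110\tACGTACGTAC\tFFFFFFFFFF"
    "\tNH:i:1\tHI:i:1\tAS:i:18\tnM:i:0",
    "read2\t99\tchr1\t300\t255\t10M\t=\t400\t110\tACGTACGTAC\tFFFFFFFFFF"
    "\tNH:i:1\tHI:i:1\tAS:i:18\tnM:i:0\tNM:i:0\tMD:Z:10",
]
problems = find_reads_missing_tags(example)
for name, missing in problems:
    print(f"{name}: missing SAM tag(s) {', '.join(missing)}")
```

A message like that, raised before processing starts, would point straight at the STAR attributes instead of ending in a NullPointerException.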
Another question about JACUSA_to_TRIBE.pl: does it only work for stranded_forward libraries, or also for stranded_reverse ones? If I understood correctly, JACUSA already takes care of that via -P FR-SECONDSTRAND / RF-FIRSTSTRAND, i.e. the JACUSA output is relative to the original RNA/DNA, not to the sequencing read. So the Perl script does not have to deal with strandedness again?
Is the JACUSA_to_TRIBE.pl script found here (https://github.com/dieterich-lab/tribe-workflow/tree/master/RRD_workflow) still working for the JACUSA2 output format? Or is that functionality supposed to be superseded by JACUSA2helper? If so, can you suggest a workflow? Since TRIBE was one of the original selling points for JACUSA, an automation or tutorial for this specific pipeline might be of wider interest.
Or should we rather use JACUSA2 directly with -B A2G? I do have stranded paired-end reads, so that should theoretically work, right? But there I run into an error (it works without -B):
It was mapped with STAR 2.7.9a (including the MD field) and the offending read pair is this one:
I tried different Java versions: oracle_jre_1.8.0_211, open_jdk_15.0.2, open_jdk_16.0.1.