Closed dchakro closed 3 years ago
hi, can you please also send a snapshot of the sam in that region please.
Here is the sam corresponding to this read. As these are pacbio reads spanning the entire cDNA, on using samtools view <region>
I still get the entire file.
I can give more reads as "control", if required.
what version are you using ? (java -jar ${JVARKIT_DIST}/sam2tsv.jar --version
) I don't get the same result as you.
m64145_210904_052513/66383/ccs 0 60 ERBB2 574 T ~ 458 T EQ
m64145_210904_052513/66383/ccs 0 60 ERBB2 575 C ~ 459 C EQ
m64145_210904_052513/66383/ccs 0 60 ERBB2 576 T ~ 460 T EQ
m64145_210904_052513/66383/ccs 0 60 ERBB2 577 T ~ 461 T EQ
m64145_210904_052513/66383/ccs 0 60 ERBB2 578 G ~ 462 G EQ
m64145_210904_052513/66383/ccs 0 60 ERBB2 579 A ~ 463 A EQ
m64145_210904_052513/66383/ccs 0 60 ERBB2 580 T ~ 464 T EQ
m64145_210904_052513/66383/ccs 0 60 ERBB2 581 C ~ 465 C EQ
m64145_210904_052513/66383/ccs 0 60 ERBB2 582 C ~ 466 C EQ
m64145_210904_052513/66383/ccs 0 60 ERBB2 583 A ~ 467 A EQ
m64145_210904_052513/66383/ccs 0 60 ERBB2 584 G ~ 468 G EQ
That is fascinating ! Those are the exact numbers it should output!
For me the version says It says b4af26853
which seems to be the hash for the latest commit, should I be using the 2021.07.01 release instead ?
What version are you using ? It seems that version does it right 👍🏼
JRE: openjdk version 1.8.0_292 on Mac OS Catalina 10.15.7
JRE: openjdk version 1.8.0_292 on Mac OS Catalina 10.15.7 Looks like I had an ancient version of java. As stated on the manual page for sam2tsv, I installed openjdk version "11.0.12" 2021-07-20
However, the results are still incorrect.
I'm working with b4af26853 here. Can you let me know the version you are using @lindenb ?
I'm using 1b3e21985b3afdd9d0a28545c29b3e5aacf53d7b (dev branch) but I think there is no major difference about the two versions for this tool.
I'm using 1b3e219 (dev branch) but I think there is no major difference about the two versions for this tool.
Thank you Pierre! Out of curiosity. what version of java are you running ?
java 1.8 but it's unrelated to the problem.
I pushed a new release, can you please try with that one : https://github.com/lindenb/jvarkit/releases/tag/2021.10.13
Thank You!.. Trying it right away
Pierre, it is working now (and works for my other reference sequences as well)! Steps I took to make it work.
picard
) and samtools faidx
the reference.fa/dist
) and re-build with ./gradlew sam2tsv
Curiously I tried with the "old" reference-dictionary-index triad... And I get the same incorrect results. Re-created the dictionary with picard
and now it works exactly as expected!
Concluded root cause:
The dictionary built (with picard
) using the older version of java was the culprit.
We can close this issue now. I will close the entry on biostars.
Thank you for this nifty piece of software !!
Hi,
Duplicate of https://www.biostars.org/p/9493154/
Can anyone help me resolve an issue I'm having with sam2tsv. It is a nifty piece of software but I have been encountering issues with it regarding the numbering of nucleotides it shows for the reference sequence.
Here's what sam2tsv tells me:
The nucleotide string marked in red
CTGGCCGAGCTAG
is the read reporting the mutation T>A (line #469). But the reference sequence listed by sam2tsv (green box) doesn't match the read sequence at all. In fact it is correct sequence from the reference fasta, but it is right-shifted by16
bases. With other reference files, this number is different, e.g. 20.In fact, if I search the sequence
ATGGAGACCCGCT
in my reference sequence it spans356-368
. In contrast sam2tsv lists these residues in the range339-352
(as seen in the green box). Off by exactly 16 bp.To summarize: 1) The position listed in the green box (reference), corresponds to the sequence in the red box (read). ✅ 2) The reference sequence listed in the green box, is off by 16 bases. ❌
Information: PacBio HiFi CCS reads aligned using
pbmm2
to custom reference (cDNA).Files:
Reference.fa.
relevant sam2tsv output
samfile with reads
Command used: