Closed mattloose closed 1 year ago
Never seen this before, but looks like the error might be inside PEPPER/Margin packages. Do you mind sending hapdup.log
and full terminal output if you happen to have it? Is there anything special about this failing sample, compared to others (unusual coverage, read length, heterozygosity etc)?
I don't believe there to be anything particularly unusual about the sample, but it does have some ultra long read data in it.
The log file is below.
I don't have the full terminal output but here is where it was at before the error:
Phasing 87% complete (8087/9279, 7m 45s). Estimated time remaining: 1m 9s > Phasing 88% complete (8180/9279, 7m 49s). Estimated time remaining: 1m 3s > Phasing 89% complete (8274/9279, 7m 53s). Estimated time remaining: 58s > Phasing 90% complete (8363/9279, 7m 57s). Estimated time remaining: 53s > Phasing 91% complete (8494/9279, 8m 3s). Estimated time remaining: 47s > Phasing 92% complete (8550/9279, 8m 6s). Estimated time remaining: 42s > Phasing 93% complete (8666/9279, 8m 11s). Estimated time remaining: 36s > Phasing 94% complete (8775/9279, 8m 15s). Estimated time remaining: 31s > Phasing 95% complete (8864/9279, 8m 18s). Estimated time remaining: 26s > Phasing 96% complete (8908/9279, 8m 20s). Estimated time remaining: 20s > Phasing 97% complete (9009/9279, 8m 23s). Estimated time remaining: 15s > Phasing 98% complete (9136/9279, 8m 26s). Estimated time remaining: 10s > Phasing 99% complete (9190/9279, 8m 26s). Estimated time remaining: 5s > Starting merge
Expected three tokens in header line, got 2
This usually means you have multiple primary alignments with the same read ID. You can identify whether this is the case with this command:
Thanks. According to this, it might have something to do with the duplicated read ids: https://github.com/kishwarshafin/pepper/issues/67
Could you try command line: samtools view -F 0x900 -q 10 your.bam | cut -f1 | uniq -c | awk '$1 > 1' | head
? If the output is non-empty, there are duplicates. If so, they need to be removed, I don't think there is another workaround in Margin.
Apologies - I've realised this was an error on our side where duplicated reads had got into the assembly pipeline.
Thanks for your help.
I've got one sample in a series which is failing in hapdup with the following error.
Any suggestions?
Thanks
`
Expected three tokens in header line, got 2 This usually means you have multiple primary alignments with the same read ID. You can identify whether this is the case with this command:
[2022-12-13 17:15:57] ERROR: Missing output: hapdup/margin/MARGIN_PHASED.haplotagged.bam Traceback (most recent call last): File "/data/test_data/GIT/hapdup/hapdup.py", line 24, in
sys.exit(main())
File "/data/test_data/GIT/hapdup/hapdup/main.py", line 206, in main
file_check(haplotagged_bam)
File "/data/test_data/GIT/hapdup/hapdup/main.py", line 114, in file_check
raise Exception("Missing output")
Exception: Missing output
`