Closed jamesc99 closed 2 weeks ago
Hi Thanks for reporting this issue.
I should first note that we strongly recommend pbmm2 for all sawfish analysis, as discussed in a bit more detail here:
https://github.com/PacificBiosciences/sawfish/blob/main/docs/user_guide.md#read-mapper
It may be possible to use minimap2 with the appropriate flags to prevent hard-clipping but this is untested/unsupported.
The good news for this specific issue with the rq tag is we've just released an update (0.12.2) yesterday to relax this requirement, so a version update should take care of this specific error.
I am also experiencing the same error while using this bam file: https://s3-us-west-2.amazonaws.com/human-pangenomics/working/HPRC_PLUS/HG002/analysis/aligned_reads/hifi/GRCh38/HG002_aligned_GRCh38_winnowmap.sorted.bam.
I just runned version 0.12.2, but get new errors:
thread '' panicked at src/refine_sv/trimmed_reads.rs:197:5: assertion failed: !is_hard_clipped(&record.cigar())
Below is the full running log:
[2024-08-25][09:31:28][sawfish][INFO] Starting sawfish
[2024-08-25][09:31:28][sawfish][INFO] cmdline: /home/xiz978/bin/BCH/sawfish-v0.12.2-x86_64-unknown-linux-gnu/bin/sawfish discover --threads 16 --ref /n/data1/bch/genetics/lee/reference/hg38/Homo_sapiens_assembly38.fasta --bam /n/storage_test/xiz978/Wangziying/HG002/hifi/HG002_aligned_GRCh38_winnowmap.sorted.bam --output-dir HG002_discover_dir
[2024-08-25][09:31:28][sawfish][INFO] Running on 16 threads
[2024-08-25][09:31:29][sawfish][INFO] Writing discover settings to file: 'HG002_discover_dir/discover.settings.json'
[2024-08-25][09:31:29][sawfish][INFO] Reading reference genome from file '/n/data1/bch/genetics/lee/reference/hg38/Homo_sapiens_assembly38.fasta'
[2024-08-25][09:31:58][sawfish][INFO] Processing alignment file '/n/storage_test/xiz978/Wangziying/HG002/hifi/HG002_aligned_GRCh38_winnowmap.sorted.bam'
[2024-08-25][09:36:58][sawfish][INFO] Processed alignments on 1,181,630 of 3,099,922 ref genome kb (38%)
[2024-08-25][09:42:05][sawfish][INFO] Processed alignments on 2,595,202 of 3,099,922 ref genome kb (83%)
[2024-08-25][09:43:24][sawfish][INFO] Finished processing all alignments
[2024-08-25][09:43:24][sawfish][INFO] Writing max sv depth genome regions to bed file: 'HG002_discover_dir/max.depth.bed'
[2024-08-25][09:43:24][sawfish][INFO] Getting depth bin GC content
[2024-08-25][09:43:27][sawfish][INFO] Getting GC depth correction levels
[2024-08-25][09:43:27][sawfish][INFO] Completing breakpoint observation clustering
[2024-08-25][09:43:28][sawfish][INFO] Finished breakpoint observation clustering
[2024-08-25][09:43:28][sawfish][INFO] Consolidating adjacent indel candidates
[2024-08-25][09:43:28][sawfish][INFO] Finding large insertion candidates
[2024-08-25][09:43:28][sawfish][INFO] Writing breakpoint_cluster debug bed file: 'HG002_discover_dir/debug.breakpoint_clusters.bed'
[2024-08-25][09:43:28][sawfish][INFO] Refining SV Candidates
[2024-08-25][09:43:28][sawfish][INFO] Writing debug assembly region bed file: 'HG002_discover_dir/assembly.regions.bed'
[2024-08-25][09:43:28][sawfish][INFO] Starting refinement of breakpoint evidence clusters
thread '<unnamed>' panicked at src/refine_sv/trimmed_reads.rs:197:5:
assertion failed: !is_hard_clipped(&record.cigar())
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
/var/spool/slurmd/job45032830/slurm_script: line 9: 17052 Aborted /home/xiz978/bin/BCH/sawfish-v0.12.2-x86_64-unknown-linux-gnu/bin/sawfish discover --threads 16 --ref /n/data1/bch/genetics/lee/reference/hg38/Homo_sapiens_assembly38.fasta --bam /n/storage_test/xiz978/Wangziying/HG002/hifi/HG002_aligned_GRCh38_winnowmap.sorted.bam --output-dir HG002_discover_dir
[2024-08-25][09:43:41][sawfish][INFO] Starting sawfish
[2024-08-25][09:43:41][sawfish][INFO] cmdline: /home/xiz978/bin/BCH/sawfish-v0.12.2-x86_64-unknown-linux-gnu/bin/sawfish joint-call --threads 16 --sample HG002_discover_dir --output-dir HG002_joint_call_dir
[2024-08-25][09:43:41][sawfish][INFO] Running on 16 threads
[2024-08-25][09:43:41][sawfish][INFO] Reading reference genome from file '/n/data1/bch/genetics/lee/reference/hg38/Homo_sapiens_assembly38.fasta'
[2024-08-25][09:44:10][sawfish][INFO] Reading all sample discovery input
thread '<unnamed>' panicked at /var/lib/bamboo/.cargo/registry/src/index.crates.io-6f17d22bba15001f/unwrap-1.2.1/src/lib.rs:55:25:
Thanks @zengxi-hada,
I will take a note to make a more clear error message. The error indicates that your input bam uses hard-clipping for split reads, which sawfish doesn't support at this time. As enumerated above, we support and test the SV caller for pbmm2 only. It may be possible to get the analysis working with minimap, but at minimum you would need to change the settings to prevent hard-clipping of split reads, and even for that case minimap is not tested so there may be additional complications, I would recommend using pbmm2, some discussion on this in the user guide can be found here:
https://github.com/PacificBiosciences/sawfish/blob/main/docs/user_guide.md#read-mapper
Closing the original issues as answered, tracking msg improvement in #2.
Hi there,
Thanks for developing this tool!
I was trying to run minimap2-aligned BAM files with Sawfish and encountered an error saying 'no rq tag'. I checked PacBio's doc and found it is the tag describing the read expected accuracy.
I am wondering if I can run a minimap2-aligned BAM file on Sawfish, like adding specific parameter, etc.
Thank you! Ryan
Error log file:
codes for minimap2 mapping:
$minimap -Y -ax map-hifi -R "@RG\tSM:${INFOR}\tID:${RG_ID}" $ref38 $FASTQ_FILE --MD -t 8