Closed: zengxi-hada closed this issue 1 week ago.
Thanks, I'll refer to the answer for this error on the previous issue:
https://github.com/PacificBiosciences/sawfish/issues/1#issuecomment-2310503108
...I will leave this issue open to track an improved error message update for you.
Hi,
After upgrading to Sawfish v0.12.2, the previous error from #1 is fixed. However, I now get another error during the 'discover' step.
[2024-08-25][11:24:45][sawfish][INFO] Starting sawfish
[2024-08-25][11:24:45][sawfish][INFO] cmdline: sawfish discover --bam /users/u250191/ryan_scratch_ln/benchmark_inv/rawdata/hg002/pacbio/minimap2/sorted_PacBio_HG002_m54238_180916_191625.Q20.fastq.bam --ref /users/u250191/ryan_scratch_ln/reference/human-grch38.fasta
[2024-08-25][11:24:45][sawfish][INFO] Running on 8 threads
[2024-08-25][11:24:45][sawfish][INFO] Writing discover settings to file: 'sawfish_discover_output/discover.settings.json'
[2024-08-25][11:24:45][sawfish][INFO] Reading reference genome from file '/users/u250191/ryan_scratch_ln/reference/human-grch38.fasta'
[2024-08-25][11:25:21][sawfish][INFO] Processing alignment file '/users/u250191/ryan_scratch_ln/benchmark_inv/rawdata/hg002/pacbio/minimap2/sorted_PacBio_HG002_m54238_180916_191625.Q20.fastq.bam'
[2024-08-25][11:30:21][sawfish][INFO] Processed alignments on 3,037,641 of 3,088,286 ref genome kb (98%)
[2024-08-25][11:30:25][sawfish][INFO] Finished processing all alignments
[2024-08-25][11:30:25][sawfish][INFO] Writing max sv depth genome regions to bed file: 'sawfish_discover_output/max.depth.bed'
[2024-08-25][11:30:25][sawfish][INFO] Getting depth bin GC content
[2024-08-25][11:30:29][sawfish][INFO] Getting GC depth correction levels
[2024-08-25][11:30:29][sawfish][INFO] Completing breakpoint observation clustering
[2024-08-25][11:30:30][sawfish][INFO] Finished breakpoint observation clustering
[2024-08-25][11:30:30][sawfish][INFO] Consolidating adjacent indel candidates
[2024-08-25][11:30:30][sawfish][INFO] Finding large insertion candidates
[2024-08-25][11:30:30][sawfish][INFO] Writing breakpoint_cluster debug bed file: 'sawfish_discover_output/debug.breakpoint_clusters.bed'
[2024-08-25][11:30:31][sawfish][INFO] Refining SV Candidates
[2024-08-25][11:30:31][sawfish][INFO] Writing debug assembly region bed file: 'sawfish_discover_output/assembly.regions.bed'
[2024-08-25][11:30:31][sawfish][INFO] Starting refinement of breakpoint evidence clusters
/var/spool/slurm/d/job540156/slurm_script: line 34: 2827442 Illegal instruction (core dumped) sawfish discover --bam $input_bam --ref $ref38
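A hedged aside on the crash itself (my own speculation, not from the sawfish documentation): "Illegal instruction" usually means the binary was built with CPU instructions (e.g. AVX2) that a particular compute node does not support, which on a heterogeneous cluster could also explain why some jobs succeed while others fail. A quick check to run on a failing node:

```shell
# Report whether this node's CPU advertises AVX2 (Linux /proc/cpuinfo flags).
# If a failing node prints "no AVX2" while a succeeding node prints
# "AVX2 supported", a CPU-feature mismatch is the likely culprit.
if grep -qw avx2 /proc/cpuinfo; then
  echo "AVX2 supported"
else
  echo "no AVX2"
fi
```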
All BAM files were aligned with minimap2 using:
$minimap -Y -ax map-hifi -R "@RG\tSM:${INFOR}\tID:${RG_ID}" $ref38 $FASTQ_FILE --MD -t 8
Some of these BAMs completed successfully, while others failed with the error above.
My script to run sawfish is attached below:
#!/bin/bash
#SBATCH --job-name=sawfish
#SBATCH --output=%x_%A_%a.out
#SBATCH --error=%x_%A_%a.err
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=64gb
#SBATCH --time=72:00:00
#SBATCH --partition=medium
#SBATCH -A proj-fs0002
# Load necessary modules or source conda
source /users/u250191/.bashrc
mamba activate sawfish
# Set reference and work directory
ref38="/users/u250191/ryan_scratch_ln/reference/human-grch38.fasta"
WORK_DIR=$(pwd)
# Set input parameters
input_bam=$1
# Ensure all parameters are provided
if [ -z "$input_bam" ] ; then
echo "Usage: sbatch script.sh <input_bam>"
exit 1
fi
# Step 1: sawfish discover
sawfish discover --bam "$input_bam" --ref "$ref38"
# Step 2: sawfish joint-call
sawfish joint-call --sample sawfish_discover_output
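As a side note, a pre-flight check before the discover step might make failures easier to triage (a sketch; `check_bam` is my own helper name, not part of sawfish, and the `.bai` naming follows samtools conventions):

```shell
#!/bin/bash
# check_bam: hypothetical helper, not part of sawfish. Confirms the BAM
# exists and warns when no .bai index is found next to it.
check_bam() {
  local bam="$1"
  [ -f "$bam" ] || { echo "ERROR: BAM not found: $bam" >&2; return 1; }
  if [ ! -f "${bam}.bai" ] && [ ! -f "${bam%.bam}.bai" ]; then
    echo "WARNING: no .bai index found for $bam" >&2
  fi
}

# Demo on a temporary file standing in for a real BAM:
demo="$(mktemp).bam" && touch "$demo"
check_bam "$demo" 2>&1   # warns about the missing index, returns 0
rm -f "$demo"
```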
I initially suspected a memory issue, so I increased the allocation to 144 GB for a 30x-coverage BAM file, but sawfish failed again. You mentioned that memory per core should be less than 8 GB; if that is relevant here, I would appreciate guidance on how to configure it.
Thanks again for your time. Best!
Copied this report into a new issue #3
Related to the original issue on this thread: the v0.12.3 release replaces this panic with a (non-panic) error message explaining that hard-clipped input is not currently accepted.
I am running sawfish-v0.12.2 on this bam file https://s3-us-west-2.amazonaws.com/human-pangenomics/working/HPRC_PLUS/HG002/analysis/aligned_reads/hifi/GRCh38/HG002_aligned_GRCh38_winnowmap.sorted.bam using the command line:
sawfish discover --threads 16 --ref Homo_sapiens_assembly38.fasta --bam HG002_aligned_GRCh38_winnowmap.sorted.bam --output-dir HG002_discover_dir
The following error happens:
thread '<unnamed>' panicked at src/refine_sv/trimmed_reads.rs:197:5: assertion failed: !is_hard_clipped(&record.cigar())
Below is the full running log.
I am wondering whether this issue can be fixed in an upgraded version?
Thank you.
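For anyone hitting the same assertion: the failing condition is the presence of hard-clip ('H') operations in a record's CIGAR string (SAM column 6), which aligners typically emit for supplementary alignments unless soft clipping is requested (minimap2's -Y flag). A quick way to spot such records, shown here on two made-up inline SAM-style lines (with a real BAM, one would pipe `samtools view file.bam` into the same awk):

```shell
# Print names of records whose CIGAR (SAM column 6) contains hard clips
# ('H'), i.e. the records the assertion rejects. r1/r2 are made-up reads.
printf 'r1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*\nr2\t2048\tchr1\t200\t60\t20H30M\t*\t0\t0\t*\t*\n' \
  | awk -F'\t' '$6 ~ /H/ { print $1 }'
# prints: r2
```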