t-neumann / slamdunk

Streamlining SLAM-seq analysis with ultra-high sensitivity
GNU Affero General Public License v3.0

RuntimeError("Error while executing command: \"" + cmd + "\"") #135

Closed aishwarya-gondane closed 4 months ago

aishwarya-gondane commented 11 months ago

Hi,

I am trying to run the slamdunk all command on a 64 GB Ubuntu machine. I installed SLAM-DUNK via conda and have used the same installation for multiple datasets over the past couple of years. I am now getting a runtime error that I have not encountered before with the following SLAM-DUNK command:

slamdunk all -r /media/volume/GRCh38.primary_assembly.genome.fa -b /media/volume/GRCh38_UCSC_3UTR.bed -o 22RV1_SLAM_0h_2 22RV1_SLAM_0h_2_trimmed.fastq.gz -t 4

error:

Running slamDunk map for 1 files (4 threads)
Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/myslamdunk/bin/slamdunk", line 10, in <module>
    sys.exit(run())
  File "/home/ubuntu/miniconda3/envs/myslamdunk/lib/python3.9/site-packages/slamdunk/slamdunk.py", line 520, in run
    runAll(args)
  File "/home/ubuntu/miniconda3/envs/myslamdunk/lib/python3.9/site-packages/slamdunk/slamdunk.py", line 245, in runAll
    runMap(tid, bam, referenceFile, n, args.trim5, args.maxPolyA, args.quantseq, args.endtoend, args.topn, sampleInfo, dunkPath, args.skipSAM)
  File "/home/ubuntu/miniconda3/envs/myslamdunk/lib/python3.9/site-packages/slamdunk/slamdunk.py", line 149, in runMap
    mapper.Map(inputBAM, referenceFile, outputSAM, getLogFile(outputLOG), quantseqMapping, endtoendMapping, threads=threads, trim5p=trim5p, maxPolyA=maxPolyA, topn=topn, sampleId=tid, sampleName=sampleName, sampleType=sampleType, sampleTime=sampleTime, printOnly=printOnly, verbose=verbose)
  File "/home/ubuntu/miniconda3/envs/myslamdunk/lib/python3.9/site-packages/slamdunk/dunks/mapper.py", line 104, in Map
    run("ngm -r " + inputReference + " -q " + inputBAM + " -t " + str(threads) + " " + parameter + " -o " + outputSAM, log, verbose=verbose, dry=printOnly)
  File "/home/ubuntu/miniconda3/envs/myslamdunk/lib/python3.9/site-packages/slamdunk/utils/misc.py", line 196, in run
    raise RuntimeError("Error while executing command: \"" + cmd + "\"")
RuntimeError: Error while executing command: "ngm -r /media/volume/GRCh38.primary_assembly.genome.fa -q 22RV1_SLAM_0h_2_trimmed.fastq.gz -t 4 --no-progress --slam-seq 2 -5 12 --max-polya 4 -l --rg-id 0 --rg-sm 22RV1_SLAM_0h_2_trimmed.fastq:pulse:0 -o 22RV1_SLAM_0h_2/map/22RV1_SLAM_0h_2_trimmed.fastq_slamdunk_mapped.sam"

Available disk space on the machine: 122 GB

t-neumann commented 11 months ago

Can you let me know whether there is any content in the log files in the map subfolder?

aishwarya-gondane commented 11 months ago


Hi Tobias,

Thank you so much for such a prompt reply! I have copied the error from the log file.

[MAIN] NextGenMap 0.5.5
[MAIN] Startup : x64 (build Jul 3 2020 02:47:43)
[MAIN] Starting time: 2023-10-02.08:29:50
[CONFIG] Parameter: --affine 0 --argos_min_score 0 --bin_size 2 --block_multiplier 2 --broken_pairs 0 --bs_cutoff 6 --bs_mapping 0 --cpu_threads 1 --dualstrand 1 --fast 0 --fast_pairing 0 --force_rlength_check 0 --format 1 --gap_extend_penalty 5 --gap_read_penalty 20 --gap_ref_penalty 20 --hard_clip 0 --keep_tags 0 --kmer 13 --kmer_min 0 --kmer_skip 2 --local 1 --match_bonus 10 --match_bonus_tc 2 --match_bonus_tt 10 --max_cmrs 2147483647 --max_equal 1 --max_insert_size 1000 --max_polya 4 --max_read_length 0 --min_identity 0.650000 --min_insert_size 0 --min_mq 0 --min_residues 0.500000 --min_score 0.000000 --mismatch_penalty 15 --mode 0 --no_progress 1 --no_unal 0 --ocl_threads 1 --output 22RV1_SLAM_0h_2/map/22RV1_SLAM_0h_2_trimmed.fastq_slamdunk_mapped.sam --overwrite 1 --pair_score_cutoff 0.900000 --paired 0 --parse_all 1 --pe_delimiter / --qry 22RV1_SLAM_0h_2_trimmed.fastq.gz --qry_count -1 --qry_start 0 --ref /media/volume/GRCh38.primary_assembly.genome.fa --ref_mode -1 --rg_id 0 --rg_sm 22RV1_SLAM_0h_2_trimmed.fastq:pulse:0 --sensitive 0 --silent_clip 0 --skip_mate_check 0 --skip_save 0 --slam_seq 2 --step_count 4 --strata 0 --topn 1 --trim5 12 --update_check 0 --very_fast 0 --very_sensitive 0
[NGM] Opening for output (SAM): 22RV1_SLAM_0h_2/map/22RV1_SLAM_0h_2_trimmed.fastq_slamdunk_mapped.sam
[SEQPROV] Reading encoded reference from /media/volume/GRCh38.primary_assembly.genome.fa-enc.2.ngm
[SEQPROV] Reading 3099 Mbp from disk took 1.27s
[PREPROCESS] Reading RefTable from /media/volume/GRCh38.primary_assembly.genome.fa-ht-13-2.3.ngm
[PREPROCESS] Reading from disk took 1.34s
[PREPROCESS] Max. k-mer frequency set so 895!
[INPUT] Input is single end data.
[INPUT] Opening file 22RV1_SLAM_0h_2_trimmed.fastq.gz for reading
[INPUT] Input is Fastq
[INPUT] Estimating parameter from data
[INPUT] Average read length: 63 (min: 63, max: 64)
[INPUT] Corridor width: 14
[INPUT] Average kmer hits pro read: 2.798611
[INPUT] Max possible kmer hit: 17
[INPUT] Estimated sensitivity: 0.300000
[INPUT] Estimating parameter took 22.540s
[INPUT] Input is Fastq
[OPENCL] Available platforms: 1
[OPENCL] AMD Accelerated Parallel Processing
[OPENCL] Selecting OpenCl platform: AMD Accelerated Parallel Processing
[OPENCL] Platform: OpenCL 1.2 AMD-APP (1214.3)
[OPENCL] 1 CPU device found.
[OPENCL] Device 0: Intel Xeon Processor (Skylake, IBRS) (Driver: 1214.3 (sse2,avx))
[OPENCL] 8 CPU cores available.
[FILTER] Could not decode reference for alignment (read: SRR12516336.2.1)
[FILTER] Read sequence: TGGGGCTGGGGTCCTCCTGNGCTNNNTGNACAAANAAACNTGNGGNNGGAAANAANAAANAAA
[FILTER] Could not decode reference for alignment (read: SRR12516336.3.1)
[FILTER] Read sequence: ATGCACATTGAAATAAAATNTTTTNNGTNAGAGANCCAANACNTTNNAAAAANCCNAATNAAA
[FILTER] Could not decode reference for alignment (read: SRR12516336.3.1)
[FILTER] Read sequence: ATGCACATTGAAATAAAATNTTTTNNGTNAGAGANCCAANACNTTNNAAAAANCCNAATNAAA
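
For longer logs, the read IDs named in the [FILTER] lines can be collected with a few lines of Python instead of by eye. This is only a sketch: the function name failing_read_ids is made up for illustration, and the pattern assumes the message wording is exactly as in the log above.

```python
import re

# Matches the NextGenMap [FILTER] message as it appears in the log above.
# The wording is taken from this one log; other NGM versions may differ.
FILTER_RE = re.compile(
    r"\[FILTER\] Could not decode reference for alignment \(read: ([^)]+)\)"
)


def failing_read_ids(log_text):
    """Return the unique read IDs from [FILTER] decode errors, in first-seen order."""
    seen = []
    for match in FILTER_RE.finditer(log_text):
        read_id = match.group(1)
        if read_id not in seen:
            seen.append(read_id)
    return seen


# Three lines copied from the log above (the b'...' wrappers do not matter).
sample = (
    "b'[FILTER] Could not decode reference for alignment (read: SRR12516336.2.1)\\n'"
    "b'[FILTER] Could not decode reference for alignment (read: SRR12516336.3.1)\\n'"
    "b'[FILTER] Could not decode reference for alignment (read: SRR12516336.3.1)\\n'"
)
print(failing_read_ids(sample))
```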

t-neumann commented 11 months ago

Hm, this sounds like a NextGenMap problem. In parallel, can you grep for the read named in the log (SRR12516336.3.1) in your fastq file and check what the corresponding record looks like?
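
One thing to watch with a plain grep is that the dots in a read ID are regex wildcards, so a pattern such as SRR12516336.3.1 can also hit longer IDs. A minimal Python sketch of an exact-ID lookup is below. It assumes a standard four-line FASTQ layout; find_read and find_read_in_file are names made up for this example, and the demo records are invented, not taken from the real file.

```python
import gzip
import io
import re


def find_read(handle, read_id):
    """Return (header, sequence, plus, quality) for the record whose ID equals
    read_id exactly, or None. Assumes four lines per FASTQ record."""
    while True:
        header = handle.readline()
        if not header:
            return None  # end of file, no record with that ID
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        # The read ID is the first whitespace-delimited token after '@'.
        fields = header[1:].split()
        if fields and fields[0] == read_id:
            return header.rstrip("\n"), seq, plus, qual


def find_read_in_file(path, read_id):
    """Open a plain or gzipped FASTQ file and look up one read by exact ID."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return find_read(handle, read_id)


# Invented demo records, only to show the matching behaviour.
demo = io.StringIO(
    "@SRR12516336.391999.1 391999 length=4\nACGT\n+\n????\n"
    "@SRR12516336.3.1 3 length=4\nTTTT\n+\n????\n"
)
print(find_read(demo, "SRR12516336.3.1"))

# Unescaped dots match any character, so the short ID also hits the long one.
print(bool(re.search("SRR12516336.3.1", "@SRR12516336.391999.1")))  # True
print(bool(re.search(re.escape("SRR12516336.3.1"), "@SRR12516336.391999.1")))  # False
```

For the real data the call would be find_read_in_file("22RV1_SLAM_0h_2_trimmed.fastq.gz", "SRR12516336.3.1"). On the command line, grep -F (fixed strings) together with -w (whole words) gives a similarly strict match.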

aishwarya-gondane commented 11 months ago

Hi Tobias,

I have tried to pull out the sequence from the fastq file:

@SRR12516336.391999.1 391999 length=75
GCCGATAGTGACTACAAAAAGGATTAGACTGAACCGAATAAAAAAAAAAAAAAAAAAAAAAAAAAAGGAAAAAAA
+SRR12516336.391999.1 391999 length=75
???????????????????????????????????????????????????????????????????????????

Do you think the "?" is causing the problem in the alignment step?

t-neumann commented 11 months ago

Hm, are you sure it's the same read? It says SRR12516336.391999.1 and not SRR12516336.3.1. The ? itself should not be the problem, since it is a valid qscore.
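
To illustrate why ? is a valid quality character: under the common Phred+33 encoding, each character's ASCII code minus 33 is the quality score. The sketch below assumes the file uses Phred+33 (the thread does not state this), and phred33_scores is a name made up for this example.

```python
def phred33_scores(quality_string):
    """Convert a FASTQ quality string to Phred scores, assuming Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality_string]


def error_probability(score):
    """Phred definition: Q = -10 * log10(p), so p = 10 ** (-Q / 10)."""
    return 10 ** (-score / 10)


scores = phred33_scores("?")
print(scores)                        # [30]
print(error_probability(scores[0]))  # roughly 0.001, i.e. 1 error in 1000 calls
```

So a quality line made up entirely of ? means a constant score of 30 for every base, which is unusual for raw instrument output but still well-formed.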