Nextomics / nextsv

MIT License
33 stars 8 forks source link

NextSV No Output #17

Open 543090lee opened 1 year ago

543090lee commented 1 year ago

After successfully running the python command to get the output directory and work.sh, I ran the work.sh file. However, I got this output. running script 1 input file(s): output prefix: /u/home/m/mdistler/project-zarlab/consensus/nextsv_output/clean_reads/nextsv_output processing input file: /u/home/m/mdistler/project-zarlab/mouseBAM/A_J.chr19.fastq number of clean reads: 29888081 number of filtered reads: 0 1 input file(s): /u/home/m/mdistler/project-zarlab/consensus/nextsv_output/clean_reads/nextsv_output.clean.fastq output directory: /u/home/m/mdistler/project-zarlab/consensus/nextsv_output/longread_qc /u/home/m/mdistler/project-zarlab/consensus/nextsv_output/clean_reads/nextsv_output.clean.fastq.gz exists -- overwrite (y/n)? y

real 3m11.933s user 9m48.323s sys 0m18.873s first [M::mm_idx_gen::3.4670.88] collected minimizers [M::mm_idx_gen::4.0681.14] sorted minimizers [M::main::4.0681.14] loaded/built the index for 1 target sequence(s) [M::mm_mapopt_update::4.2941.13] mid_occ = 133 [M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 1 [M::mm_idx_stat::4.444*1.13] distinct minimizers: 8335668 (88.18% are singletons); average occurrences: 1.322; average spacing: 5.573

real 0m35.122s user 1m39.821s sys 0m13.164s

real 0m0.011s user 0m0.004s sys 0m0.005s

real 0m0.006s user 0m0.001s sys 0m0.002s Traceback (most recent call last): File "/u/scratch/m/mdistler/nextsv/bin/check_bam_and_remove_file.py", line 32, in main() File "/u/scratch/m/mdistler/nextsv/bin/check_bam_and_remove_file.py", line 23, in main ret = subprocess.check_output([samtools, "quickcheck", "-vvv", out_bam_file], stderr=subprocess.STDOUT).decode() File "/u/home/m/mdistler/project-jflint/anaconda3/lib/python3.8/subprocess.py", line 411, in check_output return run(*popenargs, stdout=PIPE, timeout=timeout, check=True, File "/u/home/m/mdistler/project-jflint/anaconda3/lib/python3.8/subprocess.py", line 512, in run raise CalledProcessError(retcode, process.args, subprocess.CalledProcessError: Command '['/u/home/m/mdistler/hap.py-install/bin/samtools', 'quickcheck', '-vvv', '/u/home/m/mdistler/project-zarlab/consensus/nextsv_output/minimap2_bam/nextsv_output.minimap2.sorted.bam']' returned non-zero exit status 8.

real 0m35.400s user 1m39.859s sys 0m13.215s second /u/home/m/mdistler/project-zarlab/consensus/nextsv_output/sniffles_calls/sniffles.minimap2.nextsv_output.sh: line 3: /u/project/jflint/mdistler/anaconda3/lib/python3.8/site-packages/sniffles: Is a directory

real 0m0.003s user 0m0.000s sys 0m0.001s

real 0m0.041s user 0m0.001s sys 0m0.005s third

real 0m0.074s user 0m0.001s sys 0m0.020s ngmlr 0.2.7 (build: Jun 25 2018 10:17:59, start: 2022-10-11.16:21:35) Contact: philipp.rescheneder@univie.ac.at Writing output (SAM) to stdout Reading encoded reference from /u/home/m/mdistler/project-zarlab/mouseBAM/chr19_new.fa-enc.2.ngm Reading 61 Mbp from disk took 0.39s Reading reference index from /u/home/m/mdistler/project-zarlab/mouseBAM/chr19_new.fa-ht-13-2.2.ngm Reading from disk took 4.20s Opening query file /u/home/m/mdistler/project-zarlab/consensus/nextsv_output/clean_reads/nextsv_output.clean.fastq.gz Mapping reads... Exception bad_alloc occured in thread 5. This usually means you ran out of physical or virtual memory (try ulimit -v) Terminating

real 0m4.695s user 0m0.119s sys 0m0.594s

real 0m0.019s user 0m0.001s sys 0m0.008s

real 0m0.010s user 0m0.000s sys 0m0.004s

real 0m4.992s user 0m0.163s sys 0m0.636s /u/home/m/mdistler/project-zarlab/consensus/nextsv_output/sniffles_calls/sniffles.ngmlr.nextsv_output.sh: line 3: /u/project/jflint/mdistler/anaconda3/lib/python3.8/site-packages/sniffles: Is a directory

real 0m0.004s user 0m0.001s sys 0m0.000s

real 0m0.047s user 0m0.002s sys 0m0.003s

real 0m0.956s user 0m0.793s sys 0m0.084s end

I don't see any VCF files in sniffles_call, neither any output in nextsv_results folder.

Thank you for your help

543090lee commented 1 year ago

I have looked through each line of work.sh and came to a conclusion that /u/home/m/mdistler/project-zarlab/consensus/nextsv_output2/minimap2_bam/minimap2_bam2cram.sh is causing a problem, and this error just descends down. I have converted our BAM file to CRAM format. I used this command: /u/home/m/mdistler/scratch/samtools-1.3/samtools view -T /u/home/m/mdistler/project-zarlab/mouseBAM/chr19_new.fa -C -o test.cram nextsv_output2.minimap2.bam. So this means that there are no errors in bam file

fangli80 commented 1 year ago

Hello @543090lee, Sorry for the inconvenience. Could you please share your command for running NextSV?

Best, Li

kaichop commented 1 year ago

As I mentioned in email, samtools 1.3 does not support -T argument

On Tuesday, October 18, 2022, Li Fang @.***> wrote:

Hello @543090lee https://github.com/543090lee, Sorry for the inconvenience. Could you please share your command for running NextSV?

Best, Li

— Reply to this email directly, view it on GitHub https://github.com/Nextomics/nextsv/issues/17#issuecomment-1281875404, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABNG3OBP4GFUVISDEG36FK3WDY7XLANCNFSM6AAAAAARDP6AMA . You are receiving this because you are subscribed to this thread.Message ID: @.***>

543090lee commented 1 year ago

I used the command: python nextsv2.py nextsv.cfg nextsv_output /u/home/m/mdistler/project-zarlab/mouseBAM

543090lee commented 1 year ago

Upgrading to samtools1.16.1 creates a new error message

Here are the commands I used:

python nextsv2.py nextsv.cfg nextsv_output /u/home/m/mdistler/project-zarlab/mouseBAM ./work.sh

Here is the error message. There seems to be an issue with the header of the bam, so samtools is unable to index it. (in bold below)

(base) bash-4.2$ ./work.sh 1 input file(s): output prefix: /u/home/m/mdistler/scratch/output1/clean_reads/output1 processing input file: /u/home/m/mdistler/project-zarlab/mouseBAM/A_J.chr19.fastq number of clean reads: 29888081 number of filtered reads: 0 1 input file(s): /u/home/m/mdistler/scratch/output1/clean_reads/output1.clean.fastq output directory: /u/home/m/mdistler/scratch/output1/longread_qc

real 3m33.274s user 9m27.823s sys 0m31.683s [M::mm_idx_gen::3.5940.76] collected minimizers [M::mm_idx_gen::4.1511.06] sorted minimizers [M::main::4.1511.06] loaded/built the index for 1 target sequence(s) [M::mm_mapopt_update::4.3611.06] mid_occ = 133 [M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 1 [M::mm_idx_stat::4.494*1.06] distinct minimizers: 8335668 (88.18% are singletons); average occurrences: 1.322; average spacing: 5.573 [main_samview] fail to read the header from "-".

real 0m24.696s user 1m16.162s sys 0m7.773s samtools sort: failed to read header from "/u/home/m/mdistler/scratch/output1/minimap2_bam/output1.minimap2.bam"

real 0m0.091s user 0m0.005s sys 0m0.011s [E::hts_open_format] Failed to open file "/u/home/m/mdistler/scratch/output1/minimap2_bam/output1.minimap2.sorted.bam" : No such file or directory samtools index: failed to open "/u/home/m/mdistler/scratch/output1/minimap2_bam/output1.minimap2.sorted.bam": No such file or directory

real 0m0.009s user 0m0.003s sys 0m0.005s Traceback (most recent call last): File "/u/scratch/m/mdistler/nextsv/bin/check_bam_and_remove_file.py", line 32, in main() File "/u/scratch/m/mdistler/nextsv/bin/check_bam_and_remove_file.py", line 23, in main ret = subprocess.check_output([samtools, "quickcheck", "-vvv", out_bam_file], stderr=subprocess.STDOUT).decode() File "/u/home/m/mdistler/project-jflint/anaconda3/lib/python3.8/subprocess.py", line 411, in check_output return run(*popenargs, stdout=PIPE, timeout=timeout, check=True, File "/u/home/m/mdistler/project-jflint/anaconda3/lib/python3.8/subprocess.py", line 512, in run raise CalledProcessError(retcode, process.args, subprocess.CalledProcessError: Command '['/u/home/m/mdistler/scratch/samtools-1.16.1/samtools', 'quickcheck', '-vvv', '/u/home/m/mdistler/scratch/output1/minimap2_bam/output1.minimap2.sorted.bam']' returned non-zero exit status 2.

real 0m25.027s user 1m16.211s sys 0m7.822s /u/home/m/mdistler/scratch/output1/sniffles_calls/sniffles.minimap2.output1.sh: line 3: /u/project/jflint/mdistler/anaconda3/lib/python3.8/site-packages/sniffles: Is a directory

real 0m0.002s user 0m0.001s sys 0m0.000s

real 0m0.009s user 0m0.001s sys 0m0.004s [E::hts_open_format] Failed to open file "/u/home/m/mdistler/scratch/output1/minimap2_bam/output1.minimap2.sorted.bam" : No such file or directory samtools view: failed to open "/u/home/m/mdistler/scratch/output1/minimap2_bam/output1.minimap2.sorted.bam" for reading: No such file or directory samtools index: "/u/home/m/mdistler/scratch/output1/minimap2_bam/output1.minimap2.sorted.cram" is in a format that cannot be usefully indexed rm: cannot remove '/u/home/m/mdistler/scratch/output1/minimap2_bam/output1.minimap2.sorted.bam': No such file or directory rm: cannot remove '/u/home/m/mdistler/scratch/output1/minimap2_bam/output1.minimap2.sorted.bam.bai': No such file or directory

real 0m0.040s user 0m0.009s sys 0m0.018s ngmlr 0.2.7 (build: Jun 25 2018 10:17:59, start: 2022-10-18.16:53:36) Contact: philipp.rescheneder@univie.ac.at Writing output (SAM) to stdout Reading encoded reference from /u/home/m/mdistler/project-zarlab/mouseBAM/chr19_new.fa-enc.2.ngm Reading 61 Mbp from disk took 0.35s Reading reference index from /u/home/m/mdistler/project-zarlab/mouseBAM/chr19_new.fa-ht-13-2.2.ngm Reading from disk took 3.73s Opening query file /u/home/m/mdistler/scratch/output1/clean_reads/output1.clean.fastq.gz Mapping reads... Exception bad_alloc occured in thread 5. This usually means you ran out of physical or virtual memory (try ulimit -v) Terminating

real 0m4.173s user 0m0.112s sys 0m0.527s samtools sort: couldn't allocate memory for bam_mem

real 0m0.013s user 0m0.003s sys 0m0.008s [E::hts_open_format] Failed to open file "/u/home/m/mdistler/scratch/output1/ngmlr_bam/output1.ngmlr.sorted.bam" : No such file or directory samtools index: failed to open "/u/home/m/mdistler/scratch/output1/ngmlr_bam/output1.ngmlr.sorted.bam": No such file or directory

real 0m0.009s user 0m0.002s sys 0m0.006s Traceback (most recent call last): File "/u/scratch/m/mdistler/nextsv/bin/check_bam_and_remove_file.py", line 32, in main() File "/u/scratch/m/mdistler/nextsv/bin/check_bam_and_remove_file.py", line 23, in main ret = subprocess.check_output([samtools, "quickcheck", "-vvv", out_bam_file], stderr=subprocess.STDOUT).decode() File "/u/home/m/mdistler/project-jflint/anaconda3/lib/python3.8/subprocess.py", line 411, in check_output return run(*popenargs, stdout=PIPE, timeout=timeout, check=True, File "/u/home/m/mdistler/project-jflint/anaconda3/lib/python3.8/subprocess.py", line 512, in run raise CalledProcessError(retcode, process.args, subprocess.CalledProcessError: Command '['/u/home/m/mdistler/scratch/samtools-1.16.1/samtools', 'quickcheck', '-vvv', '/u/home/m/mdistler/scratch/output1/ngmlr_bam/output1.ngmlr.sorted.bam']' returned non-zero exit status 2.

real 0m4.326s user 0m0.157s sys 0m0.570s /u/home/m/mdistler/scratch/output1/sniffles_calls/sniffles.ngmlr.output1.sh: line 3: /u/project/jflint/mdistler/anaconda3/lib/python3.8/site-packages/sniffles: Is a directory

real 0m0.001s user 0m0.000s sys 0m0.001s

real 0m0.007s user 0m0.001s sys 0m0.004s [E::hts_open_format] Failed to open file "/u/home/m/mdistler/scratch/output1/ngmlr_bam/output1.ngmlr.sorted.bam" : No such file or directory samtools view: failed to open "/u/home/m/mdistler/scratch/output1/ngmlr_bam/output1.ngmlr.sorted.bam" for reading: No such file or directory samtools index: "/u/home/m/mdistler/scratch/output1/ngmlr_bam/output1.ngmlr.sorted.cram" is in a format that cannot be usefully indexed rm: cannot remove '/u/home/m/mdistler/scratch/output1/ngmlr_bam/output1.ngmlr.sorted.bam': No such file or directory rm: cannot remove '/u/home/m/mdistler/scratch/output1/ngmlr_bam/output1.ngmlr.sorted.bam.bai': No such file or directory

real 0m0.030s user 0m0.005s sys 0m0.022s

Thanks!

fangli80 commented 1 year ago

The reason might be that some depending tools updated their commands.
I am wring a new version of NextSV and will update soon.

fangli80 commented 1 year ago

Hello @543090lee

I updated NextSV to be compatible with the latest version of Sniffles v2 and also added a new SV caller (cuteSV) to the pipeline. Please follow the instructions in the README.md file to install NextSV3.

If you still have issues, please feel free to let me know.

Thanks ! Li

fangli80 commented 1 year ago

@543090lee By the way, I'm assuming your data is ONT. Is that correct?

543090lee commented 1 year ago

Hello, Thank you for updating it to the new version nextsv3. I downloaded and ran the work.sh and got following error.

image image
fangli80 commented 1 year ago

It seems that there are issues with your reference genome, because minimap2 reported the following:

loaded/built the index for 0 target sequence(s)
distinct minimizers: 0 (-nan% are singletons); average occurrences: -nan; average spacing: -nan; total length: 0

Therefore, the alignment process failed.
Could you please check if the reference file (-r) is correct? It should be a FASTA file (not a fasta.fai file for example).

543090lee commented 1 year ago

Hello, I used a .fa file for the reference file, which uses a fasta formatting

Sincerely, Seungmo Lee

fangli80 commented 1 year ago

OK. There will be a run_minimap2.sample_name.sh script in the 2_aligned_bam folder. Could you please run it line-by-line and see what's happening.

543090lee commented 1 year ago

This is what I got.

image
fangli80 commented 1 year ago

minimap2 is not found. Please run it in the conda environment of nextsv.

543090lee commented 1 year ago
image

Here is with error running it on environment

fangli80 commented 1 year ago

There are two issues: 1) the memory is not enough as samtools reported couldn't allocate memory for bam_mem. 2) there is no valid sequence in your reference fasta file as minimap2 still reported loaded/built the index for 0 target sequence(s). Could you please use head -20 path/to/reference.fasta to show the first 20 lines of your file and let's see if we can figure it out.

fangli80 commented 1 year ago

You can reduce the --threads to 2 (the current value is 8) which might save some memory. It's best you can run it in a machine with at least 16G memory. You can use free -g to check the memory of your machine.

543090lee commented 1 year ago

When print first 20 lines, I get this (base) [seichang@login3 nextsv_input]$ head -20 chr19_new.fa

19 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN

fangli80 commented 1 year ago

The first 20 lines are correct. I know this might be a modified reference genome. How about using the original reference genome to test the pipeline? If possible, you may also gzip the chr19_new.fa file and email it to me (fangli2718@gmail.com). Thanks. Li

543090lee commented 1 year ago

Hello, I successfully installed and ran nextsv, and got the vcf files. However, sniffles and cuteSV vcf files look empty.

source=Sniffles2_2.0.7

command="/u/home/s/seichang/project-zarlab/anaconda3/envs/nextsv3/bin/sniffles --output-rnames --allow-overwrite --input /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.sorted.bam --vcf /u/project/zarlab/seichang/nextsv/nextsv_output/3_SV_calls/sample_name.minimap2.sniffles.vcf --reference /u/home/s/seichang/nextsv_input/chr19_new.fa --threads 2"

fileDate="2022/11/01 13:16:08"

contig=

ALT=

ALT=

ALT=

ALT=

ALT=

FORMAT=

FORMAT=

FORMAT=

FORMAT=

FORMAT=

FILTER=

FILTER=

FILTER=

FILTER=

FILTER=

FILTER=

FILTER=

FILTER=

FILTER=

FILTER=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

INFO=

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE

This is the sample vcf file I got. Thank you

fangli80 commented 1 year ago

Hello, Could you please show me: 1) the size of bam file in the 2_aligned_bam folder 2) the shell scripts in 2_aligned_bam and 3_SV_calls

Thanks! Li

543090lee commented 1 year ago

Hello, 1. total 5.4G -rw-r--r--. 1 seichang eeskin 966 Nov 1 12:42 run_minimap2.sample_name.sh -rw-r--r--. 1 seichang eeskin 2.7G Nov 1 13:12 sample_name.minimap2.bam -rw-r--r--. 1 seichang eeskin 2.7G Nov 1 13:15 sample_name.minimap2.sorted.bam -rw-r--r--. 1 seichang eeskin 172K Nov 1 13:16 sample_name.minimap2.sorted.bam.bai

2. .sh file in 2_aligned_bam:

!/bin/bash

minimap2 --MD -t 2 -ax map-ont -N 10 /u/home/s/seichang/nextsv_input/chr19_new.fa /u/project/zarlab/seichang/nextsv/nextsv_output/1_clean_reads/sample_name.clean.fastq.gz /u/project/zarlab/seichang/nextsv/nextsv_output/1_clean_reads/sample_name.clean.fasta.gz | samtools view -@ 2 -bS - > /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.bam samtools sort -@ 2 -o /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.sorted.bam /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.bam samtools index -@ 2 /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.sorted.bam /u/project/zarlab/seichang/nextsv/bin/check_bam_and_remove_file.py /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.sorted.bam /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.bam samtools

.sh file in 3_SV_calls: cuteSV:

!/bin/bash

mkdir -p /u/project/zarlab/seichang/nextsv/nextsv_output/3_SV_calls/cuteSV_temp cuteSV --max_cluster_bias_INS 100 --diff_ratio_merging_INS 0.3 --max_cluster_bias_DEL 100 --diff_ratio_merging_DEL 0.3 --report_readid --genotype --min_support 5 --threads 2 --sample sample_name /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.sorted.bam /u/home/s/seichang/nextsv_input/chr19_new.fa /u/project/zarlab/seichang/nextsv/nextsv_output/3_SV_calls/sample_name.minimap2.cuteSV.vcf /u/project/zarlab/seichang/nextsv/nextsv_output/3_SV_calls/cuteSV_temp rm -r /u/project/zarlab/seichang/nextsv/nextsv_output/3_SV_calls/cuteSV_temp

sniffles:

!/bin/bash

sniffles --output-rnames --allow-overwrite --input /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.sorted.bam --vcf /u/project/zarlab/seichang/nextsv/nextsv_output/3_SV_calls/sample_name.minimap2.sniffles.vcf --reference /u/home/s/seichang/nextsv_input/chr19_new.fa --threads 2

Thank you for your help

fangli80 commented 1 year ago

The file sizes look correct. Could you please run the two shell scripts manually and see if any errors are reported? Thanks, Li

543090lee commented 1 year ago

For the sh file in 2_aligned_bam:

(nextsv3) [seichang@login1 2_aligned_bam]$ ./run_minimap2.sample_name.sh [M::mm_idx_gen::2.6790.96] collected minimizers [M::mm_idx_gen::3.2421.13] sorted minimizers [M::main::3.2421.13] loaded/built the index for 1 target sequence(s) [M::mm_mapopt_update::3.4321.12] mid_occ = 133 [M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 1 [M::mm_idx_stat::3.5661.12] distinct minimizers: 8335668 (88.18% are singletons); average occurrences: 1.322; average spacing: 5.573; total length: 61431566 [main_samview] fail to read the header from "-". samtools sort: failed to read header from "/u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.bam" Traceback (most recent call last): File "/u/project/zarlab/seichang/nextsv/bin/check_bam_and_remove_file.py", line 32, in main() File "/u/project/zarlab/seichang/nextsv/bin/check_bam_and_remove_file.py", line 23, in main ret = subprocess.check_output([samtools, "quickcheck", "-vvv", out_bam_file], stderr=subprocess.STDOUT).decode() File "/u/home/s/seichang/project-zarlab/anaconda3/envs/nextsv3/lib/python3.8/subprocess.py", line 415, in check_output return run(popenargs, stdout=PIPE, timeout=timeout, check=True, File "/u/home/s/seichang/project-zarlab/anaconda3/envs/nextsv3/lib/python3.8/subprocess.py", line 493, in run with Popen(*popenargs, **kwargs) as process: File "/u/home/s/seichang/project-zarlab/anaconda3/envs/nextsv3/lib/python3.8/subprocess.py", line 858, in init self._execute_child(args, executable, preexec_fn, close_fds, File "/u/home/s/seichang/project-zarlab/anaconda3/envs/nextsv3/lib/python3.8/subprocess.py", line 1704, in _execute_child raise child_exception_type(errno_num, err_msg, err_filename) FileNotFoundError: [Errno 2] No such file or directory: '/u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/samtools'

sh file in the 3_SV_calls folder:

(nextsv3) [seichang@login1 3_SV_calls]$ ./run_cuteSV_forminimap2.sample_name.sh 2022-11-03 21:42:34,379 [INFO] Running /u/home/s/seichang/project-zarlab/anaconda3/envs/nextsv3/bin/cuteSV --max_cluster_bias_INS 100 --diff_ratio_merging_INS 0.3 --max_cluster_bias_DEL 100 --diff_ratio_merging_DEL 0.3 --report_readid --genotype --min_support 5 --threads 2 --sample sample_name /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.sorted.bam /u/home/s/seichang/nextsv_input/chr19_new.fa /u/project/zarlab/seichang/nextsv/nextsv_output/3_SV_calls/sample_name.minimap2.cuteSV.vcf /u/project/zarlab/seichang/nextsv/nextsv_output/3_SV_calls/cuteSV_temp Traceback (most recent call last): File "/u/home/s/seichang/project-zarlab/anaconda3/envs/nextsv3/bin/cuteSV", line 876, in run(sys.argv[1:]) File "/u/home/s/seichang/project-zarlab/anaconda3/envs/nextsv3/bin/cuteSV", line 872, in run main_ctrl(args, argv) File "/u/home/s/seichang/project-zarlab/anaconda3/envs/nextsv3/bin/cuteSV", line 673, in main_ctrl samfile = pysam.AlignmentFile(args.input) File "pysam/libcalignmentfile.pyx", line 751, in pysam.libcalignmentfile.AlignmentFile.cinit File "pysam/libcalignmentfile.pyx", line 1000, in pysam.libcalignmentfile.AlignmentFile._open ValueError: file has no sequences defined (mode='r') - is it SAM/BAM format? Consider opening with check_sq=False

For Sniffles: (nextsv3) [seichang@login1 3_SV_calls]$ ./run_sniffles_forminimap2.sample_name.sh Running Sniffles2, build 2.0.7 Run Mode: call_sample Start on: 2022/11/03 21:43:18 Working dir: /u/project/zarlab/seichang/nextsv/nextsv_output/3_SV_calls Used command: /u/home/s/seichang/project-zarlab/anaconda3/envs/nextsv3/bin/sniffles --output-rnames --allow-overwrite --input /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.sorted.bam --vcf /u/project/zarlab/seichang/nextsv/nextsv_output/3_SV_calls/sample_name.minimap2.sniffles.vcf --reference /u/home/s/seichang/nextsv_input/chr19_new.fa --threads 2

Opening for reading: /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.sorted.bam Traceback (most recent call last): File "/u/home/s/seichang/project-zarlab/anaconda3/envs/nextsv3/bin/sniffles", line 616, in Sniffles2_Main(config.from_cmdline(),processes) File "/u/home/s/seichang/project-zarlab/anaconda3/envs/nextsv3/bin/sniffles", line 118, in Sniffles2_Main bam_in=pysam.AlignmentFile(config.input, config.input_mode) File "pysam/libcalignmentfile.pyx", line 751, in pysam.libcalignmentfile.AlignmentFile.cinit File "pysam/libcalignmentfile.pyx", line 1000, in pysam.libcalignmentfile.AlignmentFile._open ValueError: file has no sequences defined (mode='rb') - is it SAM/BAM format? Consider opening with check_sq=False

fangli80 commented 1 year ago

It looks like minimap2 ran out of memory. Could you please run the following command and see what happened? minimap2 --MD -t 2 -ax map-ont -N 10 /u/home/s/seichang/nextsv_input/chr19_new.fa /u/project/zarlab/seichang/nextsv/nextsv_output/1_clean_reads/sample_name.clean.fastq.gz /u/project/zarlab/seichang/nextsv/nextsv_output/1_clean_reads/sample_name.clean.fasta.gz > /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.sam

543090lee commented 1 year ago

Hello, (base) [seichang@login4 2_aligned_bam]$ minimap2 --MD -t 2 -ax map-ont -N 10 /u/home/s/seichang/nextsv_input/chr19_new.fa /u/project/zarlab/seichang/nextsv/nextsv_output/1_clean_reads/sample_name.clean.fastq.gz /u/project/zarlab/seichang/nextsv/nextsv_output/1_clean_reads/sample_name.clean.fasta.gz > /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.sam -bash: minimap2: command not found

I got this error, and minimap2 wasn't found?

Thank you

fangli80 commented 1 year ago

Please run it in the conda environment of nextsv.

543090lee commented 1 year ago

Hello, I got this result:

(nextsv3) [seichang@login3 2_aligned_bam]$ minimap2 --MD -t 2 -ax map-ont -N 10 /u/home/s/seichang/nextsv_input/chr19_new.fa /u/project/zarlab/seichang/nextsv/nextsv_output/1_clean_reads/sample_name.clean.fastq.gz /u/project/zarlab/seichang/nextsv/nextsv_output/1_clean_reads/sample_name.clean.fasta.gz > /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.sam [M::mm_idx_gen::2.7280.93] collected minimizers [M::mm_idx_gen::3.2791.11] sorted minimizers [M::main::3.2801.11] loaded/built the index for 1 target sequence(s) [M::mm_mapopt_update::3.4641.10] mid_occ = 133 [M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 1 [M::mm_idx_stat::3.592*1.10] distinct minimizers: 8335668 (88.18% are singletons); average occurrences: 1.322; average spacing: 5.573; total length: 61431566 Segmentation fault

Thank you for your help

fangli80 commented 1 year ago

ok. Please try the following command in the conda environment of nextsv 1) minimap2 --MD -t 2 -ax map-ont -N 10 /u/home/s/seichang/nextsv_input/chr19_new.fa /u/project/zarlab/seichang/nextsv/nextsv_output/1_clean_reads/sample_name.clean.fastq.gz > /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.sam

2) minimap2 --MD -t 2 -ax map-ont -N 10 /u/home/s/seichang/nextsv_input/chr19_new.fa /u/project/zarlab/seichang/nextsv/nextsv_output/1_clean_reads/sample_name.clean.fasta.gz > /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.sam

543090lee commented 1 year ago

Hello, I got these results:

1.(nextsv3) [seichang@login1 2_aligned_bam]$ minimap2 --MD -t 2 -ax map-ont -N 10 /u/home/s/seichang/nextsv_input/chr19_new.fa /u/project/zarlab/seichang/nextsv/nextsv_output/1_clean_reads/sample_name.clean.fastq.gz > /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.sam

[M::mm_idx_gen::2.6600.98] collected minimizers [M::mm_idx_gen::3.1941.15] sorted minimizers [M::main::3.1941.15] loaded/built the index for 1 target sequence(s) [M::mm_mapopt_update::3.3771.14] mid_occ = 133 [M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 1 [M::mm_idx_stat::3.504*1.14] distinct minimizers: 8335668 (88.18% are singletons); average occurrences: 1.322; average spacing: 5.573; total length: 61431566 Segmentation fault

2.(nextsv3) [seichang@login1 2_aligned_bam]$ minimap2 --MD -t 2 -ax map-ont -N 10 /u/home/s/seichang/nextsv_input/chr19_new.fa /u/project/zarlab/seichang/nextsv/nextsv_output/1_clean_reads/sample_name.clean.fasta.gz > /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.sam [M::mm_idx_gen::2.4610.99] collected minimizers [M::mm_idx_gen::3.0041.17] sorted minimizers [M::main::3.0051.17] loaded/built the index for 1 target sequence(s) [M::mm_mapopt_update::3.1921.16] mid_occ = 133 [M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 1 [M::mm_idx_stat::3.3211.15] distinct minimizers: 8335668 (88.18% are singletons); average occurrences: 1.322; average spacing: 5.573; total length: 61431566 [M::worker_pipeline::504.1621.00] mapped 1 sequences [M::main] Version: 2.24-r1122 [M::main] CMD: minimap2 --MD -t 2 -ax map-ont -N 10 /u/home/s/seichang/nextsv_input/chr19_new.fa /u/project/zarlab/seichang/nextsv/nextsv_output/1_clean_reads/sample_name.clean.fasta.gz [M::main] Real time: 504.299 sec; CPU: 502.375 sec; Peak RSS: 3.100 GB

fangli80 commented 1 year ago

1) There are both fastq and fasta files in your input folder. Please make sure your reference fasta file is NOT in the input folder. Otherwise it will be regarded as input reads as well. 2) the alignment of the fastq file failed. Please run the following command and see what happened.

$FASTQ=/path/to/your/input/fastq
minimap2 --MD -t 2 -ax map-ont -N 10 /u/home/s/seichang/nextsv_input/chr19_new.fa $FASTQ > /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.sam
543090lee commented 1 year ago

Hello, This command uses a .fa file in the nextsv input folder, so I ran the above command, and removed .fa file. Here is the result: (nextsv3) [seichang@login4 2_aligned_bam]$ minimap2 --MD -t 2 -ax map-ont -N 10 /u/home/s/seichang/nextsv_input/chr19_new.fa $FASTQ > /u/project/zarlab/seichang/nextsv/nextsv_output/2_aligned_bam/sample_name.minimap2.sam [M::mm_idx_gen::2.5570.94] collected minimizers [M::mm_idx_gen::3.1051.13] sorted minimizers [M::main::3.1061.12] loaded/built the index for 1 target sequence(s) [M::mm_mapopt_update::3.3121.12] mid_occ = 133 [M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 1 [M::mm_idx_stat::3.454*1.11] distinct minimizers: 8335668 (88.18% are singletons); average occurrences: 1.322; average spacing: 5.573; total length: 61431566 [M::main] Version: 2.24-r1122 [M::main] CMD: minimap2 --MD -t 2 -ax map-ont -N 10 /u/home/s/seichang/nextsv_input/chr19_new.fa /u/home/s/seichang/nextsv_input [M::main] Real time: 3.484 sec; CPU: 3.866 sec; Peak RSS: 0.519 GB

Thank you

543090lee commented 1 year ago

Also do you want me to remove the reference fasta file in the input folder when running the nextsv3.py or for running individual commands in the sh scripts?

Thank you