uio-bmi / NucMerge

Genome assembly quality improvement assisted by alternative assemblies and paired-end Illumina reads
Mozilla Public License 2.0

Program stopped after Bowtie process #2

Open FelipeMelis opened 4 years ago

FelipeMelis commented 4 years ago

Hello, I have already cloned the repo and run NucMerge on my data with the following command:

    nucmerge.py --proc 50 Contigs1.fasta Contigs2.fasta fw.fastq rv.fastq TEST

But the program stopped after a few minutes, and the prompt has not changed for hours. I'm on Ubuntu 16.04 (64-bit OS, 125 GB RAM), and all the required software is in my $PATH.

Re-opening _in1 and _in2 as input streams
Returning from Ebwt constructor
Headers:
    len: 6286577
    bwtLen: 6286578
    sz: 1571645
    bwtSz: 1571645
    lineRate: 6
    offRate: 4
    offMask: 0xfffffff0
    ftabChars: 10
    eftabLen: 20
    eftabSz: 80
    ftabLen: 1048577
    ftabSz: 4194308
    offsLen: 392912
    offsSz: 1571648
    lineSz: 64
    sideSz: 64
    sideBwtSz: 48
    sideBwtLen: 192
    numSides: 32743
    numLines: 32743
    ebwtTotLen: 2095552
    ebwtTotSz: 2095552
    color: 0
    reverse: 1
Total time for backward call to driver() for mirror index: 00:00:02
908765 reads; of these:
  908765 (100.00%) were unpaired; of these:
    356 (0.04%) aligned 0 times
    881953 (97.05%) aligned exactly 1 time
    26456 (2.91%) aligned >1 times
99.96% overall alignment rate
908765 reads; of these:
  908765 (100.00%) were unpaired; of these:
    9772 (1.08%) aligned 0 times
    867596 (95.47%) aligned exactly 1 time
    31397 (3.45%) aligned >1 times
98.92% overall alignment rate
908765 reads; of these:
  908765 (100.00%) were unpaired; of these:
    12195 (1.34%) aligned 0 times
    871690 (95.92%) aligned exactly 1 time
    24880 (2.74%) aligned >1 times
98.66% overall alignment rate
908765 reads; of these:
  908765 (100.00%) were unpaired; of these:
    21603 (2.38%) aligned 0 times
    857190 (94.32%) aligned exactly 1 time
    29972 (3.30%) aligned >1 times
97.62% overall alignment rate
min_frag_size 36
max_frag_size 1244
read_length 251
min_frag_size 36
max_frag_size 1217
read_length 251

Thanks in advance

kseniakh commented 4 years ago

Hello,

I assume that there are no problems with NucMerge; it shouldn't output many messages. But to be sure, can you check with the "top" command that it is still running, and which tool exactly is running?

Ksenia
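
As an alternative to watching "top" by hand, the check can be scripted. The following is a minimal sketch, assuming a Linux system where `ps -eo pid,etime,args` is available. The tool list is an assumption collected from the names mentioned in this thread, and `filter_tool_processes` and `running_tools` are hypothetical helper names, not part of NucMerge.

```python
import subprocess

# Tool names mentioned in this thread (an assumption; adjust as needed).
TOOLS = ("bowtie2", "bwa", "samtools", "pilon", "nucbreak", "nucdiff", "nucmerge")


def filter_tool_processes(ps_output, tools=TOOLS):
    """Return (pid, elapsed, command) for every process line that mentions a tool."""
    matches = []
    for line in ps_output.splitlines()[1:]:  # skip the header row printed by ps
        parts = line.split(None, 2)  # pid, elapsed time, full command line
        if len(parts) < 3:
            continue
        pid, elapsed, command = parts
        if any(tool in command.lower() for tool in tools):
            matches.append((int(pid), elapsed, command))
    return matches


def running_tools():
    """Ask ps for all processes and keep only the pipeline-related ones."""
    result = subprocess.run(
        ["ps", "-eo", "pid,etime,args"],
        capture_output=True, text=True, check=True,
    )
    return filter_tool_processes(result.stdout)
```

Calling `running_tools()` a few times, some minutes apart, shows whether any step is still alive and how long it has been running. An empty list while the prompt is still blocked suggests that a step exited without the pipeline noticing.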

FelipeMelis commented 4 years ago

the processes are:

bowtie2-build
bowtie2-align-s 
samtools
python Nucbreak

After NucBreak there are no more processes related to NucMerge.

kseniakh commented 4 years ago

Can you also check which directories were created and which of them are not empty?

FelipeMelis commented 4 years ago

The folders inside the NucMerge output:

NucBreak_1/:
Results

NucBreak_2/:
Results

NucDiff/:

Pilon_1/:
bwa

Pilon_2/:
bwa

Do you need more info about the folders inside?

kseniakh commented 4 years ago

Are they all empty or not?

FelipeMelis commented 4 years ago

The NucBreak and Pilon folders are not empty; only the NucDiff one is.

kseniakh commented 4 years ago

Do Pilon_1 and Pilon_2 folders have .out and .changes files? Do NucBreak_1 and NucBreak_2 folders have _breakpoints.bedgraph files?
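
The file checks asked for here can be automated. This is a minimal sketch, assuming the folder names and file suffixes quoted in this thread. The glob patterns and the helper name `check_outputs` are assumptions for illustration, not NucMerge's own layout specification.

```python
from pathlib import Path

# Expected result files per step, as named in this thread (patterns are assumptions).
EXPECTED = {
    "Pilon_1": ["*.out", "*.changes"],
    "Pilon_2": ["*.out", "*.changes"],
    "NucBreak_1": ["*_breakpoints.bedgraph"],
    "NucBreak_2": ["*_breakpoints.bedgraph"],
}


def check_outputs(root, expected=EXPECTED):
    """Map each step folder to the expected patterns that match no file under it."""
    missing = {}
    for folder, patterns in expected.items():
        base = Path(root) / folder
        # Search recursively, since results may sit in subfolders such as Results/.
        absent = [p for p in patterns if not (base.is_dir() and any(base.rglob(p)))]
        if absent:
            missing[folder] = absent
    return missing
```

Running `check_outputs("TEST")` on the output directory returns an empty dict when every step left its result files, and otherwise names the steps whose outputs are missing, which narrows down where the pipeline stopped.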

FelipeMelis commented 4 years ago

Pilon_1 and Pilon_2 don't have those files (only a bwa folder). NucBreak_1 and NucBreak_2 do have the _breakpoints.bedgraph files.

kseniakh commented 4 years ago

Are the bwa folders empty?

FelipeMelis commented 4 years ago

No, both Pilon folders have TEST_2.amb, TEST_2.ann, TEST_2.bwt, TEST_2.pac and TEST_2.sa.

The Bowtie folders from NucBreak also contain the bowtie files.

kseniakh commented 4 years ago

Then it means that bwa has failed and Pilon was not able to do anything. Are the contig files large?

FelipeMelis commented 4 years ago

They are contigs from bacterial genomes with a genome length of ~6,000,000 bp (the files are about 6 MB each).

kseniakh commented 4 years ago

From this point it is quite difficult for me to understand what is going on without the data at hand and without any error messages. The only thing I can propose is to run bwa yourself to see if it produces any error messages:

    bwa index -p <work_dir+prefix>
    bwa mem -o <work_dir+prefix>_all.sam <work_dir+prefix> PE_reads_1 PE_reads_2

Prakroothi commented 4 years ago

Hi, I have the same issue. I tried to run the commands above:

    bwa mem -o <work_dir+prefix>_all.sam <work_dir+prefix> PE_reads_1 PE_reads_2

bwa mem throws an error saying that the -o option is illegal. I am using bwa version 0.7.5a.

Do you have any idea why?

Thank you for your time!!

Kroo

FelipeMelis commented 4 years ago

@Prakroothi You have the same problem? Try not using the --proc option.

mshrngci118 commented 2 years ago

@Prakroothi It seems that your bwa mem does not have the -o option (it is missing from the usage below). I think you just need to redirect the output to a file.

  mem: invalid option -- 'h'

   Usage: bwa mem [options] <idxbase> <in1.fq> [in2.fq]

   Algorithm options:

   -t INT     number of threads [1]
   -k INT     minimum seed length [19]
   -w INT     band width for banded alignment [100]
   -d INT     off-diagonal X-dropoff [100]
   -r FLOAT   look for internal seeds inside a seed longer than {-k} * FLOAT [1.5]
   -c INT     skip seeds with more than INT occurrences [10000]
   -S         skip mate rescue
   -P         skip pairing; mate rescue performed unless -S also in use
   -A INT     score for a sequence match [1]
   -B INT     penalty for a mismatch [4]
   -O INT     gap open penalty [6]
   -E INT     gap extension penalty; a gap of size k cost {-O} + {-E}*k [1]
   -L INT     penalty for clipping [5]
   -U INT     penalty for an unpaired read pair [17]

   Input/output options:

   -p         first query file consists of interleaved paired-end sequences
   -R STR     read group header line such as '@RG\tID:foo\tSM:bar' [null]

   -v INT     verbose level: 1=error, 2=warning, 3=message, 4+=debugging [3]
   -T INT     minimum score to output [30]
   -a         output all alignments for SE or unpaired PE
   -C         append FASTA/FASTQ comment to SAM output
   -H         hard clipping
   -M         mark shorter split hits as secondary (for Picard/GATK compatibility)

   Note: Please read the man page for detailed description of the command line and options.
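
Since this bwa mem version lists no -o option, the SAM output has to be taken from stdout. Here is a minimal sketch of that redirection, which also keeps stderr in a log file so any error messages are not lost. The helper names `bwa_mem_command` and `run_to_file` are hypothetical, and the file paths are placeholders.

```python
import subprocess


def bwa_mem_command(index_prefix, reads_1, reads_2, threads=1):
    """Build a bwa mem call that writes SAM to stdout, so no -o flag is needed."""
    return ["bwa", "mem", "-t", str(threads), index_prefix, reads_1, reads_2]


def run_to_file(command, out_path, log_path):
    """Run a command with stdout redirected to out_path and stderr to log_path."""
    with open(out_path, "w") as out, open(log_path, "w") as log:
        completed = subprocess.run(command, stdout=out, stderr=log)
    return completed.returncode
```

For example, `run_to_file(bwa_mem_command("TEST_1", "fw.fastq", "rv.fastq"), "TEST_1_all.sam", "bwa.log")` is equivalent to the shell form `bwa mem TEST_1 fw.fastq rv.fastq > TEST_1_all.sam 2> bwa.log`. A non-zero return code together with the contents of the log file shows why bwa failed.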

@FelipeMelis The '--proc' option is only meant for multiprocessing within this tool. It does not affect threading in the external tools. The multiprocessing part runs only 5 processes, so giving more than 5 won't speed it up. Also, I noticed that Bowtie2 uses 5 threads in the NucBreak step; I don't know how that affects the other processes. For the Pilon step, the default memory setting may not be enough, so it's better to increase it with an option like '-Xmx10g'. I rewrote part of the code, but the last step still does not seem to run properly.
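
The two points above, the cap of 5 parallel processes and the larger Java heap for Pilon, can be sketched as follows. This is an illustration under assumptions, not NucMerge's actual code: how NucMerge launches Pilon is not shown in this thread, the jar path and file names are placeholders, and `effective_workers` and `pilon_command` are hypothetical helpers.

```python
def effective_workers(proc, max_parallel=5):
    """Number of processes that can actually run at once (5 per the comment above)."""
    return max(1, min(proc, max_parallel))


def pilon_command(pilon_jar, genome, bam, out_prefix, out_dir, heap="10g"):
    """Build a Pilon call with an explicit JVM heap size (-Xmx)."""
    return [
        "java", f"-Xmx{heap}", "-jar", pilon_jar,
        "--genome", genome,    # assembly to polish
        "--frags", bam,        # sorted, indexed BAM of paired-end reads
        "--output", out_prefix,
        "--outdir", out_dir,
        "--changes",           # also write the .changes file
    ]
```

With these helpers, `effective_workers(50)` gives 5, which is why `--proc 50` brings no extra speed. If Pilon is installed through a wrapper script rather than called via `java -jar`, the heap option may have to be passed in a different way.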