AppliedBioinformatics / runBNG

An easy way to run BioNano genomic analysis
MIT License

runBNG denovo: "ERROR: no align files in alignFolder" #14

Closed liu-xingliang closed 5 years ago

liu-xingliang commented 5 years ago

Hi @yyx8671 ,

Thank you very much for your help with my previous questions. With your help I can now run runBNG denovo, but it hits an error that I can't make sense of.

I ran the program with 32 GB of memory and 16 threads:

module load python2/2.7.14
module load R/3.4.3
/tmp/software/runBNG/runBNG/runBNG denovo -s /tmp/software/runBNG/runBNG/scripts -t /tmp/software/runBNG/runBNG/tools -b /tmp/bionano/RawMolecules-I007C-800K.bnx -T 16 -j 1 -z 3000 -L 32 -o test_denovo_out

runBNG denovo complains that there are no align files in alignFolder.

The run log is attached, FYI:

run.runlog.gz

Thank you very much!

bless~ Xingliang

yyx8671 commented 5 years ago

Hi @liuxl18-hku,

From the log file, it seems the default minimum length (150 kbp) is not suitable. You may try 100 kbp instead. By the way, 32 GB of RAM may not be sufficient; request more if your server can provide it.

Cheers, Andy

liu-xingliang commented 5 years ago

Hi Andy @yyx8671 ,

Thank you so much for your prompt reply!

As you suggested, I tried -l 100 and came across the same error. In the align folder there are a bunch of *.stdout files. I picked one of them, and my understanding is that SplitBNX failed, as there is no all_8_of_16.bnx as required (actually, none of those sub-files were generated).

# hostname=test
# $ cd /tmp/bionano/runBNG; /tmp/software/runBNG/runBNG/tools/RefAligner -first -1 -i /tmp/bionano/runBNG/test_denovo_out/all_8_of_16.bnx -i /tmp/bionano/runBNG/test_denovo_out/all_16_of_16.bnx -o /tmp/bionano/runBNG/test_denovo_out/align/exppairwise100of136 -usecolor 1 -FP 1.5 -FN 0.15 -sd 0.0 -sf 0.2 -sr 0.03 -res 3.3 -T 3.33333e-09 -maxmem 7.5 -minlen 100 -minsites 8 -MaxIntensity 0.6 -usecolor 1 -maxsites 200 -mres 0.9 -usecolor 1 -A 5 -S 1 -MaxSE 0.5 -outlier 0.0001 -outlierMax 40. -endoutlier 0 -RepeatMask 2 0.01 -RepeatRec 0.7 0.6 1.4 -PVres 2 -alignscore -maptype 0 -HSDrange 1.0 -hashoffset 1 -f -hashgen 5 3 2.2 1.2 0.05 3.0 1 1 1 -hash -nosplit 2 -align_format 1 -stdout -stderr -maxthreads 1 -XmapStatRead /biomedja01/disk1/liuxl18/platinum/bionano/runBNG/test_denovo_out/molecule_stats.txt
# CompileDir= /home/users3/tanantharaman/branches/4794/4794.5122/4794.5122 CompileCmd=/opt/gcc-4.9.2TSAN/bin/g++ -fopenmp -Ofast -fno-associative-math -mavx -mfpmath=sse -DUSE_PFLOAT=1 -DUSE_RFLOAT=1 -DUSE_SSE=1 -I/home/users/tanantharaman/amdlibm-3-0-2/include -DREPLACE_WITH_AMDLIBM -DUSE_STATIC -flto -DRELEASE=1 -lrt -L/home/users/tanantharaman/amdlibm-3-0-2/lib/static -lamdlibm -L/opt/gcc-4.9.2TSAN/lib64 -s -static SVNversion=5122 $Header: http://svn.bnm.local:81/svn/Informatics/RefAligner/branches/4794/RefAligner.cpp 4916 2016-05-10 18:12:38Z kbhakta $
# FLAGS: USE_SSE=1 USE_AVX=1 USE_MIC=0 USE_PFLOAT=1 USE_RFLOAT=1 DEBUG=1 VERB=1
WARNING: -RepeatMask with -RepeatRec requires -extend 1 or -extend 2 (using -extend 1)
Reading input maps from /tmp/bionano/runBNG/test_denovo_out/all_8_of_16.bnx ... /tmp/bionano/runBNG/test_denovo_out/all_16_of_16.bnx (2 files)
input_bnx:Failed to read input file /tmp/bionano/runBNG/test_denovo_out/all_8_of_16.bnx (1'th of 2 files)

bless~ Xingliang
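
For anyone hitting the same "Failed to read input file" message, a quick way to confirm that the split step produced nothing is to list which split files are missing or empty. This is only a minimal sketch: the `all_<i>_of_<n>.bnx` naming is taken from the log above, while the 1-based numbering, the function name `missing_split_files`, and its parameters are assumptions made for illustration and are not part of runBNG.

```python
import os


def missing_split_files(out_dir, n_splits, prefix="all"):
    """List split BNX files that are absent or empty in out_dir.

    Assumes the files are named <prefix>_<i>_of_<n_splits>.bnx with i
    running from 1 to n_splits (naming copied from the RefAligner log;
    the numbering range is an assumption).
    """
    missing = []
    for i in range(1, n_splits + 1):
        name = "%s_%d_of_%d.bnx" % (prefix, i, n_splits)
        path = os.path.join(out_dir, name)
        # Treat a zero-byte file as missing: RefAligner cannot read maps from it.
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing
```

Calling `missing_split_files("test_denovo_out", 16)` would return all sixteen names if, as described above, none of the sub-files were generated.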

liu-xingliang commented 5 years ago

Hi @yyx8671

More info.

-rw-r--r-- 1 user1 group1 120K Dec 14 14:09 all_sorted.bnx
-rw-r--r-- 1 user1 group1 119K Dec 14 14:09 all.bnx

bless~ Xingliang

yyx8671 commented 5 years ago

Hi @liuxl18-hku,

From the size, it seems the all.bnx file is small. This file should be a copy of your RawMolecules-I007C-800K.bnx. What's the size of your original bnx file?

Cheers, Andy
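
File size alone can mislead, so it may help to compare molecule counts as well. The sketch below assumes that each molecule record in a BNX file starts with a line whose first tab-separated field is `0`; the function names `count_molecules` and `compare_bnx` are hypothetical helpers, not part of runBNG or RefAligner.

```python
import os


def count_molecules(path):
    """Count molecule records, assuming each begins with a '0<TAB>' line."""
    n = 0
    with open(path) as handle:
        for line in handle:
            if line.startswith("0\t"):
                n += 1
    return n


def compare_bnx(original, copy):
    """Compare the size and molecule count of a BNX file and its copy."""
    original_size = os.path.getsize(original)
    copy_size = os.path.getsize(copy)
    return {
        # None when the original is empty, to avoid dividing by zero.
        "size_ratio": copy_size / float(original_size) if original_size else None,
        "original_molecules": count_molecules(original),
        "copy_molecules": count_molecules(copy),
    }
```

A `size_ratio` far below 1, or a much lower molecule count in the copy, would suggest the copy or an upstream filter dropped most of the data.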

liu-xingliang commented 5 years ago

Hi @yyx8671 ,

FYI.

$ ls -lh ../RawMolecules-I007C-800K.bnx
-rw-r--r-- 1 user1 group1 1.7G Dec 10 13:38 ../RawMolecules-I007C-800K.bnx

The following is the molecule length distribution with SNR >= 6 and min length >= 100 kb (given by IrysView):

[image: mole_len]

I also tried -l 5 and checked all_sorted.stdout; it reports 0 maps:

# hostname=test
# $ cd /tmp/bionano/runBNG; /tmp/software/runBNG/runBNG/tools/RefAligner -f -i /tmp/bionano/runBNG/child_denovo_out/all.bnx -maxthreads 1 -merge -sort-idinc -bnx -o /tmp/bionano/runBNG/child_denovo_out/all_sorted -minlen 5 -minsites 8 -MaxIntensity 0.6 -usecolor 1 -maxsites 200 -mres 0.9 -XmapStatWrite /tmp/bionano/runBNG/child_denovo_out/molecule_stats.txt -stdout -stderr
# CompileDir= /home/users3/tanantharaman/branches/4794/4794.5122/4794.5122 CompileCmd=/opt/gcc-4.9.2TSAN/bin/g++ -fopenmp -Ofast -fno-associative-math -mavx -mfpmath=sse -DUSE_PFLOAT=1 -DUSE_RFLOAT=1 -DUSE_SSE=1 -I/home/users/tanantharaman/amdlibm-3-0-2/include -DREPLACE_WITH_AMDLIBM -DUSE_STATIC -flto -DRELEASE=1 -lrt -L/home/users/tanantharaman/amdlibm-3-0-2/lib/static -lamdlibm -L/opt/gcc-4.9.2TSAN/lib64 -s -static SVNversion=5122 $Header: http://svn.bnm.local:81/svn/Informatics/RefAligner/branches/4794/RefAligner.cpp 4916 2016-05-10 18:12:38Z kbhakta $
# FLAGS: USE_SSE=1 USE_AVX=1 USE_MIC=0 USE_PFLOAT=1 USE_RFLOAT=1 DEBUG=1 VERB=1
Reading input maps from /tmp/bionano/runBNG/child_denovo_out/all.bnx
Detected SNR information (QX11) on line 469 in /tmp/bionano/runBNG/child_denovo_out/all.bnx
Detected Intensity information (QX12) on line 470 in /tmp/bionano/runBNG/child_denovo_out/all.bnx
Finished parsing maps in /tmp/bionano/runBNG/child_denovo_out/all.bnx(fileid=0):nummaps=3583471,time=28.473886(wall time=28.542024 secs)
Reduced number of input maps from 3583471 to 0 due to MinLen=5.000,MaxLen=0.000, MinSites=8,MaxSites=200,MaxIntensity=0.60 in file /tmp/bionano/runBNG/child_denovo_out/all.bnx (1 of 1)
Generating -i input map statistics in /biomedja01/disk1/liuxl18/platinum/bionano/runBNG/child_denovo_out/molecule_stats.txt
Read 0 Maps from /tmp/bionano/runBNG/child_denovo_out/all.bnx : total maps=0, sites=0, length=0.000 kb (avg=-nan kb, label density = -nan /100kb):walltime=32.152462
nummaps=0,CmapStart=-1,CmapEnd=-1,CmapBin=0,CmapNumBins=0
Generating /tmp/bionano/runBNG/child_denovo_out/all_sorted.bnx (with 0 maps)
END of output

I guess the data quality is not that good. I am not an expert in BioNano data. If possible, could I draw on your experience: of the filters MinLen=5.000, MaxLen=0.000, MinSites=8, MaxSites=200, MaxIntensity=0.60, which one can I relax to rescue some maps?

Thank you very much!

bless~ Xingliang
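
One way to see which filter is responsible is to tally, per filter, how many molecules would be rejected. The sketch below is a rough approximation and not RefAligner's actual logic. It assumes the BNX 1.x molecule header layout (tab-separated `0` lines with Length in bp in the third column, AvgIntensity in the fourth, and NumberofLabels in the sixth), and it ignores MaxLen on the assumption that 0 means no upper limit. The function name `tally_filters` and its parameters are hypothetical; the defaults mirror the values in the log above.

```python
def tally_filters(lines, min_len_kb=5.0, min_sites=8, max_sites=200,
                  max_intensity=0.6):
    """Count molecules rejected by each filter (approximate, see assumptions).

    `lines` is any iterable of BNX lines, for example an open file handle.
    A molecule can fail several filters, so the reject counts may overlap.
    """
    counts = {"total": 0, "too_short": 0, "too_few_sites": 0,
              "too_many_sites": 0, "too_bright": 0, "pass": 0}
    lowest, highest = None, None
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        # Only molecule header lines (first field "0") carry these columns.
        if fields[0] != "0" or len(fields) < 6:
            continue
        length_kb = float(fields[2]) / 1000.0   # assumed: Length in bp
        avg_intensity = float(fields[3])        # assumed: AvgIntensity
        n_labels = int(float(fields[5]))        # assumed: NumberofLabels
        counts["total"] += 1
        reasons = []
        if length_kb < min_len_kb:
            reasons.append("too_short")
        if n_labels < min_sites:
            reasons.append("too_few_sites")
        if n_labels > max_sites:
            reasons.append("too_many_sites")
        if avg_intensity > max_intensity:
            reasons.append("too_bright")
        for reason in reasons:
            counts[reason] += 1
        if not reasons:
            counts["pass"] += 1
        lowest = avg_intensity if lowest is None else min(lowest, avg_intensity)
        highest = avg_intensity if highest is None else max(highest, avg_intensity)
    counts["min_intensity_seen"] = lowest
    counts["max_intensity_seen"] = highest
    return counts
```

With a real file this could be run as `with open("all.bnx") as fh: print(tally_filters(fh))`. If `too_bright` is close to `total` and `min_intensity_seen` is already above 0.6, the intensity values in the file are probably on a different scale than the threshold expects, and the backbone intensity filter is the one to relax.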

yyx8671 commented 5 years ago

Hi @liuxl18-hku,

You may try adjusting MaxIntensity.

Cheers, Andy

liu-xingliang commented 5 years ago

Hi @yyx8671 ,

Thank you very much for your information. Will try.

bless~ Xingliang

yuezhao666 commented 5 years ago

Excuse me, may I ask how you solved the problem in the end? I have the same problem. Thanks a lot!

best, Joy

Johnsonzcode commented 1 year ago

> Hi @liuxl18-hku,
>
> From the size, it seems the all.bnx file is small. This file should be a copy of your RawMolecules-I007C-800K.bnx. What's the size of your original bnx file?
>
> Cheers, Andy

What if the size of all.bnx is not the same as the raw bnx file? The new all.bnx is quite small.

liu-xingliang commented 1 year ago

> > Hi @liuxl18-hku, From the size, it seems the all.bnx file is small. This file should be a copy of your RawMolecules-I007C-800K.bnx. What's the size of your original bnx file? Cheers, Andy
>
> What if the size of all.bnx is not the same as the raw bnx file? The new all.bnx is quite small.

It seems that I eventually disabled the filtering on backbone intensity with the option -B 0 for runBNG denovo. As I recall, the data quality was not optimal and we did not fully rely on it.

I would suggest you open a new thread to ask the dev team for their expertise.

Best

Johnsonzcode commented 1 year ago

Thank you for your kindness!