AppliedBioinformatics / runBNG

An easy way to run BioNano genomic analysis
MIT License

error: input Bionano Solve folder is not the original folder #23

Closed olechnwin closed 3 years ago

olechnwin commented 3 years ago

Hello,

I am trying to run de novo assembly with the following command, but I get an error about the script folder.

~/opt/runBNG/runBNG denovo \
        -t      ~/opt/bionano/tools \
        -s      ~/opt/bionano/tools/pipeline/Solve3.6.1_11162020/HybridScaffold/1.0/scripts \
        -b      ~/Data/A673_bionano/EwingSarcoma_November2019/EwingsSarcoma_Solve3.4_pipeline_results/output/all.bnx \
        -T      1 \
        -j      1 \
        -z      3137 \
        -r      ~/Data/A673_bionano/runBNG_rslt/hg19_DLE1_20kb_5labels.cmap \
        -o      ~/Data/A673_bionano/runBNG_rslt

Oops! It seems that the input Bionano Solve folder is not the original folder or not readable. Please check!
You may also download it from http://bnxinstall.com/solve/BionanoSolveInstall.html

I've tried several other locations of the scripts as shown below:

~/opt/bionano/tools/pipeline/1.0/Pipeline/1.0
~/opt/bionano/tools/pipeline/1.0

The content of ~/opt/bionano/tools/pipeline/Solve3.6.1_11162020/HybridScaffold/1.0/scripts is:

ls ~/opt/bionano/tools/pipeline/Solve3.6.1_11162020/HybridScaffold/1.0/scripts

align_final_bng.pl                                      estimate_fasta_stats.pl
align_final_seq_two_passes.pl                           estimate_hybrid_scaffold_input_stats.pl
align_molecules.pl                                      ExportAGP.pl
AssignAlignType.pl                                      ExportAGP_TwoEnzyme.pl
calc_chim_score.pl                                      fa2cmap_multi_color.pl
calc_cmap_stats.pl                                      fa_key_convert.pl
calc_conflicts_cut_stats.pl                             fasta_filter.py
calc_fasta_stats.pl                                     find_used_not_used_bn.pl
calc_hybrid_scaffold_not_scaffolded_seq_fasta_stats.pl  find_used_not_used_ngs.pl
calc_scaffolded_seq_fasta_stats.pl                      MergeNGS_BN.pl
calc_xmap_stats.pl                                      perl5
cut_conflicts.pl                                        test
estimate_cmap_stats.pl

I've also tried the folder that contains hybridScaffold.pl:

ls ~/opt/bionano/tools/pipeline/Solve3.6.1_11162020/HybridScaffold/1.0
hybridScaffold_config.xml       hybridScaffold_DLE1_HiC_config.xml  Makefile  scripts
hybridScaffold_DLE1_config.xml  hybridScaffold.pl 

But I got a similar error:

Please use the original BioNano Solve folder.
You may download it from http://bnxinstall.com/solve/BionanoSolveInstall.html, and give a read permission

Can you please let me know which script folder I should point to? Thank you!

yyx8671 commented 3 years ago

Hi @olechnwin,

It seems you are using the old runBNG. If so, you can use the old Bionano package, which can be downloaded at: http://doi.org/10.5281/zenodo.4661675

For the latest runBNG, you can download the official Bionano Solve at https://bionanogenomics.com/support/software-downloads

olechnwin commented 3 years ago

Hi @yyx8671,

Thanks for your reply. I cloned the runBNG Git repository on 4/1/2021, so I would think it's the new runBNG?

./runBNG

------------------------------------------------------------------------------------------------------
Program:  runBNG
Version:  2.0

Or is there anything different that I should do? I'm currently using Bionano Solve 3.6.1, which I downloaded from the official Bionano Solve site, the same link you posted above.

Edit: never mind. I see that the current version is 2.0.1. I'll try that version. Thank you!

yyx8671 commented 3 years ago

Hi @olechnwin,

According to runBNG v2.0, 'runBNG denovo' should be

------------------------------------------------------------------------
Synopsis: De novo assembly for Bionano single molecules

Usage:    runBNG denovo [options]
------------------------------------------------------------------------
-P <str>  platform used to generate optical maps <irys|saphyr>. Default: saphyr
-H <str>  haplotype based or not <yes|no>. Default: no (if human data, select: yes)
-M <str>  the data from human or not <yes|no>. Default: no
-e <str>  the enzyme is DLE1 or not <yes|no>. Default: yes
-c <str>  cut complex multi-path Regions (>= 140 Kb) <yes|no>. Default: yes
          This could decrease the number of chimeric maps and increase the total number of assembled maps.
          For human genome, it's recommended. For other species try both.
-E <str>  extend and split for maps <yes|no>. Default: yes
-t <str>  full path to RefAligner folder
-s <str>  full path to BioNano Solve folder
-b <str>  the raw molecule map file (e.g. Molecules.bnx)
-r <str>  the digested reference (.cmap). Default: NULL
-a <flt>  label density (*/100kb). Default: NULL
-T <int>  number of threads or CPUs
-l <int>  minimum length to filter out (Kb). Default: irys [150]; saphyr [120] 
-m <int>  minimum labels on the molecule. Default: 8 
-B <flt>  maximum backbone intensity. Default: 0.6 
-j <int>  number of threads for each subjob
-i <int>  times of iteration. Default: 5
-k <int>  skip steps, using previous result. 0:None, 1:ImgDetect, 2:NoiseChar/Subsample, 
          3:Pairwise, 4:Assembly, 5:RefineA, 6:RefineB, 7:merge0, 8+(i-1)*2:Ext(i), 
          9+(i-1)*2:Mrg(i), N+1:alignmol. Default: 0
-z <int>  the genome size of input species (Mb)
-p <flt>  flase positive density (/100Kb). Default: 2.0
-n <flt>  false negative rate (%/100). Default: 0.10 
-d <flt>  scalingSD (Kb^1/2). Default: 0.0 
-f <flt>  siteSD (Kb). Default: irys [0.15]; saphyr [0.12]
-R <flt>  relativeSD. Default: 0.03
-L <int>  large jobs maximum memory (GB). Default: 128
-S <int>  small jobs maximum memory (GB). Default: 8 
-o <str>  full path to the output directory

-h/--help show this message and exit
------------------------------------------------------------------------

In your command line, '-t' should be followed by the full path to the RefAligner folder, and '-s' should be followed by the full path to the Bionano Solve folder ('Solve3.6.1_11162020').

Hope this helps.
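
For anyone who wants to sanity-check the two paths before launching a long job, here is a minimal sketch of a pre-flight check. It is not part of runBNG and does not reproduce runBNG's own validation. The function name `check_solve_layout` is hypothetical, and the layout it tests (`HybridScaffold/1.0/hybridScaffold.pl` under the Solve folder) is only assumed from the directory listings posted earlier in this thread for Solve 3.6.1. Other Solve releases may be organised differently.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; NOT part of runBNG.
# Assumed layout (taken from the listings in this thread, Solve 3.6.1):
#   <Solve>/HybridScaffold/1.0/hybridScaffold.pl
#   <Solve>/RefAligner/1.0

check_solve_layout() {
    local solve_dir="$1"   # value intended for -s (top-level Solve folder)
    local ref_dir="$2"     # value intended for -t (RefAligner folder)
    local status=0

    if [ ! -d "$solve_dir" ] || [ ! -r "$solve_dir" ]; then
        echo "Solve folder missing or unreadable: $solve_dir" >&2
        status=1
    elif [ ! -f "$solve_dir/HybridScaffold/1.0/hybridScaffold.pl" ]; then
        # A sub-folder such as .../HybridScaffold/1.0/scripts ends up here.
        echo "Does not look like the top-level Solve folder: $solve_dir" >&2
        status=1
    fi

    if [ ! -d "$ref_dir" ]; then
        echo "RefAligner folder not found: $ref_dir" >&2
        status=1
    fi

    return "$status"
}

# Example use (paths are the ones from this thread):
#   SOLVE=~/opt/bionano/tools/pipeline/Solve3.6.1_11162020
#   check_solve_layout "$SOLVE" "$SOLVE/RefAligner/1.0" && echo "paths look plausible"
```

The check reports every problem it finds rather than stopping at the first one, so a single run shows whether '-s', '-t', or both need attention.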

olechnwin commented 3 years ago

Hi @yyx8671,

Thank you for posting the help output above. Absolutely my fault. I was so used to running badly documented programs that I didn't even try the help command. I just blindly followed the usage shown on the GitHub page without realizing that it differs from the current help text. That help output is clear.

I have to say your documentation is great. The error messages and checks that are performed are very helpful. It's way better than most programs I've used.

There is just a minor bug (I think). When I run this command:

./runBNG denovo \
        -t      ~/opt/bionano/tools/pipeline/Solve3.6.1_11162020/RefAligner/1.0 \
        -s      ~/opt/bionano/tools/pipeline/Solve3.6.1_11162020 \
        -b      ~/Data/A673_bionano/EwingSarcoma_November2019/EwingsSarcoma_Solve3.4_pipeline_results/output/all.bnx \
        -M      yes \
        -T      1 \
        -j      1 \
        -z      3137 \
        -r      ~/Data/A673_bionano/runBNG_rslt/hg19_DLE1_20kb_5labels.cmap \
        -o      ~/Data/A673_bionano/runBNG_rslt

I got this error message: Oops! Please specify the label density (-B) for your reference!

Label density is -a, not -B.

yyx8671 commented 3 years ago

Thanks @olechnwin,

The bug has been fixed in version 2.0.1. You may update runBNG to the latest version.
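
To illustrate the kind of check involved, here is a minimal sketch of option parsing in which the error message names the same flag that the parser reads. This is not runBNG's source code. The function name `parse_denovo_args` and the message wording are hypothetical. Only the option letters (-r, -a, -B) and the default of 0.6 for -B come from the help text above, and the rule that a reference requires a label density is inferred from the error reported in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; NOT runBNG's actual code.
# Option letters follow the help text in this thread:
#   -r digested reference (.cmap), -a label density, -B maximum backbone intensity

parse_denovo_args() {
    local opt reference="" label_density="" backbone="0.6"
    OPTIND=1   # reset so the function can be called more than once
    while getopts "r:a:B:" opt; do
        case "$opt" in
            r) reference="$OPTARG" ;;
            a) label_density="$OPTARG" ;;
            B) backbone="$OPTARG" ;;
            *) return 2 ;;
        esac
    done

    # Assumed rule: a reference needs a label density; the message names -a,
    # the flag that actually sets it.
    if [ -n "$reference" ] && [ -z "$label_density" ]; then
        echo "Please specify the label density (-a) for your reference!" >&2
        return 1
    fi

    echo "reference=$reference label_density=$label_density backbone=$backbone"
}
```

Keeping the flag letter in the message next to the `case` branch that reads it makes this class of mismatch easier to spot in review.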

zhoudreames commented 3 years ago

@olechnwin did you fix the bug? I have the same problem. Could you share the command you ran? Thanks.

olechnwin commented 3 years ago

@zhoudreames, you probably want to tag @yyx8671, not me. But if you are referring to the error "input Bionano Solve folder is not the original folder" and you are running runBNG version 2.0, just follow the 'runBNG denovo' help output that @yyx8671 posted earlier in this thread: '-t' should be followed by the full path to the RefAligner folder, and '-s' should be followed by the full path to the Bionano Solve folder ('Solve3.6.1_11162020').

If you're referring to the label density error, you just have to add -a; in my case I'm using 9. This is the command that I ran:

./runBNG denovo \
        -t      ~/opt/bionano/tools/pipeline/Solve3.6.1_11162020/RefAligner/1.0 \
        -s      ~/opt/bionano/tools/pipeline/Solve3.6.1_11162020 \
        -b      ~/Data/A673_bionano/EwingSarcoma_November2019/EwingsSarcoma_Solve3.4_pipeline_results/output/all.bnx \
        -M      yes \
        -T      1 \
        -a      9 \
        -j      1 \
        -z      3137 \
        -r      ~/Data/A673_bionano/runBNG_rslt/hg19_DLE1_20kb_5labels.cmap \
        -o      ~/Data/A673_bionano/runBNG_rslt