AppliedBioinformatics / runBNG

An easy way to run BioNano genomic analysis
MIT License

ERROR: Invalid xml file /root/runBNG-master/Examples/optArguments_new.xml: StartTag: invalid element name, line 115, column 10 #26

Closed: Missandei-hcl closed this issue 3 years ago

Missandei-hcl commented 3 years ago

Hello! I ran "./runBNG denovo -P irys -H no -M no -e no -c yes -E yes -t /home/len/tools/pipeline/Solve3.6.1_11162020/RefAligner/1.0 -s /home/len/tools/pipeline/Solve3.6.1_11162020 -b /root/Examples/Molecules.bnx -r /root/Examples/Test_BSPQI_20kb_5labels.cmap -T 2 -l 150 -m 8 -B 0.6 -j 1 -i 5 -z 5 -o /root/runBNG-master/Examples" and got the error shown below. I have been stuck on this problem for a long time and do not know how to solve it. I hope you can help. Thank you very much!

./runBNG denovo -P irys -H no -M no -e no -c yes -E yes -t /home/len/tools/pipeline/Solve3.6.1_11162020/RefAligner/1.0 -s /home/len/tools/pipeline/Solve3.6.1_11162020 -b /root/Examples/Molecules.bnx -r /root/Examples/Test_BSPQI_20kb_5labels.cmap -T 2 -l 150 -m 8 -B 0.6 -j 1 -i 5 -z 5 -o /root/runBNG-master/Examples
========================================== De novo assembly starts =============================================

Start date: Fri Apr  9 15:06:31 CST 2021
The data generation platform is: irys
The bnx file is: /root/Examples/Molecules.bnx
The digested reference is: /root/Examples/Test_BSPQI_20kb_5labels.cmap
The minimum molecule length is (Kb): 150
The minimum label on a molecule is: 8
Maximum backbone intensity is: 0.6
The path to Bionano Solve folder is: /home/len/tools/pipeline/Solve3.6.1_11162020
The path to Bionano RefAligner folder is: /home/len/tools/pipeline/Solve3.6.1_11162020/RefAligner/1.0
The number of threads is: 2
Large jobs maximum memory (GB) is: 128
Small jobs maximum memory (GB) is: 8
The number of threads for each subjob is: 1
The number of iterations is: 5
False Positive Density (/100Kb) [FP]: 2.0
False Negative Rate (%/100) [FN]: 0.10
ScalingSD (Kb^1/2) [sd]: 0.0
SiteSD (Kb) [sf]: 0.15
RelativeSD [sr]: 0.03
The genome size (Mb) is: 5
The xml file is: /home/len/tools/pipeline/Solve3.6.1_11162020/RefAligner/1.0/optArguments_nonhaplotype_irys.xml
The output directory is: /root/runBNG-master/Examples
  Prerun Tests:
    1 ERRORS
    0 WARNINGS

Disabling autoNoise during rough Assembly

ERROR: Invalid xml file /root/runBNG-master/Examples/optArguments_new.xml: StartTag: invalid element name, line 115, column 10

yyx8671 commented 3 years ago

It seems there is something wrong with the optArguments_nonhaplotype_irys.xml file that Bionano Genomics ships with Solve v3.6 and v3.6.1. You can use Bionano Solve v3.5.1 to complete your task.

Here are my tests.

Bionano Solve v3.6.1

$ python ~/Tools/bionano/tools/pipeline/Solve3.6.1_11162020/Pipeline/1.0/pipelineCL.py \
-R -w -d -U -T 4 -j 4 -je 4 -jp 4 -J 4 -TJ 4 -Te 4 -Tp 4 -i 5 -B 0 \
-t ~/Tools/bionano/tools/pipeline/Solve3.6.1_11162020/RefAligner/1.0 \
-a ~/Tools/bionano/tools/pipeline/Solve3.6.1_11162020/RefAligner/1.0/optArguments_nonhaplotype_irys.xml \
-l denovo -b ~/Tools/runBNG/Examples/Molecules.bnx -r fa2cmap/Test_BSPQI_20kb_5labels.cmap
  Prerun Tests:
    1 ERRORS
    0 WARNINGS

Disabling autoNoise during rough Assembly

  ERROR: Invalid xml file /home/yyuan/Tools/bionano/tools/pipeline/Solve3.6.1_11162020/RefAligner/1.0/optArguments_nonhaplotype_irys.xml: StartTag: invalid element name, line 115, column 10 (line 115)

Pipeline Version: $Id: Multithreading.py 11646 2020-09-25 16:50:59Z Elam $

  EXITING: See errors

Bionano Solve v3.6

$ python ~/Tools/bionano/tools/pipeline/Solve3.6_09252020/Pipeline/1.0/pipelineCL.py \
 -R -w -d -U -T 4 -j 4 -je 4 -jp 4 -J 4 -TJ 4 -Te 4 -Tp 4 -i 5 -B 0 \
-t ~/Tools/bionano/tools/pipeline/Solve3.6.1_11162020/RefAligner/1.0 \
-a ~/Tools/bionano/tools/pipeline/Solve3.6_09252020/RefAligner/1.0/optArguments_nonhaplotype_irys.xml \
-l denovo -b ~/Tools/runBNG/Examples/Molecules.bnx -r fa2cmap/Test_BSPQI_20kb_5labels.cmap
  Prerun Tests:
    1 ERRORS
    0 WARNINGS

Disabling autoNoise during rough Assembly

  ERROR: Invalid xml file /home/yyuan/Tools/bionano/tools/pipeline/Solve3.6_09252020/RefAligner/1.0/optArguments_nonhaplotype_irys.xml: StartTag: invalid element name, line 115, column 10 (line 115)

Pipeline Version: $Id: Multithreading.py 11646 2020-09-25 16:50:59Z Elam $

  EXITING: See errors

Bionano Solve v3.5.1

$ python ~/Tools/bionano/Solve3.5.1_01142020/Pipeline/1.0/pipelineCL.py \
-R -w -d -U -T 4 -j 4 -je 4 -jp 4 -J 4 -TJ 4 -Te 4 -Tp 4 -i 5 -B 0 \
-t ~/Tools/bionano/tools/pipeline/Solve3.6.1_11162020/RefAligner/1.0 \
-a ~/Tools/bionano/Solve3.5.1_01142020/RefAligner/1.0/optArguments_nonhaplotype_irys.xml \
-l denovo -b ~/Tools/runBNG/Examples/Molecules.bnx -r fa2cmap/Test_BSPQI_20kb_5labels.cmap
  Prerun Tests:
    0 ERRORS
    0 WARNINGS

Tools Version: N/A
Solve Version: N/A
Pipeline Version: 10322
RefAligner Version: 11643

Assembly ID: 581

  Pipeline start time: Fri Apr  9 16:10:13 2021

checkScanScaling: autoNoise= False
Not performing autoNoise (see -y): setting doScanScale = False, hence not using _rescaled.bnx for BNX splits
Executing stage number 1 : AutoNoise + SplitBNX

Molecule Stats (/home/yyuan/Tools/runBNG/Examples/denovo/all.bnx):
Total number of molecules:   7326
Total length (Mbp)       :   1595.095
Average length (kbp)     :    217.731
Molecule N50 (kbp)       :    236.401
Label density (/100kb)   :     11.015
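
The XML error above can be reproduced outside the pipeline by parsing the file directly. The following is a minimal sketch using only the Python standard library; `check_xml` is a hypothetical helper name, not part of runBNG or Bionano Solve. The standard-library parser words its messages differently from the pipeline's "StartTag: invalid element name", but it reports the line and column of the first problem in the same way.

```python
import xml.etree.ElementTree as ET


def check_xml(path):
    """Return None if the file is well-formed XML.

    Otherwise return a (line, column, message) tuple describing the
    first parse error. `check_xml` is a hypothetical helper name used
    only for this illustration.
    """
    try:
        ET.parse(path)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple.
        line, column = err.position
        return line, column, str(err)
    return None
```

Calling `check_xml` on the optArguments_nonhaplotype_irys.xml file from each Solve version should show whether the file itself is malformed before any assembly is started.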

Missandei-hcl commented 3 years ago

Hi, Bionano's official website only offers Bionano Solve v3.6.1. Could you share a copy of Bionano Solve v3.5.1 with me? I look forward to your reply. Thank you very much!

yyx8671 commented 3 years ago

Hi @Missandei-hcl,

The file is large (> 2 GB), so I'm afraid it is not easy to share. If you have DLE1 data, you can test with the latest runBNG and the latest Bionano Solve. If you only have Irys data, you can test with runBNG v1.3 and the older Bionano packages.

xiaoquexingchen commented 3 years ago

Hello! I ran into the same problem with the test data provided on GitHub: "ERROR: Invalid xml file /root/runBNG-master/Examples/optArguments_new.xml: StartTag: invalid element name, line 115, column 10". I have now switched to the YH genome data published on the official Bionano website for testing. Could you tell me whether my settings are correct? I ran "python /home/len/tools/pipeline/Solve3.6.1_11162020/Pipeline/1.0/pipelineCL.py -R -w -d -U -T 2 -j 1 -t /home/len/tools/pipeline/Solve3.6.1_11162020/RefAligner/1.0 -l denovo -b /home/len/YH_71x_150kb/output/all.bnx -r /home/len/YH_71x_150kb/output/ref/hg19_chromosome_bspq_res20.cmap -a /home/len/tools/pipeline/Solve3.6.1_11162020/RefAligner/1.0/optArguments_nonhaplotype_saphyr_human.xml".

Part of the output is as follows:

  Prerun Tests:
    0 ERRORS
    0 WARNINGS

Hostname: localhost.localdomain

Tools Version: N/A

Solve Version: N/A

Pipeline Version: 11646

RefAligner Version: 11643

Assembly ID: 358

Pipeline start time: Sat Apr 10 11:10:09 2021

('checkScanScaling: autoNoise=', False)
Not performing autoNoise (see -y): setting doScanScale = False, hence not using _rescaled.bnx for BNX splits

Executing stage number 1 : AutoNoise + SplitBNX

Molecule Stats (/home/len/denovo/all.bnx):
Total number of molecules: 932855
Total length (Mbp)       : 223457.459
Average length (kbp)     :    239.541
Molecule N50 (kbp)       :    238.861
Label density (/100kb)   :     12.408

Sorting /home/len/denovo/all.bnx into /home/len/denovo/all_sorted

Starting Multi-Threaded Process: SortBNX
Running 1 jobs with 2 threads, sleepTime=0.01
START 1: SortBNX, 48 Thr, 1 R, 1 T, 0 F, 0 Q
STOP 1: SortBNX, 48 Thr, 0 R, 1 T, 1 F, 0 Q TotalTime= 0h 0.59m RunTime= 0h 0.60m CPUload=95% host=NA
Finished Multi-Threaded Process: SortBNX

calculateNPairwise: Sorted and filtered file has 922035 molecules, 28577886.0 sites, 221085299360.6 bp length total. blocks= 5 -> 6 (N=2)

Increasing number of blocks from 2 to 6

Splitting BNX

Splitting bnx file: /home/len/denovo/all_sorted.bnx

Starting Multi-Threaded Process: SplitBNX
Running 1 jobs with 2 threads, sleepTime=0.01
START 1: SplitBNX, 48 Thr, 1 R, 1 T, 0 F, 0 Q
STOP 1: SplitBNX, 48 Thr, 0 R, 1 T, 1 F, 0 Q TotalTime= 0h 0.54m RunTime= 0h 0.55m CPUload=97% host=NA
Finished Multi-Threaded Process: SplitBNX

Expecting: /home/len/denovo/all_1_of_6.bnx, /home/len/denovo/all_2_of_6.bnx, /home/len/denovo/all_3_of_6.bnx, /home/len/denovo/all_4_of_6.bnx, /home/len/denovo/all_5_of_6.bnx, /home/len/denovo/all_6_of_6.bnx

Executing stage number 2 : Pairwise + Alignmolvref

Starting Multi-Threaded Process: Pairwise
Running 21 jobs with 2 threads, sleepTime=0.01
START 1: Pairwise 1 of 21, 2 Thr, 1 R, 21 T, 0 F, 20 Q
STOP 1: Pairwise 1 of 21, 2 Thr, 0 R, 21 T, 1 F, 20 Q TotalTime= 0h 10.45m RunTime= 0h 10.50m CPUload=112% host=NA
START 2: Pairwise 2 of 21, 2 Thr, 1 R, 21 T, 1 F, 19 Q
STOP 2: Pairwise 2 of 21, 2 Thr, 0 R, 21 T, 2 F, 19 Q TotalTime= 0h 13.49m RunTime= 0h 13.54m CPUload=112% host=NA
START 3: Pairwise 3 of 21, 2 Thr, 1 R, 21 T, 2 F, 18 Q
Pairwise: jobName= Pairwise 4 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 5 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 6 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 7 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 8 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 9 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 10 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 11 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 12 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 13 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 14 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 15 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 16 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 17 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 18 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 19 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 20 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 21 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: waited 1800.00 seconds for job completion: UnsubmittedJobs= 18, ActiveJobs= 1, FinishedJobs= 2
STOP 3: Pairwise 3 of 21, 2 Thr, 0 R, 21 T, 3 F, 18 Q TotalTime= 0h 13.44m RunTime= 0h 13.51m CPUload=111% host=NA
START 4: Pairwise 4 of 21, 2 Thr, 1 R, 21 T, 3 F, 17 Q
STOP 4: Pairwise 4 of 21, 2 Thr, 0 R, 21 T, 4 F, 17 Q TotalTime= 0h 13.19m RunTime= 0h 13.20m CPUload=112% host=NA
START 5: Pairwise 5 of 21, 2 Thr, 1 R, 21 T, 4 F, 16 Q
Pairwise: jobName= Pairwise 6 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 7 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 8 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 9 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 10 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 11 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 12 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 13 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 14 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 15 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 16 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 17 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 18 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 19 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 20 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: jobName= Pairwise 21 of 21: not on cluster: availableThreads= 0, sJob.maxThreads= 2/2 (skipping)
Pairwise: waited 3600.00 seconds for job completion: UnsubmittedJobs= 16, ActiveJobs= 1, FinishedJobs= 4
STOP 5: Pairwise 5 of 21, 2 Thr, 0 R, 21 T, 5 F, 16 Q TotalTime= 0h 13.01m RunTime= 0h 13.07m CPUload=113% host=NA
START 6: Pairwise 6 of 21, 2 Thr, 1 R, 21 T, 5 F, 15 Q
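
The log reports 6 BNX blocks and 21 Pairwise jobs. That count matches one job per unordered pair of blocks, with each block also paired with itself (6 × 7 / 2 = 21). This is an inference from the numbers in the log, not something the pipeline states, and `pairwise_jobs` below is a hypothetical helper written only to show the arithmetic.

```python
from itertools import combinations_with_replacement


def pairwise_jobs(n_blocks):
    """List the (i, j) block pairs with i <= j for blocks numbered 1..n_blocks.

    Assumption: the pipeline schedules one Pairwise job per such pair.
    This is inferred from the job count in the log, not confirmed.
    """
    blocks = range(1, n_blocks + 1)
    return list(combinations_with_replacement(blocks, 2))


jobs = pairwise_jobs(6)
print(len(jobs))  # number of block pairs for 6 blocks
```

If the inference holds, raising the block count increases the number of Pairwise jobs roughly quadratically, which is worth keeping in mind when only 2 threads are available.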

xiaoquexingchen commented 3 years ago

Hi @yyx8671, I ran "python /home/len/tools/pipeline/Solve3.6.1_11162020/Pipeline/1.0/pipelineCL.py -R -w -d -U -T 2 -j 1 -t /home/len/tools/pipeline/Solve3.6.1_11162020/RefAligner/1.0 -l denovo -b /home/len/YH_71x_150kb/output/all.bnx -r /home/len/YH_71x_150kb/output/ref/hg19_chromosome_bspq_res20.cmap -a /home/len/tools/pipeline/Solve3.6.1_11162020/RefAligner/1.0/optArguments_nonhaplotype_saphyr_human.xml". The end of the output is as follows:

"Finished Multi-Threaded Process: Cmap_Merge_exp_unrefined

ERROR: Contig count 0 <= Assembly minimum 0, exiting

Warning/Error messages:
('warning','Missing end marker in \"/home/len/denovo/contigs/exp_unrefined/exp_unrefined.stdout\" (found \"reshold=10.00\n\" while expecting \"END of output\")\n')
('error','job has not completed, see stdout=\"/home/len/denovo/contigs/exp_unrefined/exp_unrefined.stdout\"')
('critical','stage Assembly did not produce minimum number of contigs')

Warning/Error summary: 1 warning(s) 1 critical(s) 1 error(s)

WARNING: missing xmap file: /home/len/denovo/contigs/exp_refineFinal1_sv/merged_smaps/exp_refineFinal1_merged.xmap

Pipeline end time: Sat Apr 10 15:31:42 2021
Elapsed time: 261.56m; 4.36h; 0.18d

Pipeline has failed "

Why does this happen? I hope you can help.
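
The warning in the output says the pipeline expected the text "END of output" in exp_unrefined.stdout but found a truncated line instead, so that stdout file is the first place to look. Here is a minimal sketch for checking whether a job's stdout file contains the marker near its end. `has_end_marker` is a hypothetical helper, and looking only at the last few kilobytes is an assumption about where the marker appears.

```python
def has_end_marker(path, marker="END of output", tail_bytes=4096):
    """Return True if `marker` occurs in the last `tail_bytes` bytes of the file.

    The marker text is taken from the pipeline warning quoted in this
    thread; the tail size is an arbitrary choice for this illustration.
    """
    with open(path, "rb") as fh:
        fh.seek(0, 2)                        # jump to the end of the file
        size = fh.tell()
        fh.seek(max(0, size - tail_bytes))   # step back by at most tail_bytes
        tail = fh.read().decode("utf-8", errors="replace")
    return marker in tail
```

A result of False for /home/len/denovo/contigs/exp_unrefined/exp_unrefined.stdout would agree with the "job has not completed" error; the last lines of that file may then show why the job stopped.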

yyx8671 commented 3 years ago

Hi @xiaoquexingchen,

Please check whether your data was generated on the Irys platform or the Saphyr platform; this determines which configuration file should be used.
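
As an illustration, the choice can be written as a lookup from platform to configuration file. The two file names are the ones that appear in this thread; the mapping itself and the name `opt_arguments_path` are assumptions made for this sketch, not runBNG's or Bionano Solve's documented logic, and other variants of these files may be more appropriate for a given dataset.

```python
import os

# File names as they appear in this thread; the mapping is an assumption.
OPT_ARGUMENTS = {
    "irys": "optArguments_nonhaplotype_irys.xml",
    "saphyr": "optArguments_nonhaplotype_saphyr_human.xml",
}


def opt_arguments_path(refaligner_dir, platform):
    """Build the path to the optArguments XML for the given platform.

    Raises ValueError for a platform that is not in OPT_ARGUMENTS.
    """
    key = platform.strip().lower()
    if key not in OPT_ARGUMENTS:
        raise ValueError("unknown platform: %r" % platform)
    return os.path.join(refaligner_dir, OPT_ARGUMENTS[key])
```

The returned path is what would be passed to pipelineCL.py with -a, as in the commands earlier in this thread.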

Missandei-hcl commented 3 years ago

Hi, since we do not have our own data yet, I would like to test with the example data provided with runBNG first, but I really cannot find Bionano Solve v3.5.1. Could you share it via Baidu Netdisk? I hope you can help. Thank you very much!

yyx8671 commented 3 years ago

Hi @Missandei-hcl,

Recently, three users have been raising issues on runBNG frequently, and I suppose you are from the same lab. If so, please discuss internally before posting questions. If you are not from the same lab, you can look for some DLE1 data to test with; this file (ftp://ftp.ncbi.nlm.nih.gov/pub/supplementary_data/bionanomaps.csv) lists some datasets.

As Bionano Genomics regularly updates its analysis packages, I recommend using the latest one; otherwise a reviewer of your publication may ask why you did not use a newer version.