Closed BachBioinformatics closed 1 year ago
Hello,
That error means that CGView has detected a sequence feature in the input that has a start position that is larger than the length of the sequence being drawn.
I would need to see the input files and full command to identify the feature causing the error.
Paul
On Sun, Oct 16, 2022 at 10:15 PM BachBioinformatics < @.***> wrote:
Trying to do redraw_maps.sh -p myproject -f svg or build_blast_atlas.sh -p myproject -m 48g but i am getting the same following error :
org.xml.sax.SAXException: value for 'start' attribute in featureRange element must be less than or equal to the length of the plasmid in null at line 52 column 48
    at ca.ualberta.stothard.cgview.CgviewFactory.handleFeatureRange(CgviewFactory.java:3570)
    at ca.ualberta.stothard.cgview.CgviewFactory.startElement(CgviewFactory.java:669)
    at org.apache.xerces.parsers.AbstractSAXParser.startElement(AbstractSAXParser.java:497)
    at org.apache.xerces.parsers.AbstractXMLDocumentParser.emptyElement(AbstractXMLDocumentParser.java:180)
    at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:275)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1654)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:324)
    at org.apache.xerces.parsers.XML11Configuration.parse(XML11Configuration.java:845)
    at org.apache.xerces.parsers.XML11Configuration.parse(XML11Configuration.java:768)
    at org.apache.xerces.parsers.XMLParser.parse(XMLParser.java:108)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1201)
    at ca.ualberta.stothard.cgview.CgviewFactory.createCgviewFromFile(CgviewFactory.java:445)
    at ca.ualberta.stothard.cgview.CgviewIO.main(CgviewIO.java:1474)
The following error occurred: org.xml.sax.SAXException: value for 'start' attribute in featureRange element must be less than or equal to the length of the plasmid in null at line 52 column 48
Any tips?
Many thanks
— Reply to this email directly, view it on GitHub https://github.com/paulstothard/cgview_comparison_tool/issues/10, or unsubscribe https://github.com/notifications/unsubscribe-auth/AL64CMYJBW6FILDOYJIEQVDWDTHFLANCNFSM6AAAAAARGVM2SY . You are receiving this because you are subscribed to this thread.Message ID: @.***>
-- Dr. Paul Stothard - Professor Department of Agricultural, Food & Nutritional Science (AFNS) University of Alberta Edmonton, Alberta T6G 2C8 Canada
https://sites.ualberta.ca/~stothard/ office: 2-31 General Services Bldg phone: 1.780.492.5242 mobile: 1.780.297.5242
Hello Paul, thanks a lot for your reply.
I am familiar with the GView web server (https://server.gview.ca/), but I could not use it for the core analysis because my .gbk files are huge and uploading them to the server would be very slow.
In fact, I have a set of contigs from a clinical sample, assembled with SPAdes and annotated with Prokka. The GenBank file produced from the Prokka GFF annotations is relatively big (>15 GB). This file was copied into the comparison_genomes subfolder.
The reference was built from SRA data with the same tools (SPAdes -> Prokka -> GFF annotations converted to GenBank format). This file was automatically used as input by the CGView Comparison Tool and copied to the reference_genome subfolder of the created project.
So I installed the CGView Comparison Tool locally on my server.
Step 1: create my project folder using a reference input in GenBank format
build_blast_atlas.sh -i Refspades_contigs.gbk
cp myclinicalsample_contigs.gbk Refspades_contigs/comparison_genomes
cp myclinicalsample_contigs.fasta Refspades_contigs/comparison_genomes
Step 2: I got the error mentioned in the first comment after running the following command:
build_blast_atlas.sh -p spades_contigs -m 48g
It failed after running this:
java -Djava.awt.headless=true -jar -Xmx48g -jar /opt/software/cgview_comparison_tool/bin/cgview/cgview.jar -i 'Refspades_contigs/cct_projects/dna_vs_dna/maps/cgview_xml/dna_vs_dna_large.xml' -f png -o 'Refspades_contigs/cct_projects/dna_vs_dna/maps/dna_vs_dna_large.png' -h 'Refspades_contigs/cct_projects/dna_vs_dna/maps/dna_vs_dna_large.html' -p 'dna_vs_dna_large.png'
The file Refspades_contigs/cct_projects/dna_vs_dna/maps/cgview_xml/dna_vs_dna_large.xml is larger than 8 GB.
On line 52, I can see something like this:
<feature color="rgb(0,0,153)" decoration="clockwise-arrow" opacity="0.5" label="CAKDEMEL_00083" mouseover="CAKDEMEL_00083; 9711309 to 9712295; hypothetical protein" >
<featureRange start="9711309" stop="9712295" />
but I cannot see the length attribute. I can send you a compressed version of the dna_vs_dna_large.xml file, though I am not sure I can upload it here.
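For an XML file too large to open in an editor, a streaming check can locate the declared length and the offending features. The following is only a sketch: it assumes the map length is stored in a `sequenceLength` attribute on the root `<cgview>` element and that each `<featureRange>` sits inside a `<feature>` carrying a `label`, as in the excerpt above. The function name `find_out_of_range_features` is made up for illustration and is not part of CGView.

```python
import io
import xml.etree.ElementTree as ET


def find_out_of_range_features(xml_source, max_report=10):
    """Stream a CGView XML file and collect featureRange elements whose
    start or stop is larger than the declared sequence length.

    Returns (sequence_length, [(label, start, stop), ...]).
    """
    seq_len = None
    offenders = []
    stack = []  # currently open elements, outermost first
    for event, elem in ET.iterparse(xml_source, events=("start", "end")):
        if event == "start":
            stack.append(elem)
            if elem.tag == "cgview":
                # Assumption: the map length lives on the root element.
                seq_len = int(elem.attrib["sequenceLength"])
            continue
        # "end" event: the element is complete, so take it off the stack.
        stack.pop()
        parent = stack[-1] if stack else None
        if elem.tag == "featureRange" and seq_len is not None:
            start = int(elem.attrib["start"])
            stop = int(elem.attrib["stop"])
            if start > seq_len or stop > seq_len:
                label = parent.get("label") if parent is not None else None
                offenders.append((label, start, stop))
                if len(offenders) >= max_report:
                    break
        # Detach the finished element so memory stays flat on huge files.
        if parent is not None:
            parent.remove(elem)
    return seq_len, offenders


if __name__ == "__main__":
    # Tiny in-memory stand-in for a real file path such as dna_vs_dna_large.xml.
    sample = io.BytesIO(
        b'<cgview sequenceLength="1000"><featureSlot strand="direct">'
        b'<feature label="ok"><featureRange start="10" stop="20"/></feature>'
        b'<feature label="too_far"><featureRange start="1500" stop="1600"/></feature>'
        b'</featureSlot></cgview>'
    )
    print(find_out_of_range_features(sample))
```

Passing the path to the real XML file instead of the in-memory sample would report the first few features that lie beyond the declared length, which is the condition the SAXException describes.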
Hello,
The error would be triggered by the contents of Refspades_contigs.gbk. How big is that file?
If you send it I could try to look for the issue.
Paul
On Mon, Oct 17, 2022 at 12:32 PM @@. @.> wrote:
Hello, thanks for your reply.
Shall I edit the XML file?
In fact, I have a set of contigs from a clinical sample, assembled with SPAdes and annotated with Prokka. The GenBank file produced from the Prokka GFF annotations is relatively big (>15 GB).
The reference was built from SRA data with the same tools (SPAdes -> Prokka -> annotations in GenBank format). I could not use the GView server (https://server.gview.ca/) directly for the core analysis, as I usually do, because the .gbk files here are huge and uploading them to the server would be very slow.
So I installed the CGView Comparison Tool locally on my server.
Step 1: create my project folder from a reference GenBank input
build_blast_atlas.sh -i Refspades_contigs.gbk
cp myclinicalsample_contigs.gbk Refspades_contigs/comparison_genomes
Step 2: I got the error mentioned in the first comment after running the following command:
build_blast_atlas.sh -p spades_contigs -m 48g
java -Djava.awt.headless=true -jar -Xmx48g -jar /opt/software/cgview_comparison_tool/bin/cgview/cgview.jar -i 'Refspades_contigs/cct_projects/dna_vs_dna/maps/cgview_xml/dna_vs_dna_large.xml' -f png -o 'Refspades_contigs/cct_projects/dna_vs_dna/maps/dna_vs_dna_large.png' -h 'Refspades_contigs/cct_projects/dna_vs_dna/maps/dna_vs_dna_large.html' -p 'dna_vs_dna_large.png'
Hello again,
Refspades_contigs.gbk is 15G
Hi,
Based on the sizes of the files I suspect that the total length of the contigs in the reference file will exceed the limits of the program (when I wrote it, next generation sequencing didn't exist).
The program is designed for a reference genome with a total length of less than 10 megabases. It is typically used to compare an assembled bacterial genome (the "reference") to other bacterial genomes (the "comparison" genomes).
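A quick way to compare a reference against this guideline is to total the contig lengths before building a project. The sketch below assumes a FASTA version of the reference is available; the helper name `total_fasta_length`, the constant `LIMIT_BASES`, and the file name in the usage note are illustrative and not part of the CGView Comparison Tool.

```python
LIMIT_BASES = 10_000_000  # the roughly 10 megabase guideline mentioned above


def total_fasta_length(lines):
    """Return (number_of_records, total_bases) for FASTA-formatted lines."""
    n_records = 0
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            n_records += 1  # header line starts a new contig
        else:
            total += len(line)  # sequence line contributes its bases
    return n_records, total


if __name__ == "__main__":
    # In-memory example; a real run would iterate over an open file instead.
    demo = [">contig_1", "ACGTACGT", "ACGT", ">contig_2", "GGCC"]
    n_records, total = total_fasta_length(demo)
    print(n_records, total, total > LIMIT_BASES)
```

For a real file, something like `with open("Refspades_contigs.fasta") as fh: total_fasta_length(fh)` would stream the file line by line (the file name here is hypothetical), so even a very large assembly can be measured without loading it into memory.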
Based on the file sizes, it sounds like this sample may consist of multiple bacterial species and other DNA sources.
If so I would use a metagenomics assembly pipeline like this that can recover individual genomes (maybe this is similar to what you are doing):
Paul
Thank you, that makes sense. It is better to do this with a specific MAG of around 10 Mb.