Open gillianh1 opened 2 years ago
Hi,
Please attach the log files to better understand the problem. Logs are available in the menu Help -> Logs
The file was created successfully using 2.6.0, but we were unable to open it using 2.6. We have since been able to connect to the same database and user using version 2.6.1, create a new extract file, and open the 92MB file. However, we are still unable to load the original file created with version 2.6 in the 2.6.1 desktop exe. We are able to open the new file created in version 2.6.1 using the 2.6 desktop exe. I will upload the log.
This seems to be the issue: a non-hex character in the input.
2022-09-16 11:47:58,262 [http-nio-auto-1-exec-7] ERROR o.a.solr.handler.RequestHandlerBase - org.apache.solr.common.SolrException: org.apache.solr.search.SyntaxError: Non-hex character in Unicode escape sequence: o
org.apache.solr.common.SolrException: org.apache.solr.search.SyntaxError: Non-hex character in Unicode escape sequence: o
at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:212)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:333)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2637)
at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:227)
at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:214)
at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1003)
at com.databasepreservation.common.server.index.utils.SolrUtils.find(SolrUtils.java:155)
at com.databasepreservation.common.server.index.DatabaseRowsSolrManager.find(DatabaseRowsSolrManager.java:178)
at com.databasepreservation.common.api.v1.DatabaseResource.getViewerDatabaseIndexResult(DatabaseResource.java:97)
at com.databasepreservation.common.api.v1.DatabaseResource.find(DatabaseResource.java:71)
Generally, the XML might be malformed: it started a Unicode escape sequence but then put an "o" where a hexadecimal digit was expected. So you must look into the SIARD content to see where this came from.
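To narrow down where such a sequence comes from, a small scan for a `\u` that is not followed by four hexadecimal digits can help. This is only a sketch: the function name `find_bad_unicode_escapes` is hypothetical, it is not part of DBPTK or Solr, and it assumes the suspect text (a query string or a chunk of extracted XML) has already been read into a Python string.

```python
import re

# A backslash-u that is NOT followed by exactly four hex digits.
BAD_ESCAPE = re.compile(r"\\u(?![0-9A-Fa-f]{4})")


def find_bad_unicode_escapes(text):
    """Return (offset, snippet) pairs for each malformed \\uXXXX escape in text."""
    return [
        (match.start(), text[match.start():match.start() + 6])
        for match in BAD_ESCAPE.finditer(text)
    ]


if __name__ == "__main__":
    # Hypothetical sample: one valid escape and one with "o" in place of a hex digit.
    sample = r"ok \u0041 bad \uoesi"
    for offset, snippet in find_bad_unicode_escapes(sample):
        print(offset, snippet)
```

Any hit is a candidate for the input that triggers the "Non-hex character in Unicode escape sequence" error shown above. The scan itself does not say whether the text came from the SIARD content or from somewhere else in the request.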
The SIARD file was produced using DBPTK Desktop (Using dbptk-desktop-2.6.0.exe)
No error was received when the file was produced. So how would we know there was an issue with the file? Do we always need to open and validate the file? Can we not assume a file is OK if the SIARD file was created without error?
If we rename the SIARD file with a .zip extension, we can navigate the files.
We have subsequently created a new file using dbptk-desktop-2.6.1.exe, pointing to the same user and database, and this file is OK, so it is not an issue with the tables/data being extracted from the database.
I will try generating the file again from the 2.6.0 Desktop version to see if I can reproduce the issue.
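The same check can be scripted without renaming the file, since the observation above shows the SIARD file can be read as a ZIP archive. The following is a minimal sketch using Python's standard `zipfile` module; the function name `inspect_siard` is hypothetical. It only checks ZIP-level integrity (entry listing and CRCs), not conformance to the SIARD specification.

```python
import zipfile


def inspect_siard(path):
    """Return (entry_names, first_corrupt_entry_or_None) for a SIARD/ZIP file."""
    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()
        # testzip() reads every entry and returns the first name whose
        # CRC check fails, or None if all entries read back cleanly.
        corrupt = archive.testzip()
    return names, corrupt


if __name__ == "__main__":
    import sys

    # Usage: python inspect_siard.py path/to/file.siard
    entry_names, bad_entry = inspect_siard(sys.argv[1])
    print(f"{len(entry_names)} entries, first corrupt entry: {bad_entry}")
```

A clean result here only means the archive is readable. A file can pass this check and still fail to load in DBPTK Desktop, as the original file in this issue did.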
I was able to extract, import and validate the file in version 2.6. This time the file does open. I have access to both files and both files are the same size. I saved both files as .zip and was able to navigate all files/tables. I will attach the log.
Latest log
Original file from 2.6 will not load (uoesiardschema_extract.siard). New file from 2.6 will load (2.6_uoesiardschema_extract.siard).
Hi @gillianh1, thank you for using and testing DBPTK, and for your feedback. Since version 2.6.1 is working fine, I suggest using that version instead of 2.6.0.
This is what I plan to do. My only concern is that a file was produced without error, yet it cannot be opened. I would not like to be in this position when I try to open a SIARD file in the future.
Is your recommendation to create, open and validate each file that is produced before archiving?
Thanks
The validation step is essential as proof that the produced SIARD follows the specification.
To ensure that no record is lost, you can use a module called the Merkle Tree filter (see its documentation). However, this requires a stored procedure that calculates the hash for every exported column using the Merkle tree top hash algorithm.
DBPTK offers a set of tools to validate and verify completeness and correctness. As a rule of thumb, you should create, open, and validate each file to see whether the extraction went well.
Thank you for your help and confirmation.
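For readers unfamiliar with the idea, a top hash over a set of values can be sketched as follows. This is a generic illustration, not DBPTK's implementation: SHA-256, the leaf ordering, and the rule of duplicating the last node on odd-sized levels are all assumptions, and the function name `merkle_root` is hypothetical. The actual filter and stored procedure may differ on each of these points.

```python
import hashlib


def merkle_root(leaves):
    """Return the hex top hash of a Merkle tree built over byte-string leaves."""
    if not leaves:
        # Assumption: an empty input hashes the empty byte string.
        return hashlib.sha256(b"").hexdigest()
    # Hash each leaf (for example, one exported cell value) individually.
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: duplicate the last node when a level has odd size.
            level.append(level[-1])
        # Hash each adjacent pair to build the next level up.
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


if __name__ == "__main__":
    source_values = [b"1", b"Alice", b"2", b"Bob"]
    exported_values = [b"1", b"Alice", b"2", b"Bob"]
    # Equal top hashes suggest no value was lost or altered during export.
    print(merkle_root(source_values) == merkle_root(exported_values))
```

The point of the comparison is that a single changed or missing value changes the top hash, so one hash computed on the database side can be checked against one computed over the exported data.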
Description: Generated a file using DBPTK Desktop. It contains 5 tables and is 92MB. When I try to open the file in DBPTK Desktop, a blue progress dot pulses on the open option but the file never loads.
Context: DBPTK Desktop, installed on a Windows 10 PC, using dbptk-desktop-2.6.0.exe
Steps required to reproduce the bug:
Is there any documentation on hardware/sizing requirements or limitations?