BaseXdb / basex

BaseX Main Repository.
http://basex.org
BSD 3-Clause "New" or "Revised" License

Help needed copied path does not return result. #1429

Closed wolski closed 7 years ago

wolski commented 7 years ago

I am new to BaseX. I am trying to execute a path that I copied with CopyPath from the Map view in the editor, e.g. db:open("myrimatchSubset","20160312_17_B3_myrimatch_2_2_140.xml")/MzIdentML/SequenceCollection/Peptide

Compiling:

However, if I click on the same element in the Map view a result is displayed in the result window.

I am using BaseX 8.6.1 on Windows, which I started with: java -Xmx4096m -jar BaseX.jar

Thank you in advance

ChristianGruen commented 7 years ago

Hi Witold, a general note: For general requests on BaseX, and unconfirmed bugs, please write to our basex-talk mailing list.

It’s interesting to hear that the copy path feature is used at all; I must confess we haven’t touched it for ages ;) Could you please give me some more details on your documents? Do they contain namespaces, etc.? Do you have a little document that we could use for testing?

wolski commented 7 years ago

Hi Christian,

Unfortunately, the files are huge. About 500 MB each. I am happy to share them with you (e.g. FTP). I created a DB loading 4 of these files into it as well as one with only 1 file.

I did some further checks:

open myrimatchSubset
inspect
Checking main table (127217081 nodes):

  • 0 invalid node kinds
  • 0 invalid parent references
  • 0 wrong parent/descendant relationships

No inconsistencies found.
'myrimatchSubset' inspected in 6885.71 ms.

If I execute just db:open("myrimatchSubset","20160312_17_B3_myrimatch_2_2_140.xml") in the editor a document is returned.

... However, if I try to run db:open("SingleFile.mzid","20160312_15_B1_myrimatch_2_2_140.xml")/MzIdentML, nothing is returned:

Compiling:
- pre-evaluating db:open("SingleFile.mzid", "20160312_15_B1_myrimatch_2_2_140.xml")
Optimized Query:
db:open-pre("SingleFile.mzid",0)/MzIdentML
Query:
db:open("SingleFile.mzid","20160312_15_B1_myrimatch_2_2_140.xml")/MzIdentML
Result:
- Hit(s): 0 Items
- Updated: 0 Items
- Printed: 0 Bytes
- Read Locking: SingleFile.mzid
- Write Locking: (none)

I have to acknowledge that I am seeing this problem only with this type of file (mzIdentML, http://www.psidev.info/mzidentml). These files are generated by peptide ID search engines, and I see the same problem with the output of two different search engines (different vendors).

I also tried the BaseX command-line interface and see the same behaviour as in the editor:

> xquery db:open("myrimatchSubset","20160312_17_B3_myrimatch_2_2_140.xml")/MzIdentML
Query executed in 19.44 ms.

Querying for any other tag does not return a result either:

Optimized Query:
db:open-pre("myrimatchSubset",0)/descendant::SequenceCollection
Query:
db:open("myrimatchSubset","20160312_15_B1_myrimatch_2_2_140.xml")//SequenceCollection
Result:
- Hit(s): 0 Items

What really puzzles me is that the Map view works: you can browse all the elements, and the results are displayed. So it seems that the XML is OK. Please let me know if you would like to have a look at the data.

ChristianGruen commented 7 years ago

Could you forward the first lines of the document?

wolski commented 7 years ago

Sorry. Will send an e-mail since it seems that I can't copy XML here.

ChristianGruen commented 7 years ago

Ok, so it’s indeed the existence of the namespaces that causes the trouble. If you rewrite the path as follows, you should get the expected results:

db:open("myrimatchSubset","20160312_17_B3_myrimatch_2_2_140.xml")/*:MzIdentML/*:SequenceCollection/*:Peptide
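
To illustrate why the unprefixed path matches nothing, here is a minimal, self-contained sketch. It uses a tiny inline document instead of the database; the `mz` prefix and the `PEP_1` id are made up for the example, and the namespace URI is the one declared on the root element of the mzIdentML files in this thread.

```xquery
(: Hypothetical prefix "mz", bound to the mzIdentML namespace. :)
declare namespace mz = "http://psidev.info/psi/pi/mzIdentML/1.1";

(: Small stand-in document; all elements inherit the default namespace
   declared on the root element. :)
let $doc := document {
  <MzIdentML xmlns="http://psidev.info/psi/pi/mzIdentML/1.1">
    <SequenceCollection>
      <Peptide id="PEP_1"/>
    </SequenceCollection>
  </MzIdentML>
}
return (
  (: Unprefixed name test: looks for MzIdentML in no namespace. :)
  count($doc/MzIdentML),
  (: Wildcard prefix: matches the local name in any namespace. :)
  count($doc/*:MzIdentML/*:SequenceCollection/*:Peptide),
  (: Prefixed name test: matches only the mzIdentML namespace. :)
  count($doc/mz:MzIdentML/mz:SequenceCollection/mz:Peptide)
)
```

The first count should be 0 and the other two 1. The wildcard form is the quickest fix; the prefixed form is stricter, because it will not match elements with the same local name from another namespace.
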
ChristianGruen commented 7 years ago

In order to make XML visible in the markdown format, you need to prefix and suffix it with three backticks (```).

wolski commented 7 years ago

Just for completeness:

PS D:\projects\p2069_WolfgangFaigle_Citr_BR\dataSearchResults\mzML\myrimatchSubset> Get-Content 20160312_15_B1_myrimatch_2_2_140.mzid  -First 20
<?xml version="1.0" encoding="ISO-8859-1"?>
<MzIdentML id="D:\projects\p2069\dataSearchResults\mzML\WHITE_HD\20160312_15_B1.mzML D:\projects\p2069\dataSearchResults\fasta\p2069_db1_d_20160322.fasta MyriMatch 2.2.140" creationDate="2016-12-21T11:28:35" version="1.1.0" xsi:schemaLocation="http://psidev.info/psi/pi/mzIdentML/1.1 http://psidev.info/files/mzIdentML1.1.0.xsd" xmlns="http://psidev.info/psi/pi/mzIdentML/1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <cvList>
    <cv id="MS" fullName="Proteomics Standards Initiative Mass Spectrometry Ontology" version="3.60.0" uri="http://psidev.cvs.sourceforge.net/*checkout*/psidev/psi/psi-ms/mzML/controlledVocabulary/psi-ms.obo"/>
    <cv id="UNIMOD" fullName="UNIMOD" version="2013-11-06" uri="http://www.unimod.org/obo/unimod.obo"/>
    <cv id="UO" fullName="Unit Ontology" version="12:10:2011" uri="http://obo.cvs.sourceforge.net/*checkout*/obo/obo/ontology/phenotype/unit.obo"/>
  </cvList>
  <AnalysisSoftwareList>
    <AnalysisSoftware id="AS" version="2.2.140" uri="http://forge.fenchurch.mc.vanderbilt.edu/projects/myrimatch/">
      <SoftwareName><cvParam cvRef="MS" accession="MS:1001585" name="MyriMatch" value=""/></SoftwareName>
    </AnalysisSoftware>
  </AnalysisSoftwareList>
  <SequenceCollection>
    <DBSequence id="DBSeq_REV_sp|Q5THR3|EFCB6_HUMAN" accession="REV_sp|Q5THR3|EFCB6_HUMAN" searchDatabase_ref="SDB"/>
    <DBSequence id="DBSeq_sp|P0CG47|UBB_HUMAN" accession="sp|P0CG47|UBB_HUMAN" searchDatabase_ref="SDB"/>
    <DBSequence id="DBSeq_sp|P0CG48|UBC_HUMAN" accession="sp|P0CG48|UBC_HUMAN" searchDatabase_ref="SDB"/>
    <DBSequence id="DBSeq_sp|P62979|RS27A_HUMAN" accession="sp|P62979|RS27A_HUMAN" searchDatabase_ref="SDB"/>
    <DBSequence id="DBSeq_sp|P62987|RL40_HUMAN" accession="sp|P62987|RL40_HUMAN" searchDatabase_ref="SDB"/>
    <DBSequence id="DBSeq_zz|ZZ_FGCZCont0181|" accession="zz|ZZ_FGCZCont0181|" searchDatabase_ref="SDB"/>
    <DBSequence id="DBSeq_REV_sp|Q96AA8|JKIP2_HUMAN" accession="REV_sp|Q96AA8|JKIP2_HUMAN" searchDatabase_ref="SDB"/>
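
The `xmlns` attribute on the root element above places every element of the file in the namespace `http://psidev.info/psi/pi/mzIdentML/1.1`. As an alternative to the `*:` wildcard, a query can declare that namespace as its default element namespace, so that a copied path can stay as it is. The following is a minimal sketch with an inline stand-in document; the `PEP_1` id is made up for the example.

```xquery
(: Unprefixed element names in this query, both in constructors and in
   path steps, now refer to the mzIdentML namespace. :)
declare default element namespace "http://psidev.info/psi/pi/mzIdentML/1.1";

let $doc := document {
  <MzIdentML>
    <SequenceCollection>
      <Peptide id="PEP_1"/>
    </SequenceCollection>
  </MzIdentML>
}
(: Same shape as the copied path, without any prefixes. :)
return $doc/MzIdentML/SequenceCollection/Peptide
```

With the same declaration at the top of the query, the path copied from the Map view, db:open("myrimatchSubset","20160312_17_B3_myrimatch_2_2_140.xml")/MzIdentML/SequenceCollection/Peptide, should resolve as well.
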
wolski commented 7 years ago

Thank you a lot for the instantaneous help. Amazing software by the way.