PathVisio / libGPML

Java library for reading / writing GPML files
Apache License 2.0

Lots of empty bp:IDs in PublicationXrefs are causing meta-data-action to fail via libGPML #22

Closed (AlexanderPico closed 1 year ago)

AlexanderPico commented 1 year ago

If the PublicationXref looks like either of these (and we're seeing many cases of both):

<bp:ID rdf:datatype="http://www.w3.org/2001/XMLSchema#string"></bp:ID>
or
<bp:ID rdf:datatype="http://www.w3.org/2001/XMLSchema#string"/>

Then, this error is thrown and on_gpml_change.yml fails:

org.pathvisio.libgpml.io.ConverterException: class java.lang.IllegalArgumentException: Citation must have valid xref or url, or both.
        at org.pathvisio.libgpml.model.Citation.<init>(Citation.java:60)
        at org.pathvisio.libgpml.model.PathwayElement.addCitation(PathwayElement.java:378)
        at org.pathvisio.libgpml.model.GPML2013aReader.readPublicationXrefs(GPML2013aReader.java:378)
        at org.pathvisio.libgpml.model.GPML2013aReader.readCommentGroup(GPML2013aReader.java:444)
        at org.pathvisio.libgpml.model.GPML2013aReader.readShapedElement(GPML2013aReader.java:470)
        at org.pathvisio.libgpml.model.GPML2013aReader.readDataNodes(GPML2013aReader.java:774)
        at org.pathvisio.libgpml.model.GPML2013aReader.readFromRoot(GPML2013aReader.java:131)
        at org.pathvisio.libgpml.model.GPMLFormat.readFromXmlImpl(GPMLFormat.java:262)
        at org.pathvisio.libgpml.model.GPMLFormat.readFromXml(GPMLFormat.java:195)
        at org.pathvisio.libgpml.model.PathwayModel.readFromXml(PathwayModel.java:1176)
        at meta.data.action.MetaDataExtractor.main(MetaDataExtractor.java:101)
Caused by: java.lang.IllegalArgumentException: Citation must have valid xref or url, or both.
        ... 11 more
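
To gauge how many pathways are affected, the empty elements can be counted with the JDK's DOM parser. This is a minimal sketch, assuming the relevant elements are those with local name `ID` directly under an element with local name `PublicationXref`. The class and method names are hypothetical and not part of libGPML.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for triage; not part of libGPML.
class EmptyBpIdCounter {

    /** Counts ID elements under PublicationXref whose text is empty or whitespace. */
    static int countEmptyIds(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for local-name matching
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            // "*" matches any namespace, so the exact bp: URI is not assumed here.
            NodeList ids = doc.getElementsByTagNameNS("*", "ID");
            int empty = 0;
            for (int i = 0; i < ids.getLength(); i++) {
                Node id = ids.item(i);
                Node parent = id.getParentNode();
                boolean inPublicationXref = parent != null
                        && "PublicationXref".equals(parent.getLocalName());
                // Covers both <bp:ID ...></bp:ID> and <bp:ID .../>.
                if (inPublicationXref && id.getTextContent().trim().isEmpty()) {
                    empty++;
                }
            }
            return empty;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse GPML", e);
        }
    }
}
```

Calling `EmptyBpIdCounter.countEmptyIds(gpmlText)` on the contents of each GPML file would give a per-file count of references that trigger the exception above.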

AlexanderPico commented 1 year ago

I need help fixing this in libGPML and then producing a new meta-data-action jar, making a release and updating the GH action or cache.

AlexanderPico commented 1 year ago

Note: If the GPML is changed to use "NA", then it all works fine:

<bp:ID rdf:datatype="http://www.w3.org/2001/XMLSchema#string">NA</bp:ID>
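
As a stopgap until the reader is fixed, the "NA" workaround could be applied to the GPML text before parsing. This is a rough sketch, assuming the prefix is literally `bp` and that no attribute value contains a `>` character. The class name is hypothetical and nothing here is libGPML API.

```java
import java.util.regex.Pattern;

// Hypothetical pre-processing step; not part of libGPML.
class EmptyBpIdPatcher {

    // Matches <bp:ID ...></bp:ID> (optionally with whitespace inside) and <bp:ID .../>.
    // Group 1 captures the attributes, if any, so they are kept in the output.
    private static final Pattern EMPTY_ID =
            Pattern.compile("<bp:ID(\\s[^>]*?)?(?:/>|>\\s*</bp:ID>)");

    /** Returns the GPML text with every empty bp:ID given the value "NA". */
    static String fillEmptyIds(String gpml) {
        return EMPTY_ID.matcher(gpml).replaceAll("<bp:ID$1>NA</bp:ID>");
    }
}
```

Elements that already carry a value, including "NA", are left untouched. A regex is used only to keep the rest of the file byte-for-byte identical; a DOM rewrite would be more robust but would also reserialize the whole document.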

egonw commented 1 year ago

Okay, so the problem is that empty bp:ID elements make the reader throw. We have a lot of them indeed.

@mkutmon, what should be the libGPML behavior? Ignore the reference totally? Add an empty reference?
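
The two options can be compared in a small standalone sketch. Everything in it is hypothetical (the class, the enum, and the choice of "NA" as placeholder, which is borrowed from the workaround noted earlier in this thread). It does not describe how libGPML actually resolved the issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration of the two behaviors; none of these names are libGPML API.
class PublicationXrefPolicy {

    enum Mode { SKIP, PLACEHOLDER }

    /** Returns the usable ID, or empty if the reference should be dropped. */
    static Optional<String> resolveId(String rawId, Mode mode) {
        boolean blank = rawId == null || rawId.trim().isEmpty();
        if (!blank) {
            return Optional.of(rawId.trim());
        }
        // SKIP ignores the reference totally; PLACEHOLDER keeps it with a dummy ID.
        return mode == Mode.SKIP ? Optional.empty() : Optional.of("NA");
    }

    /** Applies the policy to every raw ID read from the file, preserving order. */
    static List<String> resolveAll(List<String> rawIds, Mode mode) {
        List<String> resolved = new ArrayList<>();
        for (String raw : rawIds) {
            resolveId(raw, mode).ifPresent(resolved::add);
        }
        return resolved;
    }
}
```

With `SKIP`, the number of citations on an element can shrink after a read/write round trip. With `PLACEHOLDER`, the count is preserved but downstream consumers see references that point nowhere.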

mkutmon commented 1 year ago

fixed #23