TheJacksonLaboratory / LIRICAL

LIkelihood Ratio Interpretation of Clinical AbnormaLities
https://thejacksonlaboratory.github.io/LIRICAL/stable

Problem parsing phenopacket #454

Closed (justaddcoffee closed 4 years ago)

justaddcoffee commented 4 years ago

Getting a parse error when processing a phenopacket from the wild: Willoughby-2018-WDR45-proband.txt

The field datasetId is possibly illegal, but we should still recover gracefully if we can.

/Library/Java/JavaVirtualMachines/jdk1.8.0_131.jdk/Contents/Home/bin/java -Dfile.encoding=UTF-8 -jar /Users/jtr4v/IdeaProjects/LIRICAL/target/LIRICAL.jar phenopacket -p /Users/jtr4v/IdeaProjects/LIRICAL/example_data/pps/splicing/Willoughby-2018-WDR45-proband.txt -o /Users/jtr4v/IdeaProjects/LIRICAL/example_data/output/ -x Willoughby-2018-WDR45-proband
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/jtr4v/IdeaProjects/LIRICAL/target/LIRICAL.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/jtr4v/IdeaProjects/LIRICAL/target/lib/log4j-slf4j-impl-2.12.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
com.google.protobuf.InvalidProtocolBufferException: Cannot find field: datasetId in message org.phenopackets.schema.v1.core.Individual
    at com.google.protobuf.util.JsonFormat$ParserImpl.mergeMessage(JsonFormat.java:1417)
    at com.google.protobuf.util.JsonFormat$ParserImpl.merge(JsonFormat.java:1377)
    at com.google.protobuf.util.JsonFormat$ParserImpl.parseFieldValue(JsonFormat.java:1896)
    at com.google.protobuf.util.JsonFormat$ParserImpl.mergeField(JsonFormat.java:1587)
    at com.google.protobuf.util.JsonFormat$ParserImpl.mergeMessage(JsonFormat.java:1419)
    at com.google.protobuf.util.JsonFormat$ParserImpl.merge(JsonFormat.java:1377)
    at com.google.protobuf.util.JsonFormat$ParserImpl.merge(JsonFormat.java:1259)
    at com.google.protobuf.util.JsonFormat$Parser.merge(JsonFormat.java:407)
    at org.monarchinitiative.lirical.io.PhenopacketImporter.fromJson(PhenopacketImporter.java:60)
    at org.monarchinitiative.lirical.cmd.PhenopacketCommand.run(PhenopacketCommand.java:180)
    at org.monarchinitiative.lirical.Lirical.main(Lirical.java:149)
Exception in thread "main" org.monarchinitiative.lirical.exception.LiricalRuntimeException: Could not load phenopacket at /Users/jtr4v/IdeaProjects/LIRICAL/example_data/pps/splicing/Willoughby-2018-WDR45-proband.json
    at org.monarchinitiative.lirical.io.PhenopacketImporter.fromJson(PhenopacketImporter.java:65)
    at org.monarchinitiative.lirical.cmd.PhenopacketCommand.run(PhenopacketCommand.java:180)
    at org.monarchinitiative.lirical.Lirical.main(Lirical.java:149)

Process finished with exit code 1

pnrobinson commented 4 years ago

I think that this is an "old" phenopacket schema version; there were a few backwards-incompatible changes. HPO Case Annotator stores data in an internal format that can be exported to the current version, so probably just re-export. Where does this phenopacket come from?

justaddcoffee commented 4 years ago

Ah okay, thanks. This phenopacket came from Daniel. It's easy enough to fix; I'm just trying to find and eliminate some pain points that users might encounter.

We could possibly incorporate Jules' phenopacket validator to alert users about these formatting issues.

pnrobinson commented 4 years ago

I do not think the phenopacket validator is up and running yet. There is this: https://github.com/pnrobinson/phenotools but it is also rough around the edges...

justaddcoffee commented 4 years ago

Okay, duly noted. Maybe for now we could emit an error message that alerts the user that the phenopacket might be malformed? Glad to do a quick PR for this.
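
A rough sketch of what such a message could look like, assuming we only have the text of the InvalidProtocolBufferException from the trace above to work with. The class PhenopacketErrorHint and its describe method are hypothetical names used for illustration; they are not part of LIRICAL or protobuf, and the wording of the hint is only a suggestion.

```java
// Hypothetical helper, not part of LIRICAL: turns the protobuf parser's
// "Cannot find field: X in message Y" text into a hint the user can act on.
final class PhenopacketErrorHint {

    private static final String PREFIX = "Cannot find field: ";
    private static final String INFIX = " in message ";

    /** Returns a user-facing hint; falls back to the raw text if the pattern is absent. */
    static String describe(String parserMessage) {
        if (parserMessage == null) {
            return "Could not parse phenopacket (no details available).";
        }
        int start = parserMessage.indexOf(PREFIX);
        int mid = start < 0 ? -1 : parserMessage.indexOf(INFIX, start + PREFIX.length());
        if (start < 0 || mid < 0) {
            // Unrecognized parser message: pass it through unchanged.
            return "Could not parse phenopacket: " + parserMessage;
        }
        String field = parserMessage.substring(start + PREFIX.length(), mid);
        String message = parserMessage.substring(mid + INFIX.length()).trim();
        return "Could not parse phenopacket: unknown field '" + field + "' in " + message
                + ". The file may be malformed or use an older phenopacket schema version;"
                + " try re-exporting it in the current format.";
    }

    public static void main(String[] args) {
        // Message text copied from the stack trace in this issue.
        System.out.println(describe(
                "Cannot find field: datasetId in message org.phenopackets.schema.v1.core.Individual"));
    }
}
```

Something like this could be called from wherever the parse exception is caught, so the user sees the offending field name and a suggested next step instead of a bare stack trace.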

pnrobinson commented 4 years ago

I am cleaning up and have added a simpler error message for this! This should not happen out in the wild, since nobody but our group will have any old-format packets.