smooks / smooks-dfdl-cartridge

Smooks cartridge leveraging Apache Daffodil to parse files and unparse XML
https://www.smooks.org/documentation/#dfdl

NullPointerException in org.apache.daffodil.exceptions.Abort when parsing a DELJIT message #152

Closed ChrisVanBael closed 1 year ago

ChrisVanBael commented 3 years ago

See this thread on the mailing list: https://groups.google.com/g/smooks-user/c/BKnqqGUyX3Y

I used the following dependencies:

<dependency>
  <groupId>org.smooks.cartridges.edi</groupId>
  <artifactId>smooks-edifact-cartridge</artifactId> 
  <version>2.0.0-M3</version>
</dependency>
<dependency>
  <groupId>org.smooks.cartridges.edi</groupId> 
  <artifactId>d96a-edifact-binding</artifactId>
  <version>2.0.0-M3</version>
</dependency>
<dependency>
  <groupId>org.smooks.cartridges.edi</groupId>
  <artifactId>edifact-schemas</artifactId>
  <version>2.0.0-M3</version>
  <classifier>d96a</classifier>
</dependency>

I created the following method to parse DELJIT messages:

public String parseDelJit(String message) {
    final Smooks smooks = new Smooks();
    smooks.setReaderConfig(new EdifactReaderConfigurator("/d96a/EDIFACT-Messages.dfdl.xsd").setMessageTypes(Arrays.asList("DELJIT")));
    final StringWriter writer = new StringWriter();
    smooks.filterSource(new StringSource(message), new StreamResult(writer));
    return writer.toString();
}

I tried calling it with several different messages (both come from our test environment, which matches production, so the messages should be correct):

    @Test
    public void DelJitTest() throws IOException, SAXException {
        //String message = "UNH              DELJIT  D96AUNA10010BGM 3020212241720458870DTM137202106031720                       203NAD SEE521J                               92NAD BY737                                 92LOC 11086                      LOC159086                      SEQ  34570156DTM194202105280814                       203GIR  4665023107                           FYYV4162UM2M2617547                   VI1                                   LILIN30753320                            MNQTY131+000000000001DTM  2202106021720                       203UNT    12              \n";
        String message = "UNH+1+DELJIT:D:96A:UN:A10010\n" +
                "BGM+30+202106200650\n" +
                "DTM+011:202106181529:203\n" +
                "NAD+SE+BS8CB::92\n" +
                "NAD+BY+BS8CA::92\n" +
                "NAD+CN+BS8CA::92\n" +
                "LOC+11+086\n" +
                "SEQ+3+1\n" +
                "DTM+002:202106181529:203\n" +
                "GIR+4+0001:AW+4580613:AN+1:LI+T0:AT\n" +
                "GIR+7+AA00:AV\n" +
                "LIN+++1285323:IN\n" +
                "GIR+1+T21C7444425:BN\n" +
                "GIR+7+BB00:AV+VB1:ML\n" +
                "QTY+131:1\n" +
                "LIN+++6906533:IN\n" +
                "GIR+1+T4172883:BN\n" +
                "GIR+7+BB00:AV+MO1:ML\n" +
                "QTY+131:1\n" +
                "LIN+++32249466:IN\n" +
                "GIR+1+T7965124:BN\n" +
                "GIR+7+BB00:AV+PT1:ML\n" +
                "QTY+131:1\n" +
                "LIN+++32260849:IN\n" +
                "GIR+1+T1262667053:BN\n" +
                "GIR+7+BB00:AV+ACC:ML\n" +
                "QTY+131:1";
        String parsed = edifactTasks.parseDelJit(message);
    }

This gives me the following stack trace:

org.smooks.api.SmooksException: Failed to filter source
...
Caused by: org.apache.daffodil.exceptions.Abort: Invariant broken. Runtime.scala - Leaked exception: java.lang.NullPointerException
java.lang.NullPointerException
    at org.apache.daffodil.io.BucketingInputSource.fillBucketsToIndex(InputSource.scala:261)
    at org.apache.daffodil.io.BucketingInputSource.areBytesAvailable(InputSource.scala:329)
...
    at org.smooks.cartridges.dfdl.parser.DfdlParser.parse(DfdlParser.java:183)
    at org.smooks.engine.delivery.sax.ng.SaxNgParser.parse(SaxNgParser.java:86)

The full stack trace is in the attached file: stack.txt. We use Serenity-BDD with Cucumber to run the tests.

uliSchuster commented 2 years ago

I get the same error message with a different setup for parsing EDIFACT messages, and I have narrowed it down to the message source.

The following does work (Kotlin code):

val inputStream = FileInputStream(File(fileName))
smooks.filterSource(StreamSource(inputStream), StreamResult(outputStream))

while this fails with the above error message:

val inputStream = FileInputStream(File(fileName))
val inputString = inputStream.readBytes().toString(Charsets.UTF_8).trim()
smooks.filterSource(StringSource(inputString), StreamResult(outputStream))

However, it is not just the StringSource that causes problems. The following results in the same error:

val input  = StringReader(inputString)
smooks.filterSource(StreamSource(input), StreamResult(outputStream))

...while this solution does work:

val inputStream = inputString.byteInputStream(Charsets.UTF_8)
smooks.filterSource(StreamSource(inputStream), StreamResult(outputStream))

So the problem seems to be passing character-based input (a String or a Reader) rather than a byte stream?
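
For Java callers, here is a minimal sketch of the same workaround using only JDK classes. It shows that a `StreamSource` built from a `Reader` exposes no byte stream, while one built from an `InputStream` does. The class name `ByteStreamSourceSketch` and the helper `toByteStreamSource` are hypothetical and not part of Smooks. Whether the DFDL parser reads only the byte stream is an assumption that fits the observations above; nothing in this thread confirms it.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.stream.StreamSource;

class ByteStreamSourceSketch {

    // Hypothetical helper (not a Smooks API): encodes the message as UTF-8
    // and wraps the bytes in a StreamSource backed by an InputStream.
    public static StreamSource toByteStreamSource(String message) {
        byte[] bytes = message.getBytes(StandardCharsets.UTF_8);
        return new StreamSource(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) {
        String message = "UNH+1+DELJIT:D:96A:UN:A10010\n";

        // A Reader-backed source carries characters only; getInputStream() is null.
        StreamSource fromReader = new StreamSource(new StringReader(message));
        System.out.println("reader-backed has byte stream: "
                + (fromReader.getInputStream() != null));

        // A byte-backed source exposes an InputStream for the consumer to read.
        StreamSource fromBytes = toByteStreamSource(message);
        System.out.println("byte-backed has byte stream: "
                + (fromBytes.getInputStream() != null));
    }
}
```

With such a helper, the failing call from the original report could be written as `smooks.filterSource(toByteStreamSource(message), new StreamResult(writer))`. UTF-8 is chosen here only to match the Kotlin snippets above; the right charset depends on how the EDIFACT file is actually encoded.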