apache / daffodil-vscode

Apache Daffodil™ Extension for Visual Studio Code
https://daffodil.apache.org/
Apache License 2.0

root element for debug must be specified by name and optional namespace with primary schema file from a jar #653

Closed mbeckerle closed 10 months ago

mbeckerle commented 1 year ago

At Owl we have DFDL schemas made of numerous schema components, where the starting root element isn't even part of the schema itself; it's part of a component schema that is in a jar on the classpath.

This happens when you have the very common envelope-payload idiom. There is a separate schema for the envelope (or header), and a separate schema for the payload.

A third assembly schema that combines the envelope and payload contains only the small glue part that connects the two. It also contains tests of the assembly schema. For example, it defines an undefined part of the envelope schema to contain an instance of the payload schema.

The root element for parsing/unparsing is the envelope's root element, and that schema is in a jar file. That root element isn't part of the assembly schema at all.

I did not see any feature in launch.json for specifying the name/namespace of the root element. There is just the "program" element, which must point to a file in the current schema.

The Daffodil library API allows you to specify a root element, and optionally a namespace URI for it. If the root element name is unambiguous the namespace URI is optional. If neither is provided then the primary schema file's first element declaration is assumed to be the root element.
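The selection rule described above can be modelled in a few lines. This is only a sketch of the rule as stated in this comment, not Daffodil's implementation: `ElementDecl` and `selectRoot` are hypothetical names, and the list of declarations is assumed to start with the primary schema file's declarations in document order.

```typescript
// Hypothetical model of a global element declaration; not a Daffodil type.
interface ElementDecl {
  name: string
  namespace: string // "" stands for "no namespace"
}

// Pick the root element according to the rule described in the comment above.
// Assumption: `decls` lists the primary schema file's declarations first.
function selectRoot(
  decls: ElementDecl[],
  rootName?: string,
  rootNamespace?: string
): ElementDecl {
  if (rootName === undefined) {
    // No name given: fall back to the first element declaration.
    // Assumption: a namespace supplied without a name is ignored here.
    if (decls.length === 0) {
      throw new Error('schema declares no global elements')
    }
    return decls[0]
  }
  // A name was given: match on name, and on namespace only if one was supplied.
  const matches = decls.filter(
    (d) =>
      d.name === rootName &&
      (rootNamespace === undefined || d.namespace === rootNamespace)
  )
  if (matches.length === 0) {
    throw new Error(`no element named ${rootName}`)
  }
  if (matches.length > 1) {
    throw new Error(`element name ${rootName} is ambiguous; a namespace is required`)
  }
  return matches[0]
}
```

With this model, a name alone is enough whenever it matches exactly one declaration, which is the "namespace is optional if unambiguous" case.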

Right now the VSCode extension is assuming this heuristic is the only way a root element can be specified.

The root needs to be able to be specified just as element name + optional namespace, where the DFDL schema for that root element name and namespace must be found on the daffodilDebugClasspath.

The primary DFDL Schema file that contains that element must be able to be a file in a jar on the daffodilDebugClasspath.

arosien commented 11 months ago

Perhaps we can keep the "program" launch property to be the file we examine, and then add optional launch properties for specifying this case where we are digging into the envelope's structure with provided element name and optional namespace.
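As a rough sketch of what that could look like on the extension side: the interface below keeps "program" and adds two optional properties. The names `rootName` and `rootNamespace`, the interface `DaffodilLaunchArgs`, and the helper `describeRoot` are hypothetical and only illustrate the idea; they are not existing extension settings.

```typescript
// Hypothetical shape of the launch arguments; only "program" is named in this thread.
interface DaffodilLaunchArgs {
  program: string // schema file to examine
  rootName?: string // hypothetical: name of the root element
  rootNamespace?: string // hypothetical: namespace URI of the root element
}

// Describe which root-selection mode a launch configuration asks for.
function describeRoot(args: DaffodilLaunchArgs): string {
  if (args.rootName === undefined) {
    // Assumption: a namespace without a name is rejected rather than ignored.
    if (args.rootNamespace !== undefined) {
      throw new Error('rootNamespace requires rootName')
    }
    return `first element declaration in ${args.program}`
  }
  return args.rootNamespace === undefined
    ? `element ${args.rootName}`
    : `element {${args.rootNamespace}}${args.rootName}`
}
```

Leaving both properties optional means existing launch configurations keep their current behavior.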

arosien commented 11 months ago

I think we just need advice from @mbeckerle on what envelope we can test with (how to build one, etc.).

shanedell commented 11 months ago

@mbeckerle Could we get advice on what envelope we can test with, as @arosien mentions? Can you provide some technical details as to how Daffodil handles this so that we can have a better idea of how to tie this into the debugger?

mbeckerle commented 11 months ago

I am creating an envelope-payload example schema which will use mil-std-2045 headers with 'binary file' payloads containing PCAP data.

The PR is here: https://github.com/DFDLSchemas/envelope-payload/pull/1

shanedell commented 11 months ago

@mbeckerle Thank you for that. For https://github.com/DFDLSchemas/envelope-payload/pull/1, what is the program file (the file we will inspect), what is the root, and what is the data file? Here is what I assume based on the repo:

{
    "program": "./src/main/resources/org/apache/daffodil/example/envelopepayload/xsd/message.dfdl.xsd",
    "data": "./src/test/resources/org/apache/daffodil/example/envelopepayload/test_01.dat",
    "root": ??
}

Let me know if those are correct and/or the correct values for all three.

How do you feel about keeping program as the name for the file we inspect, or do you think this needs to be changed to something like schema?

mbeckerle commented 11 months ago

The root element name is "message". There is a root namespace, but message is unambiguous alone. The others are correct.
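Filled in with that answer, the values from the question would read as follows. This is only an illustration: the key name `root` is the placeholder used in the question and is not a confirmed launch.json setting, and the constant `launchArgs` is a hypothetical name.

```typescript
// Values taken from the question above, with the root element name filled in.
// Assumption: "root" is only the placeholder key from the question.
const launchArgs = {
  program:
    './src/main/resources/org/apache/daffodil/example/envelopepayload/xsd/message.dfdl.xsd',
  data: './src/test/resources/org/apache/daffodil/example/envelopepayload/test_01.dat',
  root: 'message', // unambiguous on its own, so no namespace is given
}

// Print the configuration as JSON for inspection.
console.log(JSON.stringify(launchArgs, null, 2))
```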

I prefer that the schema is identified by 'schema'. To me, the term 'program' suggests you want the path to 'daffodil'.

To make this work this schema requires two other schemas, one of which requires yet a third schema, see the README.md.

I am trying to push them all to maven central so that one needn't check them out and publish them locally.

mbeckerle commented 11 months ago

I just realized this schema doesn't exercise one of the motivating cases for this ticket. It doesn't have a component it uses that provides the root element. It has its own element declaration of the root element. I'll see about fixing that.

shanedell commented 11 months ago

@mbeckerle Sounds good, I will keep an eye out here for any updates. Also, would it be okay to create a separate ticket to address renaming program to schema? I think it would be better to have that in a separate PR, as that might touch quite a few files.

mbeckerle commented 10 months ago

Here is a schema that (1) has no root element itself (the root element is in a component schema) and (2) just glues together other schema components.

See https://github.com/DFDLSchemas/envelope-payload

These 3 schemas are used by it:

https://github.com/DFDLSchemas/tcpMessage
https://github.com/DFDLSchemas/mil-std-2045
https://github.com/DFDLSchemas/PCAP

PCAP in turn uses this schema:

https://github.com/DFDLSchemas/PCAP

shanedell commented 10 months ago

So @mbeckerle, when looking at the Scala code I am seeing an optRootName and an optRootNamespace. Do we want a config value for each, i.e. rootName and rootNamespace? If they aren't set, we would default them to "", and then on the Scala side map "" to None?
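A minimal sketch of that mapping, written in TypeScript for the extension side: an unset or empty value becomes `undefined`, which plays the role of `None`. The helper names `toOptional` and `rootSelectionFromConfig` are hypothetical, and trimming whitespace is an added assumption.

```typescript
// Hypothetical helper: treat an unset or blank setting as absent.
function toOptional(value: string | undefined): string | undefined {
  if (value === undefined) return undefined
  const trimmed = value.trim() // assumption: whitespace-only counts as unset
  return trimmed === '' ? undefined : trimmed
}

// Result type: undefined stands in for Scala's None.
interface RootSelection {
  rootName: string | undefined
  rootNamespace: string | undefined
}

// Normalize the two proposed config values before handing them to the backend.
function rootSelectionFromConfig(config: {
  rootName?: string
  rootNamespace?: string
}): RootSelection {
  return {
    rootName: toOptional(config.rootName),
    rootNamespace: toOptional(config.rootNamespace),
  }
}
```

On the Scala side the equivalent would be something along the lines of `Option(value).filter(_.nonEmpty)`, so that "" never reaches the code that expects an optional root name.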