xproc / 3.0-steps

Repository for change requests to the standard step library and for official extension steps

dfdl:parse and dfdl:unparse #608

Open JDziurlaj opened 2 months ago

JDziurlaj commented 2 months ago

I would like to see steps added to XProc 3.0 for parsing (*->XML) and unparsing (XML->*) with the Data Format Description Language (DFDL), backed by Apache Daffodil or a similar implementation. DFDL is a framework for describing binary and textual data formats so that data in those formats can be converted into other representations (usually including XML). DFDL support would let XProc 3.0 pipelines work with complex non-XML data formats directly, making XProc a more versatile tool for data processing.

DFDL schemas are written in a subset of XSD 1.0, extended with DFDL annotations. Input data can be binary or text. A DFDL processor uses the schema to parse the input into a DFDL infoset, which implementations can present in various formats (XML, JSON, etc.).
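
To make the request concrete, here is a rough sketch of what the two step signatures might look like as XProc 3.0 declarations. Everything in it is an assumption for discussion: the namespace URI is a placeholder, the port names (`source`, `schema`, `result`) are my guesses, and it is not meant to reflect the signature of the existing Calabash extension.

```xml
<p:library xmlns:p="http://www.w3.org/ns/xproc"
           xmlns:dfdl="http://example.com/ns/dfdl-steps"
           version="3.0">
  <!-- Placeholder namespace above; the real one would be decided in this repository. -->

  <!-- Hypothetical dfdl:parse: binary or text document in, DFDL infoset out as XML. -->
  <p:declare-step type="dfdl:parse">
    <p:input port="source" primary="true" content-types="any"/>
    <!-- The DFDL schema (annotated XSD) describing the input format. -->
    <p:input port="schema" content-types="xml"/>
    <p:output port="result" content-types="xml"/>
  </p:declare-step>

  <!-- Hypothetical dfdl:unparse: XML infoset in, binary or text document out. -->
  <p:declare-step type="dfdl:unparse">
    <p:input port="source" primary="true" content-types="xml"/>
    <p:input port="schema" content-types="xml"/>
    <p:output port="result" content-types="any"/>
  </p:declare-step>
</p:library>
```

Options (for example, choosing a JSON rendering of the infoset, or selecting a root element in the schema) are left out on purpose, since they depend on what the implementations can support in common.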

An extension for Calabash already exists and could serve as a straw-man basis for a more formal proposal.