... or at least move a considerable part of the code to Java.
The idea is to gradually implement more and more steps in Java, starting with the most performance-critical ones. Not starting completely from scratch keeps the refactoring manageable: we don't have to rethink the whole program flow, we can proceed step by step without breaking anything, and we can reuse all the existing unit tests.
Another reason for choosing this solution is that I don't want to abandon the XProc idea completely. Thinking of the converter in terms of "steps" and "pipelines" is still very useful, and the XML representations of all the intermediate documents are valuable, for example for visualizing tests. The only problem is that XML processing is not always the most efficient way to implement something.
Therefore I would like to move to a situation where the high-level program flow (the pipeline) is still written in XProc, and the unit tests in XProcSpec, but internally we use less and less XML. This will be possible in XProc 3.0, where documents no longer have to be XML. They can be plain text files, binary files, or anything else, including Java objects. This makes it possible to do inter-step optimizations between Java steps, and to build a full-fledged Java program within an XProc framework. (In fact, with some hacks I have found a way to use arbitrary Java objects as documents in Calabash 1 already, so we don't necessarily need to wait for XProc 3.0.)
Note that in order to support "Java documents" on inputs of "XML steps" like p:xslt, and to support XProcSpec testing of "Java steps", the Java documents should be convertible to XML and back. This obviously requires some extra work but leads to more usable and better testable code.
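To make the "convertible to XML and back" requirement a bit more concrete, here is a minimal sketch of what such a round trip could look like. Everything in it is hypothetical: the `StyledBox` class, the `box` element and the `style` attribute are invented for illustration and are not the actual intermediate format, and it uses the plain JDK DOM API rather than whatever document model Calabash uses internally.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical "Java document": a styled piece of text passed between Java steps.
final class StyledBox {
    final String style;
    final String text;

    StyledBox(String style, String text) {
        this.style = style;
        this.text = text;
    }

    // Java -> XML: needed when the document reaches an "XML step" such as p:xslt,
    // or when an XProcSpec test wants to inspect it.
    Document toXml() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element box = doc.createElement("box"); // element name is an assumption
        box.setAttribute("style", style);
        box.setTextContent(text);
        doc.appendChild(box);
        return doc;
    }

    // XML -> Java: needed when an XML document is fed into a "Java step".
    static StyledBox fromXml(Document doc) {
        Element box = doc.getDocumentElement();
        return new StyledBox(box.getAttribute("style"), box.getTextContent());
    }

    // Helper: serialize a DOM document to a string without the XML declaration.
    static String serialize(Document doc) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Helper: parse an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        StyledBox original = new StyledBox("display: block", "hello");
        String xml = serialize(original.toXml());
        StyledBox copy = fromXml(parse(xml));
        System.out.println(xml);
        System.out.println(copy.style.equals(original.style) && copy.text.equals(original.text));
    }
}
```

As long as two consecutive steps are both Java steps, the object can be handed over directly and neither conversion needs to run. The XML form only has to be produced at the boundaries, which is where the extra work mentioned above comes in.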
This idea is not necessarily limited to css-to-obfl. There could in theory be a "Java link" from the CSS parsing all the way up to the Dotify step.
Related: