snaekobbi / issues

Common issue tracker for the Braille in DAISY Pipeline 2 project

Alternative path in dtbook-to-pef (bypassing CSS) #17

Closed bertfrees closed 7 years ago

bertfrees commented 9 years ago

In this issue we'll describe what needs to be done in order to expose an alternative path in the system that goes directly from DTBook (or other XML) through OBFL to PEF, without taking into account any CSS style sheets. Possibly an XSLT style sheet (DTBook to OBFL) that would be provided as an input could be evaluated on the fly.

Note: This feature is absolutely required by MTM. It would be available for others too, although I believe that it is in the interest of this project to stick with the CSS path. (For the 5 other agencies as well as for DP2 in general, and for the sake of configurability as well as for maximising the possible areas of collaboration.)

josteinaj commented 9 years ago

Would this be a matter of just removing

... in a preprocessing XSLT?

bertfrees commented 9 years ago

No, the idea is basically that the XSLT style sheet would replace the CSS style sheet (behind the scenes, it replaces the xml+css-to-obfl step). Whatever you would define in CSS would need to be defined in XSLT instead. The XSLT may (partly) take CSS into account, but it probably won't. MTM will use this path to plug in their existing dtbook-to-obfl.xsl.

joeha480 commented 9 years ago

I think that some integration with CSS in the XSLT path would be a great feature (though not required for now). In other words, you would define the main transformation with XSLT, but it might be useful to do per-book customization with CSS on top of that. I realize that it might be difficult to figure out how to use them together, but it would be a cool feature.

Also, we need to note that:

  1. there are a number of parameters used with the XSLT that have to be injected somehow
  2. more than one XSLT is used together (via xsl:import; they are not executed in sequence)
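
To make the two points above concrete, here is a minimal Java sketch using the standard `javax.xml.transform` API. Everything in it is an assumption made for illustration: the stylesheets are toy stand-ins (not MTM's dtbook-to-obfl.xsl), the `cols` parameter and the `mem:` system IDs are invented, and the output only loosely resembles OBFL. It shows one main stylesheet pulling in a second one via `xsl:import` (resolved here from memory by a `URIResolver`), so both take part in a single transformation, with parameters injected from the outside.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ImportedXsltSketch {
    static final String XSL_NS = "http://www.w3.org/1999/XSL/Transform";

    // Hypothetical shared module: it is imported by the main stylesheet,
    // not executed as a separate pass.
    static final String COMMON_XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='" + XSL_NS + "'>"
        + "<xsl:template match='p'><block><xsl:value-of select='.'/></block></xsl:template>"
        + "</xsl:stylesheet>";

    // Hypothetical main stylesheet with one externally settable parameter.
    static final String MAIN_XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='" + XSL_NS + "'>"
        + "<xsl:import href='common.xsl'/>"
        + "<xsl:param name='cols' select='32'/>"
        + "<xsl:template match='/book'>"
        + "<obfl><layout-master page-width='{$cols}'/>"
        + "<xsl:apply-templates select='p'/></obfl>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String transform(String xml, Map<String, String> params)
            throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        // Serve the imported module from memory when xsl:import asks for it.
        factory.setURIResolver((href, base) ->
            href.equals("common.xsl")
                ? new StreamSource(new StringReader(COMMON_XSL), "mem:common.xsl")
                : null);
        Transformer transformer = factory.newTransformer(
            new StreamSource(new StringReader(MAIN_XSL), "mem:main.xsl"));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Inject the caller's parameters; unset ones keep their XSLT defaults.
        params.forEach(transformer::setParameter);
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform("<book><p>Hello</p></book>", Map.of("cols", "40")));
    }
}
```

Because the modules are combined through `xsl:import`, only the main stylesheet needs to be handed to the processor, and the parameters are set once on the resulting transformer.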

What about checks for special input format restrictions, e.g. disallowing table elements? That is something we currently have, and it would be nice to have it in the new system as well.
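
As a rough illustration of such a restriction check, the following Java sketch parses the input and reports disallowed elements before any conversion starts. The class name, the deny-list and the report format are all hypothetical; the existing MTM check may work quite differently (for instance as a schema or an XSLT).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class InputRestrictionCheck {
    // Hypothetical deny-list of element local names.
    static final Set<String> DISALLOWED = Set.of("table");

    /** Returns one entry per disallowed element name found, e.g. "table (2)". */
    static List<String> violations(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        List<String> found = new ArrayList<>();
        for (String name : DISALLOWED) {
            // "*" matches the element in any namespace (DTBook or other XML).
            int count = doc.getElementsByTagNameNS("*", name).getLength();
            if (count > 0) {
                found.add(name + " (" + count + ")");
            }
        }
        return found;
    }
}
```

A conversion job could refuse to run, or emit a warning, whenever `violations` returns a non-empty list.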

What do you think about this, @BWestling?

josteinaj commented 9 years ago

> there are a number of parameters used with the XSLT that have to be injected somehow

The Pipeline 2 Web API currently does not support XProc parameter ports. As a consequence, every parameter that the end user should be able to pass to the XSLT has to be declared explicitly beforehand and exposed as a script option.

bertfrees commented 9 years ago

@joeha480 Based on your thoughts I suggest we do it as follows, and I also add an alternative which may be a bit better in terms of user interface:

  1. Because you may want to take into account a subset of CSS, I suggest you just use the existing framework (dtbook-to-pef), which basically consists of a css:inline step followed by the formatting step (incl. translation) that is selected based on the transform option. So what I would do is create a custom formatter step (that would e.g. be selected with the query "(formatter:mtm)"). This step (which currently must be implemented in XProc) would consist of your XSLTs and the dotify:format step. Taking into account only a subset of CSS is perfectly possible, as long as it is explained in the documentation of your custom formatter. For injecting parameters, the transform query would be used. So the whole query would look something like "(formatter:mtm)(paramX:valX)(paramY:valY)". I like this format for specifying options because it is generic (not specific to XProc, XSLT, or anything else, which are implementation details) and you can have key-value pairs as well as value-less parameters like "(paramZ)". The "(...)" in the query format are actually called "features", not "parameters", so ideally each parameter should reflect some feature, but this is not so important. Your custom formatter will need some Java code to convert the query format into XProc/XSLT parameters.
  2. The only downside I can see with this approach is that the parameters can't be declared and documented in such a way that the web UI can present them to the user nicely. So you have to rely on proper documentation. The alternative is to create a custom script specially for you that has all your parameters declared and documented in XProc.
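
The Java glue code mentioned in point 1 could look roughly like the sketch below. It is a simplified, hypothetical parser written for this illustration, not the Pipeline's actual query implementation: it assumes a query is a plain concatenation of "(key)" and "(key:value)" features without whitespace or quoting, and it maps value-less features to an empty string.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FeatureQuery {
    // One feature: "(key)" or "(key:value)".
    private static final Pattern FEATURE =
        Pattern.compile("\\(([^():]+)(?::([^()]*))?\\)");

    /** Parses a query into an ordered map; value-less features map to "". */
    static Map<String, String> parse(String query) {
        Map<String, String> features = new LinkedHashMap<>();
        Matcher m = FEATURE.matcher(query);
        int end = 0;
        while (m.find()) {
            if (m.start() != end) {
                throw new IllegalArgumentException("Unexpected text at index " + end);
            }
            features.put(m.group(1), m.group(2) == null ? "" : m.group(2));
            end = m.end();
        }
        if (end != query.length()) {
            throw new IllegalArgumentException("Unexpected text at index " + end);
        }
        return features;
    }

    /** Everything except the formatter selector is passed on as XSLT parameters. */
    static Map<String, String> xsltParameters(String query) {
        Map<String, String> params = parse(query);
        params.remove("formatter");
        return params;
    }

    public static void main(String[] args) {
        // Prints {formatter=mtm, paramX=valX, paramZ=}
        System.out.println(parse("(formatter:mtm)(paramX:valX)(paramZ)"));
    }
}
```

The map returned by `xsltParameters` could then be handed to whatever runs the XSLT, so that the query format stays the only thing the end user sees.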

bertfrees commented 7 years ago

We ended up adding an xml-to-obfl step (commit https://github.com/daisy/pipeline-mod-braille/commit/6bcf27c1231c374df32faf282d4ece44a36e5b4b) backed by the OBFL "task system". Later it was changed to file-to-obfl so that it could also take EPUBs.

In parallel, we made the "stylesheet" option more powerful by allowing XSLT in addition to CSS (commit https://github.com/daisy/pipeline-mod-braille/commit/1aaf498bedd4a485e3df9280b9f6347c6c74215b).