Closed Omikhleia closed 2 years ago
Oh, and another slightly related use case for Markdown: I haven't thought about it much, but I am not sure how to control the use of the metadata (e.g. the YAML block). In some cases, I might want to consider the document to be "standalone" (i.e. propagating some metadata such as the author, etc.); in other cases, when including chapters in a more general document, I would just want it skipped. There could be other solutions here, however, that do not need changing \include.
Yes, we can figure something out. I have bumped into a similar need already too. One interesting note is that this isn't just a need for \include but also something we'll need to be able to pass from the CLI. Markdown especially is going to need help because it isn't always possible to detect the different possible flavors of Markdown. Even with my initial idea of having multiple Markdown inputters using different tech behind the scenes (e.g. a markdown vs. a commonmark) we will still need to pass options such as in your include example. We can load an inputter with -r inputters.commonmark and set class options with -O papersize=a6, but we don't have a way from the CLI to set inputter options.
I suppose one way for the CLI to handle it would be to have a setting or method that a chunk of Lua code could access from an -e <code> evaluation, but that doesn't feel right.
Back to \include: since we don't know all the options any given inputter may need, might it be fair just to pass an options table rather than adding more arguments? Any arguments other than the ones we use for all processors (src, format) could be stuffed in a table and passed through as a third argument.
Also, do these args need to reach the :parse() methods of the inputter or just the :process() methods?
Both, it seems: process() is the only thing the previous SILE.readFile() invoked (and your new functions too), but it then calls the parser, which does its stuff. In my current (old-way) code for Markdown, the parse function was not even public yet (it was a local in process()), but this is where the options would mostly be used.
It does seem to me, too, that an options table would be the way to go, as we cannot tell which options the inputter/parser may support. (I am not even sure we'd have to filter out options.src and options.format; they wouldn't cause much harm.)
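To make the options-table idea concrete, here is a minimal sketch in Python (not SILE's actual Lua code). Everything in it is hypothetical: the `EchoInputter` class, the `include` function, and its `inputters` and `read` parameters only illustrate the shape of the proposal, where `src` and `format` are consumed by the include step and every other option is forwarded through `process()` to `parse()`.

```python
class EchoInputter:
    """Hypothetical inputter that simply records the options it receives."""

    def parse(self, doc, options):
        # A real parser would build an AST; here we return a tagged dict.
        return {"content": doc, "options": dict(options)}

    def process(self, doc, options):
        # process() delegates to parse(), forwarding the same options table.
        return self.parse(doc, options)


def include(inputters, read, options):
    """Dispatch on 'format', read 'src', and pass everything else through."""
    # Assumption: src and format are filtered out before forwarding.
    extra = {k: v for k, v in options.items() if k not in ("src", "format")}
    inputter = inputters[options["format"]]
    return inputter.process(read(options["src"]), extra)


# Stand-in for the file system, so the sketch runs on its own.
files = {"myfile.md": "# Title"}
result = include(
    {"markdown": EchoInputter()},
    files.__getitem__,
    {"src": "myfile.md", "format": "markdown", "smart": "false", "startnum": "false"},
)
print(result["options"])
```

The sketch filters out `src` and `format` before forwarding; as noted above, leaving them in the table would probably be harmless too.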
If you look at the PR I sent to your fork, there are two parse functions in the inputter: one private with the callouts to the markdown AST writer, and one public that goes with the SILE inputter module (that calls the private one with the right data). The :parse() method needs to be public because there are some places that use the AST without :process()ing it. This includes tests, the content detection type snooping, and some fancy hacks I saw in other people's packages. Calling :process() usually just calls :parse() and then SILE.process() on the output, but it isn't always quite that simple.
The SIL inputter also has a similar private parse function used in the public :parse() method.
I started looking at it as this is closely related to refactoring in #1482. I see several ways to get the job done but I'm a little puzzled about what the best ergonomics would be. We need something that works both from the CLI and programmatically when loading SILE as a library.
Right now we have --use to load up and initialize a module (class, package, inputter, whatever). Also we have --options, which passes key=value pairs as a table to the document class.
We also had an API for passing arguments for packages (used by autodoc, masters, twoside, and maybe others) but no way to pass these from either declarative markup or the CLI. It's now really obvious where to add them for declarative markup, but the CLI is less obvious.
We need a way to pass options to (at the least) inputters, classes, and packages that are specified via --use from the CLI. I can foresee possible needs for other module types too (e.g. outputters), so something generic would be nice.
Also note inputters are not necessarily limited to one-per-document, and packages definitely are not. Classes are pretty much locked to one-per-document.
Without thinking too hard about the implementation, from an end-user ergonomics standpoint, how should the CLI pass options to inputters and packages that are not part of the document's declarative markup?
It's been so long since I started using it, I forgot the --options argument is new since the v0.13.x series and not yet released. That gives us the flexibility to change it still without officially breaking anything (my projects notwithstanding).
The vast majority of usage I see for this is for document class options (e.g. sile -O papersize=a6 foo.sil to override the document paper size), but if we need to pass options to other module types maybe we should come up with something more generic before that releases.
One more thing, the evaluate option already gives one route, but that doesn't seem very ergonomic:
$ sile -e 'SILE.use("inputter.foo", { bar = "baz" })' doc.sil
Not terrible, but the whole song and dance to escape Lua code from the CLI does not feel like a nice UX to present to most end users for what may be a common usage if inputters start proliferating.
I'm still a little stuck on how to keep the CLI on parity here. The Lua API is pretty easy (SILE.use("module", { foo = "bar" })) and the declarative markup is workable enough (\use[module=module,foo=bar]), but what should a CLI invocation look like?
I'm considering parsing it with LPEG and allowing an options input format similar to our SIL options:
$ sile -u module[foo=bar] document.xml
Alternatively, I guess I could just make sure we have a way to evaluate SIL in the same way we already have for Lua with -e:
$ sile -s '\use[module=module,foo=bar]' document.xml
That gets kind of messy though, because it is not clear how to handle the master document vs. preamble material when the preamble might have inputter options. I don't think I want to go back down that rabbit hole.
As of the current PR I'm going with the former. Examples from this issue might look like this if done from the CLI:
$ sile -u inputter.markdown[smart=false,startnum=false] foo.md
$ sile -u inputter.xml[prefix=myschema] foo.xml
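For illustration only, here is how a `module[key=value,...]` argument like the ones above could be split into a module name and an options table. This is a Python sketch using a regular expression, not the LPEG grammar mentioned earlier and not SILE's real parser. The function name `parse_use_spec` is hypothetical, values are kept as plain strings, and quoting or nested brackets are not handled.

```python
import re

# Module name, optionally followed by one [key=value,...] group.
_SPEC = re.compile(r"^(?P<module>[^\[\]]+)(?:\[(?P<opts>[^\[\]]*)\])?$")


def parse_use_spec(spec):
    """Split 'module[foo=bar,baz=qux]' into ('module', {'foo': 'bar', ...})."""
    match = _SPEC.match(spec.strip())
    if match is None:
        raise ValueError(f"malformed module spec: {spec!r}")
    options = {}
    raw = match.group("opts")
    if raw:
        for pair in raw.split(","):
            key, sep, value = pair.partition("=")
            if not sep or not key.strip():
                raise ValueError(f"malformed option: {pair!r}")
            # Assumption: values stay as strings; no type coercion is done.
            options[key.strip()] = value.strip()
    return match.group("module"), options


print(parse_use_spec("inputter.markdown[smart=false,startnum=false]"))
print(parse_use_spec("inputter.xml[prefix=myschema]"))
```

A bare module name with no bracket group yields an empty options table, so the existing `-u module` usage would keep working under this scheme.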
The \include command is implemented as follows:
(I am quoting from current master at this date, that is, after the nice changes added by @alerque, but the remark below also applies to earlier versions of it, AFAIK.)
Would it be possible for it to have extra options, passed to the underlying inputter's process method (which currently only takes the content data doc), which could then use them to affect its parsing logic?
I am foreseeing at least two use cases for this.
\include[src=myfile.xml, prefix=myscheme] and all tags would be prefixed (e.g.) with myscheme:, so that I can separate those XML tags from SILE commands, avoiding clashes (and implementing them all under that appropriate naming scheme). Say, e.g., <comment>xxx</comment> would become \myscheme:comment{xxx} and I don't have to temporarily switch/restore the usual \comment.
\include[src=myfile.md, smart=false, startnum=false]
This is just a low-priority "convenience" remark in passing; none of this is absolutely necessary. In the Markdown cases, these are mostly Pandoc-like extensions that seldom affect the writer's intentions, AFAIK. In the XML case, it is rather easy, as noted, to work around it (though a bit of a challenge sometimes), and in the worst case, preprocessing the input with an XSLT stylesheet (or other solutions) is quite doable too.
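As a rough illustration of the XML prefix use case, the sketch below maps XML element names to prefixed command names. It is written in Python with the standard `xml.etree.ElementTree` module and is purely hypothetical: the `to_sil` function is not part of SILE, a real inputter would build an AST rather than a string, and attributes are ignored here for brevity.

```python
import xml.etree.ElementTree as ET


def to_sil(element, prefix=None):
    """Render an XML element as SIL-like text, prefixing every tag name."""
    name = f"{prefix}:{element.tag}" if prefix else element.tag
    parts = [element.text or ""]
    for child in element:
        parts.append(to_sil(child, prefix))
        # Text following a child element belongs to the parent's content.
        parts.append(child.tail or "")
    return "\\" + name + "{" + "".join(parts) + "}"


doc = ET.fromstring("<comment>xxx</comment>")
print(to_sil(doc, prefix="myscheme"))  # prints \myscheme:comment{xxx}
```

With no prefix the tag names pass through unchanged, which corresponds to the current behavior where `<comment>` would clash with the usual `\comment`.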