sile-typesetter / sile

The SILE Typesetter — Simon’s Improved Layout Engine
https://sile-typesetter.org
MIT License
1.61k stars 97 forks source link

lists package: enforceListType precludes XML handling #2073

Closed hlandau closed 2 weeks ago

hlandau commented 2 weeks ago
  self:registerCommand("ul", function (_, content)
    SILE.call("itemize", {}, content)
  end)

  self:registerCommand("li", function (_, content)
    SILE.call("item")
    SILE.process(content)
  end)
! Only 'enumerate', 'itemize' or 'item' are accepted in lists, found 'li' at:
stack traceback:
    [C]: in function 'error'
    ...xvdgvg1wcm9f8-sile-0.14.13/share/sile/core/utilities.lua:42: in function 'core.utilities.error'
    ...g1wcm9f8-sile-0.14.13/share/sile/packages/lists/init.lua:76: in upvalue 'enforceListType'
    ...g1wcm9f8-sile-0.14.13/share/sile/packages/lists/init.lua:177: in local 'func'
    ...bxvdgvg1wcm9f8-sile-0.14.13/share/sile/core/settings.lua:208: in function 'core.settings.temporarily'
    ...g1wcm9f8-sile-0.14.13/share/sile/packages/lists/init.lua:157: in function 'packages.lists.doNestedList'
    ...g1wcm9f8-sile-0.14.13/share/sile/packages/lists/init.lua:267: in field '?'
    ...wgxlbxvdgvg1wcm9f8-sile-0.14.13/share/sile/core/sile.lua:410: in function 'core.sile.call'
    ./ddbook.lua:236: in field '?'
    ...wgxlbxvdgvg1wcm9f8-sile-0.14.13/share/sile/core/sile.lua:410: in function 'core.sile.call'
    ...
    ...wgxlbxvdgvg1wcm9f8-sile-0.14.13/share/sile/core/sile.lua:245: in function 'core.sile.process'
    ./ddbook.lua:70: in field '?'
    ...wgxlbxvdgvg1wcm9f8-sile-0.14.13/share/sile/core/sile.lua:410: in function 'core.sile.call'
    ...wgxlbxvdgvg1wcm9f8-sile-0.14.13/share/sile/core/sile.lua:245: in function 'core.sile.process'
    (...tail calls...)
    ...wgxlbxvdgvg1wcm9f8-sile-0.14.13/share/sile/core/sile.lua:311: in function 'core.sile.processString'
    ...wgxlbxvdgvg1wcm9f8-sile-0.14.13/share/sile/core/sile.lua:351: in function 'core.sile.processFile'
    ...bgky8dwyr73jjm-lua-5.3.6-env/share/lua/5.3/pl/tablex.lua:351: in function 'pl.tablex.imap'
    ...e/hsf0naadybvws3wgxlbxvdgvg1wcm9f8-sile-0.14.13/bin/sile:56: in function <...e/hsf0naadybvws3wgxlbxvdgvg1wcm9f8-sile-0.14.13/bin/sile:55>
    [C]: in function 'xpcall'
    ...e/hsf0naadybvws3wgxlbxvdgvg1wcm9f8-sile-0.14.13/bin/sile:55: in main chunk
    [C]: in ?

error summary:
    Processing at: input.xml:502:12: in \itemize
    Using code at: ...g1wcm9f8-sile-0.14.13/share/sile/packages/lists/init.lua:76: Only 'enumerate', 'itemize' or 'item' are accepted in lists, found 'li'
Omikhleia commented 2 weeks ago

I could say that it does not really "preclude XML handling"... but just that your "ul" command could be implemented differently e.g. looping on its content to rebuild a suitable list of nodes. Whether this is a convenient approach is left to the package developer.

alerque commented 2 weeks ago

I agree the current forced tag name situation is not ergonomic and precludes pretty standard things we encourage in other areas (like creating a new command that wraps another one with extra handling pre-post). I ran into this problem myself and it made for some pretty awkward special casing. I'd like to see a better solution, perhaps even just reducing the current error to a warning and letting any breakage caused by incompatible content be allowed. Or perhaps allowed with some sort of override mechanism. Perhaps the package could take an initialization option that notes what commands are expected to behave like lists and list items.

Omikhleia commented 2 weeks ago

But I'm seeing that the lists package was monkeypatched to support BulletedList and OrderedList from the pandoc package -- breaking all separation of concerns on the way. So I beg to disagree.

Omikhleia commented 2 weeks ago

Perhaps the package could take an initialization option that notes what commands are expected to behave like lists and list items.

What makes you think that lists and list items have the same structural approach in any XML schema -- That would make such an option any useful?

alerque commented 2 weeks ago

The support for the Pandoc AST elements is not exactly a monkey patch, it is just first class support for a second possible AST node name. The core packages we distribute have lots of features built in explicitly to enable things in other packages (which is not great separation of concerns but technically not monkey patching either). The interfaces are gradually getting cleaner so that instead of say the list package just playing nice with the pandoc package, it should provide an API by which any 3rd party package could accomplish the same thing and register its own list-compatible commands. Monkey patching would have been the pandoc package (or a 3rd party one) overriding the functions already loaded by the lists package to provide the same function with its own bits added, thereby making the actual implementation of item or whatever come from the pandoc package instead of lists. SILE allows monkey patching (and incidentally that would be one other way to solve @hlandau's issue without changing their input document or having to generate ASTs on the fly) but we don't really have to do that in the core package set since we have access to both sides.

The AST used in Pandoc is a fully compatible subset of one of the two semantic structures allowed by the list package. <ul> and <li> elements in HTML are required to work close enough to the same way to also be an appropriate substitute. A more general approach of an HTML tags package that covered more tags and their SILE command equivalents would be a worthwhile endeavor, only supporting these two elements is kind of arbitrary.

There likely are other XML structures that are not compatible at all, but the vast majority I've seen that use ul/ol/li etc. are basically just slices out of the HTML spec dropped into some other larger XML schema.

Omikhleia commented 2 weeks ago

I beg to disagree even further, but this is not the right place for discussing it.

Omikhleia commented 2 weeks ago

Anyway, in the meantime, @hlandau, here is a possible implementation of "ul" working with the current code base:

<document>
<use module="packages/lists"/>
<lua>
local class = SILE.documentState.documentClass
class:registerCommand("ul", function (_, content)
   local items = {}
   for i = 1, #content do
     if type(content[i]) == "table" and content[i].command == "li" then
       content[i].command = "item"
       table.insert(items, content[i])
     end
   end
   SILE.call("itemize", {}, items)
end)
</lua>

<ul>
  <li>Item
    <ul>
      <li>Subitem</li>
    </ul>
  </li>
  <li>Item</li>
</ul>

</document>
Omikhleia commented 2 weeks ago

N.B. The performance impact on having to copy the content list is likely fairly insignificant with usual lists of items. The solution above is therefore a fairly obvious workaround that does not require any monkey-patching of the existing code base. As soon as you tackle with real-world XML schemas, you'll realize that AST munching of the content is often unavoidable, so it's not that bad a solution either -- e.g. neither DocBook lists nor TEI lists follow the HTML pattern, so I really wonder which "vast majority" of schemas do, out of curiosity.)

hlandau commented 2 weeks ago

@Omikhleia This works. Thank you!