Open thilomaurer opened 1 year ago
Raw HTML is always parsed as a RawInline or RawBlock element in the AST in pandoc's markdown variants.
I suppose we could look at an exception for gfm, which lacks any native syntax for sub/superscript. Because the gfm reader is based on commonmark, this would require a supplementary pass over the inline lists.
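As a rough illustration of what such a pass has to recognise: each tag arrives as its own raw HTML inline, so the first step is deciding whether a given inline is an opening or closing `<sub>`/`<sup>`. The helper below is only a sketch; the name `classify` is hypothetical (not part of pandoc), and it relies only on the `t`, `format` and `text` fields that the filter in this thread also uses, so plain tables can stand in for pandoc elements when trying it out.

```lua
-- Hypothetical helper: classify a raw HTML inline as an opening or
-- closing <sub>/<sup> tag.
-- Returns "open" or "close" plus the tag name, or nil for anything else.
local function classify(il)
  if il.t ~= "RawInline" or il.format ~= "html" then
    return nil
  end
  -- Match exactly "<sub>", "</sub>", "<sup>" or "</sup>".
  local slash, name = il.text:match("^<(/?)(su[bp])>$")
  if not name then
    return nil
  end
  return (slash == "/") and "close" or "open", name
end

-- Plain tables standing in for pandoc inlines (an assumption for this demo).
print(classify({t = "RawInline", format = "html", text = "<sub>"}))
print(classify({t = "RawInline", format = "html", text = "</sup>"}))
```

Tags with attributes or surrounding whitespace are deliberately not matched here, which mirrors the exact string comparisons used in the filter.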
Indeed, you could implement this with a Lua filter.
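-- Convert paired raw HTML <sup>/<sub> tags into Superscript/Subscript elements.
function Inlines(ils)
  local result = {}
  -- Stack of unclosed tags; each entry is {opening RawInline, collected contents}.
  local openers = {}
  for _, il in ipairs(ils) do
    local html = il.t == "RawInline" and il.format == "html"
    local new = nil
    if html and (il.text == "<sup>" or il.text == "<sub>") then
      -- Opening tag: start collecting its contents.
      table.insert(openers, {il, {}})
    elseif html and il.text == "</sup>" and openers[#openers] and openers[#openers][1].text == "<sup>" then
      local contents = table.remove(openers)
      new = pandoc.Superscript(contents[2])
    elseif html and il.text == "</sub>" and openers[#openers] and openers[#openers][1].text == "<sub>" then
      local contents = table.remove(openers)
      new = pandoc.Subscript(contents[2])
    else
      -- Anything else, including a closing tag that does not match, is kept as is.
      new = il
    end
    if new then
      -- Append to the innermost open tag if there is one, otherwise to the output.
      if #openers > 0 then
        table.insert(openers[#openers][2], new)
      else
        table.insert(result, new)
      end
    end
  end
  -- Emit unclosed tags and their contents unchanged, outermost first,
  -- so that the original order of the inlines is preserved.
  for _, contents in ipairs(openers) do
    table.insert(result, contents[1])
    for _, il in ipairs(contents[2]) do
      table.insert(result, il)
    end
  end
  return result
end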
Converting subscripts and superscripts from GFM to any format fails, with the exception of HTML as target. It seems the tags <sub> and <sup> are not properly parsed into the AST, as can be seen below for the native target.
GFM input
gfm.md
Examples:
Pandoc version