Closed SebRollen closed 2 years ago
I tried updating the line to also allow _block elements as children of div tags. This let me generate the Word document, but I ran into corruption issues when opening it. I would appreciate some pointers to understand why this does not work.
Hi @SebRollen, it's been a long while since I worked on Sablon. I don't have any direct pointers for you where to look but I can tell you that Sablon by no means tries to implement complete compliance with the HTML standard. HTML insertion is a compromise and contains lots of trade-offs. It is certainly not in a state where it could render any valid HTML you give it.
Without having looked closer, I'd assume that block elements cannot be nested in WordML. As both div and p tags translate to a WordML block, this results in the corruption you observe.
I would recommend that you transform the HTML before you feed it to Sablon.
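As a rough illustration of that kind of preprocessing, here is a minimal sketch that unwraps any div containing block-level children, so the p tags end up at the top level. The helper name `unwrap_block_divs` and the `BLOCK_TAGS` list are hypothetical and not part of Sablon. REXML is used only to keep the example self-contained, and it assumes well-formed markup; a more forgiving parser such as Nokogiri is probably a better fit for real-world HTML.

```ruby
require "rexml/document"

# Tags treated as block-level in this sketch. This list is an assumption,
# not something read from Sablon's configuration.
BLOCK_TAGS = %w[div p ul ol h1 h2 h3 h4 h5 h6 table].freeze

# Hypothetical helper: true if the element is a div with a block-level child.
def block_div?(element)
  element.name == "div" &&
    element.children.any? { |c| c.is_a?(REXML::Element) && BLOCK_TAGS.include?(c.name) }
end

# Hypothetical helper: replaces each div that wraps block-level children
# with those children, repeating until no such div remains.
def unwrap_block_divs(html)
  # Wrap in a synthetic root so a fragment with several top-level tags parses.
  doc = REXML::Document.new("<root>#{html}</root>")
  while (div = REXML::XPath.match(doc, "//div").find { |d| block_div?(d) })
    parent = div.parent
    # Move every child in front of the div, then drop the now-empty div.
    div.children.each { |child| parent.insert_before(div, child) }
    div.remove
  end
  doc.root.children.map(&:to_s).join
end

puts unwrap_block_divs("<div><p>one</p><p>two</p></div>")
```

Divs that only hold inline content are left untouched. Loose text that sits next to a block child inside a div would end up outside any tag after unwrapping, so that case would need its own handling.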
@SebRollen that is expected behavior. WordML has more rules and is less forgiving than HTML, so some concessions needed to be made to allow mapping from HTML to WordML. I suppose one could tweak the code to treat any child of a block tag as an inline tag; then you'd be allowed to write <div><p>[content]</p></div> and it would render as if it were <div><span>[content]</span></div>. I haven't looked at the code in ages, but I suspect there might be a few other "gotchas" in adding that flexibility.
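The same effect can be approximated outside Sablon by rewriting the HTML first. The sketch below renames p and div elements nested inside a div to span, which mirrors the rendering described above. It is not a patch to Sablon: `demote_nested_blocks`, `demote_children!`, and `DEMOTED_TAGS` are made-up names, and the input is assumed to be well-formed.

```ruby
require "rexml/document"

# Block-level tag names to demote when nested in a div.
# An assumption for this sketch, not taken from Sablon.
DEMOTED_TAGS = %w[p div].freeze

# Hypothetical helper: walks the tree and renames nested block elements.
def demote_children!(element, inside_div = false)
  element.each_element do |child|
    nested = inside_div || element.name == "div"
    # Recurse before renaming so the check still sees the original tag names.
    demote_children!(child, nested)
    child.name = "span" if nested && DEMOTED_TAGS.include?(child.name)
  end
end

# Hypothetical helper: returns the HTML with nested blocks turned into spans.
def demote_nested_blocks(html)
  doc = REXML::Document.new("<root>#{html}</root>")
  demote_children!(doc.root)
  doc.root.children.map(&:to_s).join
end

puts demote_nested_blocks("<div><p>[content]</p></div>")
```

Note that this drops the paragraph breaks between sibling p tags inside the div, which is one of the trade-offs of treating them as inline content.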
@senny @stadelmanma Thank you both, I think I was focusing too much on the HTML side of things rather than the WordML standard - just because something is allowed from one side obviously doesn't mean there's a direct mapping to the other standard.
We've found a way to work around this issue by tweaking our HTML slightly, so I'll close the issue as this doesn't necessarily seem like something that could or should be fixed in Sablon.
Thanks again!
Ran into this issue today, where Sablon would not convert HTML where a p tag was a child element of a div tag. This was surprising to me, but I can see that the config file indeed says that div tags can only hold inline elements: https://github.com/senny/sablon/blob/master/lib/sablon/configuration/configuration.rb#L63
This seems inconsistent with the HTML standard, which lists that any flow content (which includes p) can be a child of div tags: https://html.spec.whatwg.org/#the-div-element