DeepBlueCLtd / LegacyMan

Legacy content for Field Service Manual
https://deepbluecltd.github.io/LegacyMan/index.html
Apache License 2.0

Issue with htmlToDita - single-top-level elements #574

Closed IanMayo closed 11 months ago

IanMayo commented 11 months ago

We're now processing category tables using htmlToDita.

This has introduced a problem. In the past, htmlToDita normally received a large chunk of HTML, which it processed in bulk using find_all.

Our new category processing logic loops through the table, and sends each element of a table cell to htmlToDita.

That means that htmlToDita is receiving individual elements like <br>, <span> and <img>.

The method doesn't currently check whether the whole soup object is itself a single span element, so it leaves the span unchanged in the exported content, and DITA validate and publish both complain: [screenshot]

Our span processing loops through all the span elements it can find: [screenshot]
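For anyone reproducing this, here is a minimal sketch of why the loop misses the element. It assumes the element passed in is a BeautifulSoup Tag and that the processing calls find_all on it. The variable names and the sample HTML are made up for illustration and are not taken from the htmlToDita source.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for one element taken out of a table cell.
fragment = BeautifulSoup("<span>Cell text</span>", "html.parser")
span = fragment.span  # the Tag itself, as the category loop would pass it on

# find_all on a Tag only searches its descendants, never the Tag itself,
# so a loop over this result has nothing to process.
matches_from_tag = span.find_all("span")

# Searching from the enclosing document does find the span.
matches_from_soup = fragment.find_all("span")

print(len(matches_from_tag), len(matches_from_soup))
```

When the span is the top-level object, the first search comes back empty, which matches the behaviour described above: the span passes through untouched.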

I think we need to change this processing so that if the soup is itself a single span element (soup.name.lower() == "span"), we create an array containing just that element. Otherwise we call find_all and store the results in an array. Either way, we then loop through the array.

I've just re-checked the target data, and img elements are triggering this, too.
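A rough sketch of that check, written so the same logic can serve both span and img. The helper name elements_to_process and its tag_name parameter are hypothetical, not existing htmlToDita code, and it assumes the input is always a BeautifulSoup Tag or soup object.

```python
from bs4 import BeautifulSoup


def elements_to_process(soup, tag_name):
    """Return the tags named tag_name that the caller should loop over.

    Hypothetical helper: if soup is itself a tag_name element, return it
    in a one-item list; otherwise fall back to find_all, as before.
    Elements nested inside a top-level match are not returned.
    """
    if soup.name and soup.name.lower() == tag_name:
        return [soup]
    return soup.find_all(tag_name)


# Single top-level elements, as sent by the category table loop.
single_span = BeautifulSoup("<span>a</span>", "html.parser").span
single_img = BeautifulSoup('<img src="x.png">', "html.parser").img

# A larger block of HTML, as htmlToDita received in the past.
bulk = BeautifulSoup("<div><span>a</span><span>b</span></div>", "html.parser")

for tag in elements_to_process(single_span, "span"):
    print("span to convert:", tag)
for tag in elements_to_process(single_img, "img"):
    print("img to convert:", tag)
print(len(elements_to_process(bulk, "span")), "spans found in bulk input")
```

The existing loops would then iterate over the returned list rather than calling find_all directly, so the bulk path should behave as it does today.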

A set of mock code that triggers this failure is in PR #572