Open jhubert opened 7 years ago
OK, I will investigate this issue and try to come up with a PR. If there are any suggestions, let me know.
Btw @jhubert, what is the desired output for this? Should there be no margin at all? I don't see any margin when opening the file in LibreOffice, as in your first screenshot. But when converting from LibreOffice to HTML I get:
which is a little different from what we have with pydocx.
I think the desired output is that the inset matches the word document. For this simple case, that should just mean removing the margin on the inner span.
Hm, but couldn't there be cases where we actually need this margin there?
There are definitely more complex cases, and I don't think any of them are being handled properly. Here are some examples.
When the word document has this:
The HTML output is this:
In the first case, the nested list items are getting margin added to the content of each item but the bullet should be in line with the headline. Basically everything is wrong.
In the second case, the list should have a negative margin so the list items match the indent of the headline.
In the third case, the list should have additional margin so that it's inset more into the page than the headline.
I would call these more or less edge cases... the only one that really feels broken when looking at it is:
So, that's probably worth spending the most time on. If the rest of them get solved in the process, hurrah! 💯
@jhubert can you also attach the .docx files for the examples you mentioned, so we have some for tests? Thanks.
I just don't understand when we need to ignore this margin and when we should not. Maybe @winhamwr @kylegibson can give some advice on this.
@botzill I can't think of a time where we would want the margin next to the list item. If anything, I think there would be a case where we want the margin on the whole `ul`.
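To make the two placements concrete, here is a rough sketch of the output shape I would expect. It is plain string building, not pydocx's exporter code; `render_list` and its `margin_left_pt` parameter are hypothetical names used only for illustration. The indent goes on the `ul` itself and the `li` content is left alone:

```python
from html import escape


def render_list(items, margin_left_pt=0.0):
    """Render a bullet list, putting any indent on the <ul> itself
    instead of on a <span> inside each <li>."""
    # Only emit a style attribute when there is a non-zero indent.
    style = f' style="margin-left:{margin_left_pt:g}pt"' if margin_left_pt else ""
    body = "".join(f"<li>{escape(item)}</li>" for item in items)
    return f"<ul{style}>{body}</ul>"


print(render_list(["One", "Two"], margin_left_pt=36))
```

A negative value would pull the list back out toward the headline, which seems to be what the second of the more complex cases needs.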
When certain docx files that have adjusted margins get imported, the resulting HTML places the margin in the wrong place. This results in oddly formatted HTML.
For example, here are two lists in word:
The first list has been indented; the second one has the standard doc indentation.
Here is the result in HTML:
The resulting HTML has a span inside the `li` with a `margin-left` set on it:
It seems that the whole `ul` should have the margin, if anything at all.
Here is the sample file: list-item-margin.docx
And here is the cleaned-up docx source from the `document.xml` file:
The difference seems to be the existence of the `<w:ind w:left="720"/>` value, which I'm assuming is telling pydocx to add an indentation.
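For reference, `w:left` on `w:ind` is measured in twentieths of a point, so 720 works out to 36pt (half an inch). Below is a minimal sketch of reading that value with the standard library. It assumes a paragraph fragment shaped like the one above; the inline XML and the helper name `left_indent_pt` are made up for illustration and are not pydocx internals:

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
TWIPS_PER_POINT = 20  # w:ind values are twentieths of a point

# Hypothetical fragment for illustration, not copied from list-item-margin.docx.
SAMPLE = f"""
<w:p xmlns:w="{W_NS}">
  <w:pPr>
    <w:ind w:left="720"/>
  </w:pPr>
</w:p>
"""


def left_indent_pt(paragraph_xml):
    """Return the paragraph's left indent in points, or 0.0 if none is set."""
    root = ET.fromstring(paragraph_xml)
    ind = root.find(f"{{{W_NS}}}pPr/{{{W_NS}}}ind")
    if ind is None:
        return 0.0
    twips = int(ind.get(f"{{{W_NS}}}left", "0"))
    return twips / TWIPS_PER_POINT


print(left_indent_pt(SAMPLE))  # 36.0
```

Whatever ends up consuming this value could then decide where to apply it, for example on the list element instead of on a span inside each item.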