Conal-Tuohy / VMCP-upconversion

Ferdinand von Mueller's correspondence upconversion from MS Word to TEI XML
Apache License 2.0

Handle tables with formatting and column-spanning cells #12

Closed Conal-Tuohy closed 3 years ago

Conal-Tuohy commented 7 years ago

e.g. the table in 44-08-04 should look like

[Screenshot: table]

Conal-Tuohy commented 7 years ago

Clearly we need to capture the grid-lines from the Word tables that have them, and display them in XTF.

I would say that where a table has no explicit cell borders, it should still be displayed in XTF with some faint, purely indicative gridlines. In the screenshot above (which I took of part of the page as displayed in OpenOffice Writer), there are faint gridlines that look grey to me, or maybe dotted. Should we display something like that? It seems valuable to me to have something, purely to be able to check the layout of the table. Thoughts?

NB at present the cell borders are not being captured in the TEI (the conversion process ignores them). All tables are therefore displayed identically in XTF. The actual formatting used is not even specified by XTF, so the cell borders you see are defaults supplied by whatever browser you're using.

LucasHorseshoeBend commented 7 years ago

The actual formatting used is not even specified by XTF, so the cell borders you see are defaults supplied by whatever browser you're using.

That is interesting. Both Safari (which is not my normal browser: my OS is not the most recent, so Safari is no longer routinely updated with fixes) and Firefox (which I keep as up-to-date as my OS allows) show a layout similar to that shown for the Safari version.

[Image: tables display safari]

LucasHorseshoeBend commented 7 years ago

I thought I had sussed out how to control tabular layout, but have a look at http://vmcp.conaltuohy.com/xtf/view?docId=tei/1880-9/1887/87-08-17a-final.xml as an example of a problem I have not solved: the ditto marks should be in the RH column. The column holding the brace is being treated as a column only in the top row. Advice?

Conal-Tuohy commented 7 years ago

I can see the document in XTF, and it certainly does look wrong. But I can't see the source (doc) file in my local dropbox yet. I am syncing now and I'll take a look when it's done.

Conal-Tuohy commented 7 years ago

OK, there was certainly something wrong in the conversion from OpenDocument to TEI: the attribute of the cell which specified how many rows it spanned was ignored, so the TEI table was broken. I've fixed that now, and am re-running the TEI conversion.
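
For anyone following along, here is a minimal Python sketch of the kind of mapping involved. It is not the project's actual conversion code, which is not shown in this thread. It assumes the OpenDocument attributes `table:number-rows-spanned` and `table:number-columns-spanned` are carried over to the TEI `rows` and `cols` attributes on `cell`; the function name `odf_row_to_tei` is hypothetical.

```python
import xml.etree.ElementTree as ET

ODF_TABLE = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
ODF_TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
TEI = "http://www.tei-c.org/ns/1.0"

# OpenDocument spanning attribute -> TEI attribute on <cell>
SPAN_ATTRS = {
    f"{{{ODF_TABLE}}}number-rows-spanned": "rows",
    f"{{{ODF_TABLE}}}number-columns-spanned": "cols",
}


def odf_row_to_tei(odf_row):
    """Convert one table:table-row to a TEI row (hypothetical helper)."""
    tei_row = ET.Element(f"{{{TEI}}}row")
    for child in odf_row:
        # table:covered-table-cell only marks a grid position hidden by a
        # spanning cell; it has no counterpart in the TEI row, so skip it.
        if child.tag != f"{{{ODF_TABLE}}}table-cell":
            continue
        cell = ET.SubElement(tei_row, f"{{{TEI}}}cell")
        for odf_name, tei_name in SPAN_ATTRS.items():
            span = child.get(odf_name)
            # A span of 1 is the default, so only record larger values.
            if span is not None and int(span) > 1:
                cell.set(tei_name, span)
        # Flatten the cell's paragraphs to plain text for this sketch.
        cell.text = " ".join(
            "".join(p.itertext()) for p in child.iter(f"{{{ODF_TEXT}}}p")
        )
    return tei_row
```

If the row-span attribute is dropped, as it was in the bug described above, a cell meant to span several rows (such as the brace) occupies a column in its own row only, and the cells in the rows beneath it shift into the wrong column.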

Conal-Tuohy commented 7 years ago

That seems to have sorted the row-spanning cells! I also added multi-column cells and multi-row cells as "features".

Incidentally I spotted a very strange looking table here: http://vmcp.conaltuohy.com/xtf/view?docId=tei/1860-9/1862/62-07-04-final.xml;chunk.id=main;toc.depth=1;toc.id=;brand=default -- I'm not sure what to make of it.

LucasHorseshoeBend commented 7 years ago

Thanks; that has sorted the cell problem.

http://vmcp.conaltuohy.com/xtf/view?docId=tei/1860-9/1862/62-07-04-final.xml was set out by colleagues to preserve appearance, not logic! We had not looked at it for two years. I will rework it, but it will need a bit of thought. I now need to use the new feature category to check what else we have set at final, to pick up other cases like it.

LucasHorseshoeBend commented 7 years ago

I think this is now under control, and cases are detectable through the diagnostics. From my point of view the issue can be closed. I've almost finished dealing with them in the "final" set, and know how to handle others as they come up as we finalize our editing.

Conal-Tuohy commented 7 years ago

The grid layout is now correct, but do you not also want the table gridlines converted? At the moment the lines themselves are lost in the conversion.
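
As a rough illustration of what converting the gridlines could involve, here is a minimal Python sketch. It is not the project's implementation, which this thread does not show. It assumes the borders live in OpenDocument `table-cell` styles as `fo:border` (and per-side `fo:border-*`) attributes on `style:table-cell-properties`; the function name `cell_border_css` is hypothetical.

```python
import xml.etree.ElementTree as ET

STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
FO = "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"

# Border properties to look for, in the order they are emitted.
BORDER_PROPS = (
    "border", "border-top", "border-right", "border-bottom", "border-left",
)


def cell_border_css(styles_root):
    """Map each table-cell style name to a CSS border declaration string."""
    css_by_style = {}
    for style in styles_root.iter(f"{{{STYLE}}}style"):
        if style.get(f"{{{STYLE}}}family") != "table-cell":
            continue
        props = style.find(f"{{{STYLE}}}table-cell-properties")
        if props is None:
            continue
        decls = []
        for name in BORDER_PROPS:
            value = props.get(f"{{{FO}}}{name}")
            if value is not None:
                decls.append(f"{name}: {value}")
        # Styles with no border information at all are left out of the map.
        if decls:
            css_by_style[style.get(f"{{{STYLE}}}name")] = "; ".join(decls)
    return css_by_style
```

Each cell's `table:style-name` could then be looked up in the returned dictionary and the resulting string written to the TEI cell, for instance in its `style` attribute, so that XTF has something explicit to render instead of the browser defaults.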

LucasHorseshoeBend commented 7 years ago

Yes, that is needed. Apologies for not reading back to the beginning of the issue.

Conal-Tuohy commented 3 years ago

Checked and verified that the table outlines are now being captured, so belatedly closing this issue. Example: http://vmcp.conaltuohy.com/xtf/view?docId=tei/Mueller%20letters/1870-9/1871/71-05-00a.xml;chunk.id=main;toc.depth=1;toc.id=;brand=default