Conal-Tuohy / VMCP-upconversion

Ferdinand von Mueller's correspondence upconversion from MS Word to TEI XML
Apache License 2.0

Timing problem with connections?? #18

Closed LucasHorseshoeBend closed 7 years ago

LucasHorseshoeBend commented 7 years ago

It has just struck me that we might have a potential problem working as we are, but I also hope/think that it will not turn out to be a problem.

When, as you have been doing, you fix the problem cases in files we have not yet finished working on, we lose the ability to identify the files that have problems, and we might miss them as we finalize them.

For example, http://vmcp.conaltuohy.com/xtf/view?docId=tei/1880-9/1887/87-08-17b.xml, as it is in your files at the moment, has had a table introduced by your processing that was not there in the original source file. So it will not appear as a problem case anymore with even tabulation. This file has other problems (standard style instead of letter style in part of the text, and uneven tabs), but I will not change it at this stage until you have looked at it and compared it with the source file to see what I mean.

Perhaps this will not be a problem, because even if we missed fixing it, it would be fixed again when it is reloaded? Am I right?

If I am not right, the solution would be to work on just those files that have the suffix -final.xml. I hope I am right, because we should then be in a position to catch every problem when it is loaded as final, even if we as editors miss correcting it.

Conal-Tuohy commented 7 years ago

I don't think this is a problem, but I think you are operating under a misapprehension. The table visible in the XTF page you cited is not one introduced by my processing. In the version of the Word document which I have, it is already a table. In fact I haven't yet implemented the processing step which will convert tabular layouts into tables, so currently any table visible in XTF is an expression of a Word table.

The document is tagged with "uneven tabulation" because of the lines at the end, beginning "Proposed" and "Elected". They are consecutive paragraphs containing tab characters (other than at the start of the paragraph), but the number of tab characters is not the same in each. Because the word "Elected" is slightly shorter than "Proposed", it needed one more tab character to align the subsequent date with the date in the row above.
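The rule described above can be sketched in a few lines. This is only an illustration of the logic as stated in this comment, not the project's actual code; the function names `interior_tab_count` and `has_uneven_tabulation` are hypothetical, and treating each paragraph as a plain string with literal tab characters is an assumption.

```python
def interior_tab_count(paragraph: str) -> int:
    """Count tab characters, ignoring any tabs at the start of the paragraph."""
    return paragraph.lstrip("\t").count("\t")


def has_uneven_tabulation(paragraphs: list[str]) -> bool:
    """Return True if two consecutive paragraphs both contain interior tabs
    but do not contain the same number of them."""
    counts = [interior_tab_count(p) for p in paragraphs]
    # Compare each paragraph with the one that follows it.
    return any(
        a > 0 and b > 0 and a != b
        for a, b in zip(counts, counts[1:])
    )


# Placeholder text standing in for the two lines discussed above:
# "Elected" is followed by one more tab than "Proposed".
lines = ["Proposed\t<date>", "Elected\t\t<date>"]
print(has_uneven_tabulation(lines))
```

Under these assumptions the two placeholder lines are flagged, because the first has one interior tab and the second has two. Paragraphs separated by an untabbed paragraph are not compared with each other, since the rule only looks at consecutive pairs.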

So I don't see any reason for concern, and I think you should feel free to edit any of the files whenever you need to.

LucasHorseshoeBend commented 7 years ago

Thanks. I will not stall on this.