Conal-Tuohy / VMCP-upconversion

Ferdinand von Mueller's correspondence upconversion from MS Word to TEI XML
Apache License 2.0

Apparent failure to update files #32

Closed: LucasHorseshoeBend closed this issue 7 years ago

LucasHorseshoeBend commented 7 years ago

The same files I checked last night are still showing the problems after the midnight update. The problem file you identified with the very strange table structure has been restructured, but the effects are not apparent: http://vmcp.conaltuohy.com/xtf/view?docId=tei/1860-9/1862/62-07-04-final.xml

A file that had a very odd appearance in XTF (a section was coloured as if it were a footnote) turned out to have a very odd Word style structure: that part of the file was reported as having no style! This was redone so that the style structure is correct: http://vmcp.conaltuohy.com/xtf/view?docId=tei/1890-6/1891/91-06-29-final.xml

Two files that showed a very strange set of symbols in a footnote were redone, by deleting and re-entering the footnotes concerned: http://vmcp.conaltuohy.com/xtf/view?docId=tei/1880-9/1887/87-06-16-final.xml http://vmcp.conaltuohy.com/xtf/view?docId=tei/1880-9/1887/87-06-18-final.xml

There has also been, I think, another change of behaviour. I did not keep an exact record, so I can't be 100% sure, but I think my memory of the previous behaviour is pretty good because I relied on the feature that is now lost. Now when I enter a search term such as 87.10.26, only the file with that exact name is returned. Previously the list of results would also have included 87.10.26a, 87.10.26b and 87.10.26c. At present I can only get the files with suffixes if I ask for them specifically, which presupposes that I know they exist.

I suspect from this search case that the change is a result of the tweak to handle deleted files, and the timing suggests that the failure to load the changes probably also has something to do with that. This is of course just an hypothesis about what is happening inside what is to me a black box (I have no desire to learn how to make that box transparent to me!). I hesitated before suggesting it, in case I started a red-herring chase, but in the end I thought it worth suggesting.

Conal-Tuohy commented 7 years ago

OK, I've found the bug, which I did introduce along with the file deletion functionality. It's a small thing: trying to read a file from the wrong folder.

I will see if I can fix it now before I go to bed.

LucasHorseshoeBend commented 7 years ago

Thanks, but remember "more haste, less speed" so don't lose your sleep!

Conal-Tuohy commented 7 years ago

It's running now and it looks like it's updating the files correctly.

I had altered the bit of the pipeline that reconciles the Word documents with the OpenDocument documents derived from them. Before, it checked each Word document to see if it was newer than the corresponding OpenDocument version; if so, it ran the OpenOffice converter to regenerate the OpenDocument file and then set its date to match the Word file's date. I changed it so that it also runs through all the OpenDocument files and checks that there is still a Word version of each one; if not, it deletes the OpenDocument file.

In the process, I'd buggered up all the path names. I was joining the base directory Dropbox/Rod shared/Mueller letters/ to the path name of each Word file, but that path name also started from the Mueller letters folder, i.e. Mueller letters/1860-9/1862/62-07-04-final.doc. The result was a complete path name of Dropbox/Rod shared/Mueller letters/Mueller letters/1860-9/1862/62-07-04-final.doc (which of course doesn't exist), and the upshot was that things weren't getting updated at all.
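For anyone following along, here is a rough Python sketch of the reconciliation step and of the path mistake described above. It is an illustration only, not the actual pipeline code: the function names `resolve` and `reconcile`, the `convert` callback (a stand-in for the OpenOffice converter) and the `.doc`/`.odt` extensions are all assumptions made for the example.

```python
import os
from pathlib import Path, PurePosixPath

# Base directory as quoted in the comment above.
BASE = PurePosixPath("Dropbox/Rod shared/Mueller letters")


def resolve(base: PurePosixPath, relative: PurePosixPath) -> PurePosixPath:
    """Join a base directory and a path that must be relative to that base."""
    return base / relative


# The bug: the "relative" path already starts at the Mueller letters folder,
# so the folder name appears twice and the resulting path does not exist.
buggy = resolve(BASE, PurePosixPath("Mueller letters/1860-9/1862/62-07-04-final.doc"))
# The fix: the relative path starts below the base directory.
fixed = resolve(BASE, PurePosixPath("1860-9/1862/62-07-04-final.doc"))


def reconcile(word_dir: Path, odt_dir: Path, convert):
    """Regenerate stale OpenDocument files, then delete orphaned ones.

    `convert(src, dest)` is a hypothetical stand-in for the real converter.
    Returns (regenerated, deleted), both as paths relative to their roots.
    """
    regenerated, deleted = [], []
    # Pass 1: regenerate any OpenDocument file that is missing or older
    # than the Word document it is derived from.
    for doc in sorted(word_dir.rglob("*.doc")):
        rel = doc.relative_to(word_dir)
        odt = (odt_dir / rel).with_suffix(".odt")
        if not odt.exists() or doc.stat().st_mtime_ns > odt.stat().st_mtime_ns:
            odt.parent.mkdir(parents=True, exist_ok=True)
            convert(doc, odt)
            # Give the derived file the source's date so the next run skips it.
            st = doc.stat()
            os.utime(odt, ns=(st.st_atime_ns, st.st_mtime_ns))
            regenerated.append(rel)
    # Pass 2: delete any OpenDocument file whose Word source no longer exists.
    for odt in sorted(odt_dir.rglob("*.odt")):
        rel = odt.relative_to(odt_dir)
        if not (word_dir / rel).with_suffix(".doc").exists():
            odt.unlink()
            deleted.append(rel)
    return regenerated, deleted
```

With a doubled folder name as in `buggy`, every lookup of a Word file misses, which is consistent with nothing being updated.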

Thanks for your patience! I'll take a look at the search issue tomorrow. I don't know what would cause that, but perhaps it's another upshot of this same problem, in which case it may be solved too.

Conal-Tuohy commented 7 years ago

It looks good! The weird table is now beautifully clear.

Regarding searching, I'm puzzled as to why it would be behaving differently, since I can't think of anything I've changed that would affect the search indexing. XTF does have a bunch of different search options, and I don't claim to understand them all, but in fact I wouldn't have expected the search to return those broader matches, as you described, unless you appended a * (meaning 'with any suffix') to your search term. A search for 87.10.26*, for example, does at least work as you'd hope. Will that do you? The other, "clickier" option is to use the "Date" facet and click your way through the decade, year, and month down to the day in question.
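To make the difference between the exact and the wildcard search concrete, here is a small Python sketch using the standard `fnmatch` module. It only mimics the 'with any suffix' meaning of `*`; it is not how XTF implements searching, and the list of names and the `search` helper are made up for the example.

```python
from fnmatch import fnmatchcase


def search(names, term):
    """Return the names matching `term`, where * stands for any suffix."""
    return [name for name in names if fnmatchcase(name, term)]


# Hypothetical letter identifiers, following the pattern used in this thread.
names = ["87.10.25", "87.10.26", "87.10.26a", "87.10.26b", "87.10.26c", "87.10.27"]

exact = search(names, "87.10.26")   # only the name that matches exactly
wild = search(names, "87.10.26*")   # that name plus the a, b and c variants
```

Here `exact` holds only `87.10.26`, while `wild` also includes `87.10.26a`, `87.10.26b` and `87.10.26c`.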

Finished my cup of tea; I'm off to bed with a good book!

LucasHorseshoeBend commented 7 years ago

I hope that it is "Good morning!" when you see this. Now working very well! Thanks. Appending a wild card is fine, and beats clicking any day.