Conal-Tuohy / VMCP-upconversion

Ferdinand von Mueller's correspondence upconversion from MS Word to TEI XML
Apache License 2.0

Editions #23

Closed LucasHorseshoeBend closed 7 years ago

LucasHorseshoeBend commented 7 years ago

I suggest that we do not load another edition from the dropbox into the pipeline until the new year. We will have a number of issues clarified or corrected in the final set by then, and there will likely be a bit of a slowdown if Rod takes time off in January. I probably won't let up, as the weather is more conducive to desk work than even garden work!

If you are planning to have a break, the best time to reload would be when you get back to this project.

Conal-Tuohy commented 7 years ago

Hi Arthur!

Rather than follow a manual process, as I've done so far, to update XTF with a new edition of the source files, I had intended to connect up dropbox as the input to the automated pipeline (and in fact I'm already part way through setting this up). The idea is that the whole conversion and XTF reindexing can then be scheduled to run automatically, as often as we like (e.g. nightly), to keep XTF up to date without any effort each time. It's a bit of work to set up, but I estimate it will be less work than doing 2 further manual "editions", and it will give us "continuous delivery" instead.

LucasHorseshoeBend commented 7 years ago

If it is that easy, then do it as a continual flow. But what will happen when I am working on the files when you are sleeping?

Conal-Tuohy commented 7 years ago

I've noticed you're a night owl! ;-)

I'm not 100% sure about Dropbox's synchronisation, but I expect that the state of each of the synchronised files should at least be consistent at all times (i.e. the pipeline won't be fed a "half-synchronised" file), if that's what you're worrying about. The worst case is that a file will disappear from XTF for a while and come back again later when synchronisation is finished.
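If a "half-synchronised" file did turn out to be a risk, one possible guard is to skip any file whose size or modification time is still changing. The following is only a minimal sketch of that idea, not part of the actual pipeline; the names `is_stable` and `stable_files` and the two-second wait are hypothetical.

```python
import os
import time


def is_stable(path, wait_seconds=2.0, sleep=time.sleep):
    """Return True if the file's size and mtime are unchanged across a short wait.

    Hypothetical helper: a file that is missing, or that changes while we
    wait, is treated as "still synchronising" and reported as not stable.
    """
    try:
        before = os.stat(path)
        sleep(wait_seconds)
        after = os.stat(path)
    except FileNotFoundError:
        return False
    return (before.st_size, before.st_mtime_ns) == (after.st_size, after.st_mtime_ns)


def stable_files(paths, wait_seconds=2.0, sleep=time.sleep):
    """Keep only the paths that currently look safe to feed to a conversion run."""
    return [p for p in paths if is_stable(p, wait_seconds, sleep)]
```

Files skipped on one run would simply be picked up by the next scheduled run, which matches the "disappear for a while and come back later" worst case described above.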

LucasHorseshoeBend commented 7 years ago

That's reassuring. I was more concerned about disrupting the transfer process. So go ahead. I am a bit of a night owl, plus of course there is the time difference between Brisbane and SW England.

LucasHorseshoeBend commented 7 years ago

It would be useful to update the edition fairly soon, using the current dropbox letters folder. We have cleaned many of the files identified by your exceptionally useful analytic displays, and I would like to be sure that we have been effective.

Conal-Tuohy commented 7 years ago

Taking a bit longer because I realised that I needed to reinstall everything on a new server :-(

I'd installed XTF on a server running Amazon Linux, but it turns out Amazon Linux isn't compatible with OpenOffice, which I need in order to automate the conversion of the Word documents into OpenDocument format, the first step in the conversion process to TEI. Amazon Linux deliberately lacks some of the subsystems which OpenOffice depends on, because it is an operating system optimized exclusively for internet servers, whereas OO is primarily a desktop application.
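For illustration only, the Word-to-OpenDocument step could be scripted along the following lines. This sketch assumes a LibreOffice-style `soffice` command line (`--headless`, `--convert-to`, `--outdir`); Apache OpenOffice may need a different invocation, and the thread does not say how the conversion is actually driven. The function names are hypothetical.

```python
from pathlib import Path


def word_documents(folder):
    """List the .doc and .docx files under a folder, in a stable sorted order."""
    return sorted(
        p for p in Path(folder).rglob("*")
        if p.is_file() and p.suffix.lower() in {".doc", ".docx"}
    )


def build_convert_command(doc_path, out_dir, soffice="soffice"):
    """Build (but do not run) a headless Word-to-ODT conversion command.

    Assumption: the flags are those of LibreOffice's soffice binary.
    """
    return [
        soffice,
        "--headless",
        "--convert-to", "odt",
        "--outdir", str(out_dir),
        str(doc_path),
    ]
```

Each command could then be passed to `subprocess.run(cmd, check=True)` on a machine where the office suite is installed; building the command separately keeps the sketch testable without it.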

So I've created a new VM, based on Ubuntu Linux, and installed the software there. I have set up Dropbox and got it to download the documents, and I've set up the conversion pipeline. I have to finish configuring XTF, and configure the web server, and we'll be back in business.

Once the new server is running, I'll reassign the domain name vmcp.conaltuohy.com to the new server and shut the old one down.

Conal-Tuohy commented 7 years ago

The documents are converting now, while I go out to the movies!

LucasHorseshoeBend commented 7 years ago

Thanks. Hope the movie was up to expectations!

Conal-Tuohy commented 7 years ago

Movie was OK (Rogue One). Documents still converting! It's got up to 1887.

Conal-Tuohy commented 7 years ago

I have finished setting up the new server and switched the domain name vmcp.conaltuohy.com over to point to the new server. I've shut the old server down (though I haven't killed it off altogether yet). Let me know if anything seems to have gone bung, but all going well, we should see the site automatically update itself daily (and I can do it more frequently if that's a help). I'm closing this ticket now, but feel free to reopen it if the automation isn't working, or create a new ticket if some previously-existing feature has gone missing in the process of migrating to the new server.