Conal-Tuohy / VMCP-upconversion

Ferdinand von Mueller's correspondence upconversion from MS Word to TEI XML
Apache License 2.0

Apparatus files #24

Open LucasHorseshoeBend opened 7 years ago

LucasHorseshoeBend commented 7 years ago

Thinking ahead a little bit.

The apparatus files contain items that will need different treatment. Initially I want to work on the two bibliographies: the one containing Mueller's publications, and the editors' citations. I will work first on the Mbibliography.

My understanding of the teleconference session Rod and I had with Gavan is that to make it possible, or at least to facilitate, linking of the citation to Mueller's publications we need essentially a discrete item for each publication, which I assume to mean a database with each publication listed as a discrete record.

We could do a cheap and nasty database by treating the Mbibliography as a tab-delimited file and importing it into a database, as I have verified by importing a sample, saved as .txt, into FileMaker Pro. That gives just two fields, the reference number and the publication details. However, since I need to do a great deal of cleaning of that file, I think we can do much better by inserting tabs in the right places to produce a tab-delimited file with the following fields:

reference number [I would add a B in front of this so that it matches exactly what is in the footnotes, which presumably would make your linking algorithm simpler]
Author [This will be Mueller, or Mueller and another. In the current file no author is given unless joint, and these are coded as yy14.xx. I could omit Mueller, and the second field would then become "additional author", but it is easy enough to insert Mueller as the author by using an all-change command, so that each record is a complete bibliographic record. I assume that the original publication upon which we built omitted Mueller as "obvious" and/or to save space in a printed version]
title
publication details [these currently omit the year of publication because that is included in the code; to complete a bibliographic record in a standard form the year would need to be inserted, but I think manually, so I will rely on the code to give the year]
notes
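As a rough illustration of what a five-field tab-delimited file would give us, here is a minimal Python sketch that reads such a file into one record per publication. The field names, the `read_bibliography` function and the sample row are all hypothetical and made up for illustration; they are not taken from the actual Mbibliography.

```python
import csv
import io

# Hypothetical field names following the list above; the real export may differ.
FIELDS = ["ref", "author", "title", "publication_details", "notes"]

def read_bibliography(text):
    """Parse tab-delimited text into one dict per publication."""
    # QUOTE_NONE keeps quotation marks inside titles as ordinary characters.
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    records = []
    for row in reader:
        if not any(cell.strip() for cell in row):
            continue  # skip blank lines
        # Pad short rows so that every record has all five fields.
        row = row + [""] * (len(FIELDS) - len(row))
        records.append(dict(zip(FIELDS, (cell.strip() for cell in row))))
    return records

# Invented sample row, for illustration only (not a real entry).
sample = "B53.13.01\tMueller\tAn example title\tExample journal, vol. 1\t\n"
records = read_bibliography(sample)
print(records[0]["ref"], "|", records[0]["title"])
```

Each record is then a discrete item that can be looked up by its reference number, which is all the linking would seem to need.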

Before I start on this seriously, is there anything else of which I should be aware?

Conal-Tuohy commented 7 years ago

Can you clarify what you mean by "linking of the citation to Mueller's publications"? Did you have in the mind the ability to convert the citation into a hyperlink to e.g. a library catalogue record?

Conal-Tuohy commented 7 years ago

I don't think it's necessarily the case that you would need to convert the bibliography from its current form (i.e. as a text containing a bibliography) into a spreadsheet or relational database. It might depend on what you wanted to achieve by it, though.

If the bibliography is sufficiently conventional then it may be easy enough to automatically recognise the components of each citation (e.g. titles in italics) and tag them explicitly (e.g. titles using a TEI <title> element). Gavan may have mentioned a project he and I were both involved in -- the ISIS cumulative bibliography of science -- which has had TEI bibliographic markup added automatically (with a reasonable success rate).

The ISIS bibliography was on a scale that made automation of the markup the most practical approach, but in your case, it would certainly also be feasible for you to manually add the semantics you've listed above (author, reference number, notes, and various publication details) by distinctly formatting the various parts of each citation using an appropriately named "character style" in Word. These could then be converted into the equivalent TEI elements very straightforwardly (i.e. with very little work on my part). Maybe take a quick look through the 3.11 Bibliographic Citations and References section of the TEI guidelines. If you were to format the citations using styles whose names matched the TEI elements which are defined there, then that would simplify the conversion process.
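To make the idea concrete, here is a minimal Python sketch of the mapping from character styles to TEI elements. It is not the conversion used in this repository. It assumes the styled runs of one citation have already been extracted from Word as (style name, text) pairs, and that the style names are exactly the TEI element names; `runs_to_bibl`, `TEI_STYLES` and the sample citation are hypothetical.

```python
import xml.etree.ElementTree as ET

# TEI element names assumed to double as Word character style names.
TEI_STYLES = {"author", "title", "date", "note", "idno"}

def runs_to_bibl(runs):
    """Build a TEI <bibl> from (style_name, text) pairs.

    A style of None, or any style not in TEI_STYLES, is kept as plain text.
    """
    bibl = ET.Element("bibl")
    last = None  # most recently added child element
    for style, text in runs:
        if style in TEI_STYLES:
            last = ET.SubElement(bibl, style)
            last.text = text
        elif last is None:
            # Plain text before the first styled run.
            bibl.text = (bibl.text or "") + text
        else:
            # Plain text following a styled run.
            last.tail = (last.tail or "") + text
    return bibl

# Invented citation, for illustration only.
runs = [
    ("idno", "B53.13.01"), (None, " "),
    ("author", "Mueller"), (None, ", "),
    ("title", "An example title"), (None, "."),
]
xml = ET.tostring(runs_to_bibl(runs), encoding="unicode")
print(xml)
```

The point of matching style names to element names is that the mapping becomes trivial: no lookup table is needed beyond the list of permitted names.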

Either way, continuing to work in a word-processor and producing bibliographic TEI markup would have the advantage that the bibliography would be at the same time:

LucasHorseshoeBend commented 7 years ago

The idea was to be able to link a reference given in a footnote to the corresponding entry in the bibliography. The reference is always a code of the form Byy.mm.XX, where XX is a running number allocated in the order in which we found the citations, not in date order. mm is sometimes 13, to indicate an undetermined month, and sometimes 14, to indicate a joint authorship. For example, fn 3 of http://vmcp.conaltuohy.com/xtf/view?docId=tei/1850-9/1853/53-02-03-final.xml contains references to two items that are listed in Appendix B, the Mueller bibliography (I can't generate a citation for that file).
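The coding scheme can be spelled out as a small Python sketch. This is only my reading of the description above: `decode` and `BibCode` are hypothetical names, the year is left as the two digits written in the code, and a month value outside 1 to 12 (including the special values 13 and 14) is treated as "no month given".

```python
import re
from typing import NamedTuple, Optional

# B, two-digit year, two-digit month slot, running number.
CODE = re.compile(r"B(\d{2})\.(\d{2})\.(\d+)")

class BibCode(NamedTuple):
    year: str             # two-digit year, left as written in the code
    month: Optional[int]  # 1-12, or None when the code gives no month
    joint: bool           # True when the month slot is 14 (joint authorship)
    serial: str           # running number, in order of discovery

def decode(code):
    """Split a Byy.mm.XX code into its parts."""
    m = CODE.fullmatch(code)
    if m is None:
        raise ValueError(f"not a bibliography code: {code!r}")
    yy, mm, serial = m.group(1), int(m.group(2)), m.group(3)
    month = mm if 1 <= mm <= 12 else None
    return BibCode(yy, month, mm == 14, serial)

# Invented codes, for illustration only.
print(decode("B53.13.01"))
print(decode("B76.14.02"))
```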

What we want to be able to do is to generate a link internal to the site to enable the reference to be decoded quickly.

The bibliography is not as standard as I would have liked, but the project inherited the format and a published source to which we needed to preserve a correspondence. I would be surprised if it was easy to read it in as it is, but the idea of character styling elements is very valuable, as I will need to edit the text in any case to remove infelicities and inconsistencies in the way that sources are given. I will start on the character styling after Christmas and put a sample of entries into a folder for you to try out linking.

Conal-Tuohy commented 7 years ago

I understand about the linking now. You want to be able to link from a footnote to the corresponding full citation in the bibliography.

I think it would be possible to recognise those references and convert them into a link automatically. The reference numbers in the bibliography seem pretty distinct in their formatting and placement, and the Bnn.nn.nn format used in the footnotes is also unambiguous.
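Roughly speaking, the recognition could look like the following Python sketch. It is an illustration rather than the code that would actually go into the conversion, and it assumes that each bibliography entry ends up with an identifier equal to its reference code, so that a TEI `<ref target="#...">` can point at it. The function name `link_references` and the sample text are hypothetical.

```python
import re
from xml.sax.saxutils import escape

# The Bnn.nn.nn pattern used in the footnotes.
REF = re.compile(r"\bB\d{2}\.\d{2}\.\d{2}\b")

def link_references(text, known_ids):
    """Wrap each known Bnn.nn.nn code in a TEI <ref> pointing at its entry."""
    out = []
    pos = 0
    for m in REF.finditer(text):
        # Escape the plain text between matches so the result is valid XML.
        out.append(escape(text[pos:m.start()]))
        code = m.group(0)
        if code in known_ids:
            out.append(f'<ref target="#{code}">{code}</ref>')
        else:
            out.append(code)  # no such entry: leave the code as plain text
        pos = m.end()
    out.append(escape(text[pos:]))
    return "".join(out)

# Invented footnote text and identifiers, for illustration only.
linked = link_references("See B53.13.01 & B99.99.99.", {"B53.13.01"})
print(linked)
```

Checking each code against the set of identifiers actually present in the bibliography means a mistyped code in a footnote stays visible as plain text instead of becoming a broken link.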

LucasHorseshoeBend commented 7 years ago

I've been in London without access to the web, so I apologise for the delay in responding.

OK, I won't do anything with the structure of the file as I edit it, unless you tell me the link can't be made.