Conal-Tuohy / VMCP-upconversion

Ferdinand von Mueller's correspondence upconversion from MS Word to TEI XML
Apache License 2.0

CSS fonts #22

Closed by LucasHorseshoeBend 7 years ago

LucasHorseshoeBend commented 7 years ago

I see that you have been looking at font sizes, typefaces and the like. As you will have deduced, some of the very infrequent ones are probably editorial errors; for example, the

• font-style: italic; background-color: #ffff00; (1)

in http://vmcp.conaltuohy.com/xtf/view?docId=tei/1880-9/1886/86-12-13-final.xml is certainly an error

This is useful to know for debugging. I will work on the infrequent or odd-looking ones in the group shown as final in the status heading, and keep the list alongside me as I work on others that are being set to final. So when you reload another edition, a lot of those should have been removed in the final set. See the new issue on editions.

As far as I know the only font families that should be present are Geneva and Times.

I infer this is an attempt to determine what styles should be attached to the files showing up as standard in the styles facet. Would it help if I sent the specifications for each style in the template we are using?

LucasHorseshoeBend commented 7 years ago

I could not see any cases of font family Times in http://vmcp.conaltuohy.com/xtf/view?docId=tei/1880-9/1882/82-02-00-final.xml and I don't think the copy had been changed since your last load. All styles appeared in the analytics as Times family; everything seemed to be in Geneva in the Word files as far back as my backup of 10 February, except some empty ¶¶ in the header and footer, which are in Courier. Am I not seeing straight? It is not a case of styles being specified in Times and then direct formatting turning it into Geneva.

LucasHorseshoeBend commented 7 years ago

I have checked another case using only Times, http://vmcp.conaltuohy.com/xtf/view?docId=tei/1880-9/1883/83-10-09c-final.xml, and it did show up as having been directly formatted as Geneva, overlying styles specified in Times, which makes me think that the former case had been corrected after the version that you were given to load.

Conal-Tuohy commented 7 years ago

Looking at the HTML source for 82-02-00-final I can see that the Times formatting has been applied only to a few spans of white space. The Times-formatted white space looks very like Geneva-formatted white space, unsurprisingly. I've changed the CSS-extraction code now so that it ignores CSS styles which are applied only to whitespace.
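For illustration only, here is a minimal Python sketch of that kind of filter. It is not the project's actual extraction code: the names `StyleCounter`, `normalise` and `count_styles` are hypothetical, and it assumes the formatting appears as inline `style` attributes in well-formed HTML. It tallies each distinct style string, but only for elements that contain some non-whitespace text.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not go on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


def normalise(style):
    """Return a canonical form of a style attribute (hypothetical helper)."""
    decls = []
    for decl in style.split(";"):
        if ":" in decl:
            prop, value = decl.split(":", 1)
            decls.append(f"{prop.strip().lower()}: {value.strip()}")
    return "; ".join(sorted(decls))


class StyleCounter(HTMLParser):
    """Count inline styles, skipping elements that hold only whitespace."""

    def __init__(self):
        super().__init__()
        self.stack = []          # one [style, has_text] entry per open element
        self.counts = Counter()  # canonical style string -> occurrences

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        style = dict(attrs).get("style")
        self.stack.append([normalise(style) if style else None, False])

    def handle_data(self, data):
        # str.strip() also removes non-breaking spaces, so a span holding
        # only "&nbsp;" is treated as whitespace.
        if data.strip():
            for entry in self.stack:
                entry[1] = True  # every open ancestor now contains real text

    def handle_endtag(self, tag):
        if tag in VOID or not self.stack:
            return
        style, has_text = self.stack.pop()
        if style and has_text:
            self.counts[style] += 1


def count_styles(html):
    parser = StyleCounter()
    parser.feed(html)
    parser.close()
    return parser.counts
```

Given a paragraph containing a Geneva span with text and a Times span holding a single space, `count_styles` would report only the Geneva style, which matches the behaviour described above. Sorting the declarations means that the same formatting written in a different order is counted as one style.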

Incidentally, you're correct to assume this is to help understand the relationship between the named styles and formatting; I have a copy of the style guide already but I also wanted to see what was actually there.

Conal-Tuohy commented 7 years ago

It may be the case that the discrepancy is just due to my source data lagging the up-to-date data. But rather than dig into the code to try to determine this, I think we should deal with the issue by connecting Dropbox up to XTF so that we do have a continuously up-to-date system, and then raise the issue again if the problem persists.

Conal-Tuohy commented 7 years ago

Also NB in http://vmcp.conaltuohy.com/xtf/view?docId=tei/1880-9/1882/82-02-00-final.xml the footnotes are in Times.

LucasHorseshoeBend commented 7 years ago

I think that the issue can now be closed: it has been extremely helpful in diagnosis, and will prove so as we finalise more files.