JonathanReeve / sanger

Margaret Sanger Papers Project Search Engine

XSLT doesn't handle list items very well #16

Open JonathanReeve opened 11 years ago

JonathanReeve commented 11 years ago

The example for this behavior is document 101920, "Dutch Methods of Birth Control," which contains a list that I'm guessing starts at "Vital Statistics of Chief Dutch Towns." In the XML, the lines in this list are marked up with <lb>, but not with <list>. I could add some styling to make list items prettier (indentation, bullet points, etc.), but I wouldn't want to apply that style to every line with an <lb> element, since those are used in non-list contexts as well.

I could style <list> and <item> elements instead, but some lists occur in running prose, where the list items don't necessarily have line breaks, as in document 128168, around "Harold J. Cox."

Another interesting case is document 129332, "To Mothers, Our Duty." Around the line "Under Under Under," there is an area that uses both <list> and <lb> tags, yet here they seem to indicate some kind of tabular data, where <table> might be a better fit.

CathyHajo commented 11 years ago

Table sounds like a better choice for the lists that appear in tabular format. Is there an easy way to find all the docs with lists and check each one to see what kind of list it is?

JonathanReeve commented 11 years ago

Sure, grepped it and wrote the output here: https://github.com/JonathanReeve/sanger/blob/master/list-contexts.txt
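
If a plain grep turns out to be too coarse for the check, something like the sketch below might help sort the lists by kind. It is not the script that produced the file above, just a rough illustration. The function names (local_name, describe_lists, scan) and the "maybe_table" heuristic are made up for this example, and it assumes the documents are well-formed XML files ending in .xml.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def local_name(tag):
    """Strip an XML namespace, so '{ns}list' and 'list' both give 'list'."""
    return tag.rsplit("}", 1)[-1]


def describe_lists(xml_source):
    """Return one summary dict per <list> element found in the document."""
    root = ET.fromstring(xml_source)
    summaries = []
    for elem in root.iter():
        if local_name(elem.tag) != "list":
            continue
        # Names of the <list> element and everything nested inside it.
        names = [local_name(d.tag) for d in elem.iter()]
        summaries.append({
            "items": names.count("item"),
            "line_breaks": names.count("lb"),
            # Assumed heuristic: a <list> that also holds <lb> may be tabular.
            "maybe_table": "lb" in names,
        })
    return summaries


def scan(directory):
    """Print a one-line report for each list in every .xml file under directory."""
    for path in sorted(Path(directory).rglob("*.xml")):
        try:
            lists = describe_lists(path.read_bytes())
        except ET.ParseError as err:
            print(f"{path}: could not parse ({err})")
            continue
        for n, info in enumerate(lists, 1):
            flag = "check: table?" if info["maybe_table"] else "plain list"
            print(f"{path} list {n}: {info['items']} items, "
                  f"{info['line_breaks']} <lb> -> {flag}")
```

Calling scan() on the directory that holds the XML files would print one line per list. Anything flagged "check: table?" would still need a human look, since <lb> inside a list doesn't always mean tabular data.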

CathyHajo commented 9 years ago

OK, this is one where we will have to change the encoding, creating tables and lists. We may need an updated list from the newest version of the documents.