srophe / britishLibrary-data

GNU General Public License v3.0

Remove @ref, persName, placeName etc. in Titles? #1512

Open davidamichelson opened 2 months ago

wlpotter commented 2 months ago

Cf. #1524

dlschwartz commented 2 months ago

@davidamichelson Here are some things to help as we address this issue.

Elements currently appearing in Title:

davidamichelson commented 1 month ago

Keep: bibl (pending review), choice, foreign, locus, quote (we reviewed this; Wright sometimes puts titles in inverted commas, so we need to use it; see MS/25), sic, supplied.

davidamichelson commented 1 month ago

Remove these elements as children of placeName:

davidamichelson commented 1 month ago

Name the branch Rich-Markup-Nested-URIs-Do-Not-Delete

This branch preserves rich markup inside the following elements: bibl, persName, placeName, ref, and title.

davidamichelson commented 1 month ago

Strip out child elements from these in this order:

1. Remove children of placeName
2. Remove children of persName
3. Remove children of ref
4. Remove children of title
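For illustration only, here is a minimal Python sketch of what stripping in that order could look like. It reads the instruction literally: each target element keeps its text content but loses all child markup. The TEI namespace URI is the standard one, but the function names (`strip_children`, `strip_in_order`) and the use of `xml.etree.ElementTree` are assumptions for this sketch, not the tooling actually used on this repository. It also does not model the children of title that other comments in this thread say should be kept.

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
# Order taken from the comment above; everything else is hypothetical.
ORDER = ["placeName", "persName", "ref", "title"]


def strip_children(elem):
    """Replace all child markup of elem with its concatenated text."""
    text = "".join(elem.itertext())  # includes text and tails of descendants
    for child in list(elem):
        elem.remove(child)
    elem.text = text


def strip_in_order(root):
    """Flatten each element type in ORDER, one type at a time."""
    for name in ORDER:
        # list() so the tree is not modified while it is being iterated
        for elem in list(root.iter(f"{{{TEI}}}{name}")):
            strip_children(elem)
    return root
```

The attributes of the flattened element itself (for example a ref on persName) are left alone here; only markup nested inside it is removed.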

davidamichelson commented 1 month ago

Don't strip anything out of bibl at this time.

dlschwartz commented 1 month ago

Commit allows choice, foreign, locus, quote, sic, and supplied as children of title.
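As a rough way to see what such a rule would flag, the sketch below lists every direct child of title whose name is not in the allowed set from the comment above. It is only an illustration in Python with `xml.etree.ElementTree`; the helper names (`local_name`, `disallowed_title_children`) are hypothetical and this is not the project's actual schema or validation code. Because bibl is not in the allowed list, it would be reported here.

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
# Allowed children of title, as listed in the comment above.
ALLOWED = {"choice", "foreign", "locus", "quote", "sic", "supplied"}


def local_name(elem):
    """Return the element name without its namespace prefix."""
    return elem.tag.rsplit("}", 1)[-1]


def disallowed_title_children(root):
    """Return names of direct children of title not in ALLOWED."""
    found = []
    for title in root.iter(f"{{{TEI}}}title"):
        for child in title:
            name = local_name(child)
            if name not in ALLOWED:
                found.append(name)
    return found
```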

dlschwartz commented 1 month ago

Update on this issue. Here is a list of descendants of the relevant elements:

This is a pretty small number of elements. I've also searched for the grandchildren of each, e.g. //title/placeName/child::node()/child::node()/node-name(). There are very few of these, and there is only one great-grandchild.

@davidamichelson when you are ready for me to make these changes I will start with the great-grandchild, then the grandchildren, and finally the children. This should be relatively quick work now that I've cleaned the use of choice in title.
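The depth-by-depth search described above can be approximated outside XPath as well. The following Python sketch counts element names below title/placeName at each depth (1 = child, 2 = grandchild, 3 = great-grandchild), so the deepest level can be handled first. The function name `names_by_depth` and its parameters are assumptions for illustration. Unlike child::node(), this counts elements only and ignores text nodes.

```python
from collections import Counter
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"


def names_by_depth(root, outer="title", inner="placeName"):
    """Count element names below outer/inner, keyed by depth (1 = child)."""
    counts = {}

    def walk(elem, depth):
        for child in elem:
            name = child.tag.rsplit("}", 1)[-1]  # drop the namespace
            counts.setdefault(depth, Counter())[name] += 1
            walk(child, depth + 1)

    for o in root.iter(f"{{{TEI}}}{outer}"):
        # direct children only, mirroring //title/placeName
        for i in o.findall(f"{{{TEI}}}{inner}"):
            walk(i, 1)
    return counts
```

Iterating over `sorted(counts, reverse=True)` then visits the deepest level first, matching the plan to start with the great-grandchild and work upward.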

dlschwartz commented 1 month ago

I have removed the unnecessary date, persName, placeName, title, and ref elements/attributes from the title element.

The scribal persName and placeName that might have data we want to keep are still in record 413.

The bibl child of title is still not validating because we still need to review this data.

The relevant commits are:

https://github.com/srophe/britishLibrary-data/commit/7a1390f50e8226b9eca7198424a351f6e46c3605
https://github.com/srophe/britishLibrary-data/commit/dc1db12a7556d0e0c9a7f141c753f6046cc41526
https://github.com/srophe/britishLibrary-data/commit/9c4cbd695e86b80a472aeff535dc3edd1de28fdd
https://github.com/srophe/britishLibrary-data/commit/c8b95dbe35f958a70510ca5a3e870b665f807a75