cwrc / RDF-extraction

Update object of oa:hasSource triple #39

Open alliyya opened 2 years ago

alliyya commented 2 years ago

Previously:

_:Ndb84f98f44524844ae3ca077014b7f7e rdfs:label "Eliza Parsons - Response Context excerpt" ;
    cito:cites <https://commons.cwrc.ca/orlando:f0992867-3835-46c0-839a-e3c46ef69c8f_dbref> ;
    oa:hasSelector _:N1cbfa1dc18cd4f8e86f65d8e78d6c9d4 ;
    oa:hasSource <http://orlando.cambridge.org/protected/svPeople?formname=r&people_tab=3&person_id=parsel#TheConvict> .

The new URL should be https://orlando.cambridge.org/profiles/parsel#parsel-subchapter-theconvict

Structure: https://orlando.cambridge.org/profiles/{author-id}#{author}-{chapter-type}-{heading text, concatenated and lowercased}

The chapter-type seems to vary for levels of headings:

I think this is the general pattern, but I will need to test further to see whether there are other variants.
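A minimal sketch of the rewrite, under several assumptions not confirmed in this issue: that `{author-id}` and `{author}` are both the old `person_id` query parameter (true for `parsel` in the example), that the old fragment (`TheConvict`) is already the concatenated heading text, and that `subchapter` is a reasonable default since it is the only chapter-type seen so far. The function name `new_source_url` and the constant `NEW_BASE` are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical constant: base of the new Orlando profile URLs.
NEW_BASE = "https://orlando.cambridge.org/profiles"


def new_source_url(old_url: str, chapter_type: str = "subchapter") -> str:
    """Map an old svPeople URL to the new profile URL pattern.

    Assumes person_id doubles as both {author-id} and {author}, and that
    chapter_type is supplied by the caller (default is a guess).
    """
    parts = urlsplit(old_url)
    person_id = parse_qs(parts.query)["person_id"][0]
    url = f"{NEW_BASE}/{person_id}"
    if parts.fragment:
        # Drop any whitespace and lowercase the heading text.
        heading = "".join(parts.fragment.split()).lower()
        url += f"#{person_id}-{chapter_type}-{heading}"
    return url


old = ("http://orlando.cambridge.org/protected/svPeople"
       "?formname=r&people_tab=3&person_id=parsel#TheConvict")
print(new_source_url(old))
```

Since the chapter-type varies by heading level, the caller would need to pass the right value once the full set of variants is known.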

Tasks: