Cleanup
First move all patched files into proc_data/ to mirror the structure of the data folder on main.
Create new primary keys according to new id scheme.
[x] Replace old id secondary keys with new ones.
[x] Delete all NULL strings.
[x] Sort primarily by ID and secondarily by lang.
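The general cleanup steps above can be sketched roughly as follows. This is only an illustration under assumptions, not the script used in this PR: the function name `clean_rows`, the column names `id`, `lang`, and `source`, and the `id_map` lookup from old to new IDs are all hypothetical, and "delete NULL strings" is interpreted here as blanking cells that contain the literal string `NULL`.

```python
def clean_rows(rows, id_map, ref_cols=(), id_col="id", lang_col="lang"):
    """Apply the generic cleanup steps to a list of row dicts.

    rows     -- list of dicts, e.g. as produced by csv.DictReader
    id_map   -- hypothetical mapping from old IDs to new IDs
    ref_cols -- columns holding secondary (foreign) keys to remap
    """
    cleaned = []
    for row in rows:
        new_row = {}
        for col, value in row.items():
            # Assumption: the literal string "NULL" marks a missing value.
            if value == "NULL":
                value = ""
            # Replace old secondary keys with new ones where a mapping exists.
            if col in ref_cols:
                value = id_map.get(value, value)
            new_row[col] = value
        cleaned.append(new_row)
    # Primary sort by ID, secondary sort by language.
    cleaned.sort(key=lambda r: (r[id_col], r.get(lang_col, "")))
    return cleaned


rows = [
    {"id": "B2", "lang": "en", "source": "old1", "note": "NULL"},
    {"id": "A1", "lang": "ru", "source": "NULL", "note": "x"},
    {"id": "A1", "lang": "en", "source": "old2", "note": "y"},
]
result = clean_rows(rows, {"old1": "W0001", "old2": "W0002"}, ref_cols=("source",))
```

IDs without an entry in `id_map` are left unchanged, so unmapped references remain visible for manual review instead of being dropped silently.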
Act_p
[x] extend structural id_lang to lit entries
[x] check site_information for lit entries
ArtWork_p
[x] create primary keys for new art works in work.csv
[x] replace old IDs in ArtWork.csv with new ones from patched work.csv
[x] migrate comments from Work to ArtWork
Institution_p
[x] Delete I0004, I0006, and I0007 additions from patched output.
[x] refactor the "fictional" notes: this should be captured in Agent.csv only; double-check what's going on there.
Membership_p
[x] Membership.institution and Membership.member need agent IDs
[x] Membership.source needs work ID
Space_p
[x] handle NULL entries more carefully; also check for unknown place/location IDs
[x] resolve ID conflicts in P5
[x] re-confirm before deleting P4
Agent_p
[x] delete duplicate old_id before proceeding with regular cleanup steps
[x] cleanup (delete) Agent.name_lang column
[x] ensure that Person_lit.fictionality column data is not lost in new system (need to decide where to put it)
NarrativePosition_p
[x] update csv schema and data-dictionary for narrative position table.
Primary- / SecondarySource_p
[x] sort first; there are lots of dubious entries
[x] check PS00207, which should be an unknown work?
[x] move source.fictionality to work.fictionality on main entity?
[x] merge three (?) work.csv tables
[x] *Source.genre and Work.type_num need checking for refactoring: it seems superfluous to repeat genres on Sources when we could add them to Work; check ArtWorks.
[x] fix creator references on new Work entries to point to Agents instead of Persons
Quotation_p
Quotation.source needs work IDs
SocialRelation
[x] SocialRelation.ego and SocialRelation.related need agent IDs
[x] SocialRelation.source needs work IDs
Work_p
[x] cleanup work.type_num
[x] remove false dupes
[x] AW
[x] PS (see notes)
[x] SS
Merge Primary Entities
[x] person + institution -> agent
[x] primarysource + secondarysource + artwork -> work
[x] place(?) + location -> space
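As a rough illustration of the merges listed above, the sketch below combines several source tables into one and records which new ID each old row received. Everything here is an assumption made for the example: the helper name `merge_tables`, the `id` column, and the `A0001`-style ID format are hypothetical and do not describe the project's actual ID scheme. The old ID is kept in an `old_id` column so that secondary keys in other tables can be remapped afterwards.

```python
def merge_tables(tables, prefix, id_col="id", width=4):
    """Merge several tables (name -> list of row dicts) into one table.

    Returns the merged rows and a mapping (table_name, old_id) -> new_id.
    The ID format (prefix + zero-padded counter) is a placeholder.
    """
    merged = []
    id_map = {}
    for table_name, rows in tables.items():
        for row in rows:
            # Assign the next sequential ID in the merged table.
            new_id = f"{prefix}{len(merged) + 1:0{width}d}"
            # Key the mapping by table name, since old IDs from
            # different tables might collide.
            id_map[(table_name, row[id_col])] = new_id
            merged.append({**row, id_col: new_id, "old_id": row[id_col]})
    return merged, id_map


tables = {
    "person": [{"id": "P1", "name": "a"}],
    "institution": [{"id": "I1", "name": "b"}],
}
agents, agent_ids = merge_tables(tables, "A")
```

Keying the mapping by `(table_name, old_id)` is a deliberate choice in this sketch: it avoids ambiguity if two source tables happen to reuse the same old identifier.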
Please indicate issues related to this PR (if any)
See #365 #415 #427 #429 #432 #449 #452 #477
Close #366
close #371
close #414
close #458
close #471
close #480
close #489
close #486
close #500
close #509
close #511
close #512
close #514
close #516
close #520
close #521