When querying for orphaned digital objects via API and via DB, respectively, we got slightly divergent counts (see #604):
54,335 via API
54,114 via DB
Most of the discrepancy is explained by inadvertent duplicates in the API result set. However, the following divergent results remain:
returned via API, but not via DB:
00674e10-0408-456c-a385-675ca7124ae1
4d6d5732-1b33-4713-b033-19f2ae8d0807
84d62080-3bd2-457b-9fac-189f8b7fd57b
e052db26-db8b-4016-ba65-1436ca60237a
f2f296c3-8917-4446-ac03-dbe86995813b
These are false positives: the API returns an empty `collection` array for them even though they are linked to an archival object (ao). I'm not sure of the mechanism, but I suspect it is an artifact of the `sequel_to_json` mediation layer in ArchivesSpace.
returned via DB, but not via API:
32ccc343-9d7a-48f3-a798-a86aee61e0b3
ark:/88435/8g84mw89p
These are true positives in that they have no file version records. However, the ARK (ark:/88435/8g84mw89p), which currently populates the `identifier` field, resolves, so that record can be corrected and removed from the data set.
After accounting for those discrepancies, both queries return 54,115 results that may be considered true orphans.
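
For anyone repeating this comparison, here is a minimal sketch of the reconciliation, assuming both result sets have been exported as plain lists of identifier strings. The function name `compare_result_sets` and the toy identifiers are hypothetical; this is not the query code actually used for either count.

```python
from collections import Counter


def compare_result_sets(api_ids, db_ids):
    """Reconcile two lists of identifiers (hypothetical helper).

    Reports identifiers that occur more than once in the API list,
    plus the identifiers found on only one side after de-duplication.
    """
    api_counts = Counter(api_ids)
    # Identifiers the API returned more than once
    api_duplicates = sorted(i for i, n in api_counts.items() if n > 1)

    api_set = set(api_counts)
    db_set = set(db_ids)
    return {
        "api_duplicates": api_duplicates,
        "api_only": sorted(api_set - db_set),  # candidates for false positives
        "db_only": sorted(db_set - api_set),   # missed by the API query
        "shared": len(api_set & db_set),
    }


# Toy data standing in for the real exports
api_ids = ["a", "b", "b", "c"]
db_ids = ["b", "c", "d"]
report = compare_result_sets(api_ids, db_ids)
print(report)
```

De-duplicating before taking the set differences keeps the duplicate rows from being mistaken for genuinely divergent records.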