A recent check of publication uniqueness suggests there are 76 duplicated newspaper publication_code values (each shared with just 1 other record, so a count of 2).
[ ] 76 same publication_code records
[ ] 81520 same issue_code records
[ ] 3670454 same item_code records
These might be cases of multiple editions of an issue on the same day (following @kmcdono2 in #120), or actual duplicate records (meaning... just wrong). I think the majority of the publication_code cases are the latter (and thankfully quite a few have no related issues, and by extension no items):
>>> from django.db.models import QuerySet
>>> from newspaper.models import Newspaper, Issue, Item
>>> from lwmdb.utils import similar_records
>>> newspaper_same_codes: QuerySet = similar_records(Newspaper.objects.all(), check_fields=('publication_code',))
>>> issue_same_codes: QuerySet = similar_records(Issue.objects.all(), check_fields=('issue_code',))
>>> item_same_codes: QuerySet = similar_records(Item.objects.all(), check_fields=('item_code',))
>>> len(newspaper_same_codes)
76
>>> len(issue_same_codes)
81520
>>> len(item_same_codes)
3670454
>>> all(record for record in newspaper_same_codes if record['id__count'] == 2)
True
>>> all(record for record in issue_same_codes if record['id__count'] == 2)
True
>>> all(record for record in item_same_codes if record['id__count'] == 2)
True
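One caveat on the three `all(...)` checks above: written as `all(record for record in ... if record['id__count'] == 2)`, the `if` clause first discards every record whose count is not 2, and `all` then only tests that the remaining dicts are truthy, so the result is `True` whatever the data. A minimal sketch of the difference, using made-up sample data and a hypothetical `count_duplicates` helper that stands in for `similar_records` (its real implementation is not shown here; I'm only assuming it returns dicts with an `id__count` key, as the session above suggests):

```python
from collections import Counter


def count_duplicates(records, field):
    """Hypothetical stand-in for similar_records: one dict per value of
    `field` that is shared by two or more records, with its count."""
    counts = Counter(record[field] for record in records)
    return [{field: value, "id__count": n} for value, n in counts.items() if n > 1]


# Made-up sample: code "A" appears twice, "B" three times, "C" once.
sample = [
    {"id": 1, "publication_code": "A"},
    {"id": 2, "publication_code": "A"},
    {"id": 3, "publication_code": "B"},
    {"id": 4, "publication_code": "B"},
    {"id": 5, "publication_code": "B"},
    {"id": 6, "publication_code": "C"},
]
duplicates = count_duplicates(sample, "publication_code")

# Form used in the session above: the filter drops the count-3 record,
# then all() only checks that the surviving dicts are non-empty.
filtered_check = all(record for record in duplicates if record["id__count"] == 2)

# Probable intent: test the condition itself for every record.
strict_check = all(record["id__count"] == 2 for record in duplicates)

print(filtered_check, strict_check)
```

With this sample the filtered form reports `True` even though "B" has a count of 3, while the strict form reports `False`. Re-running the three checks in the strict form against the real querysets would confirm whether every duplicate really is a pair.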