Princeton-CDH / cdh-web

the CDH website
https://cdh.princeton.edu
Apache License 2.0

reporting and investigating related to cdh web 4.0 content and functionality #443

Open rlskoeser opened 1 month ago

rlskoeser commented 1 month ago

Couldn't figure out how to identify non-empty attachment blocks purely by query, so I wrote a little script to generate a report. Saving the script as a private gist in case we need to reference/re-run.
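For reference, a rough sketch of what a "non-empty block" check can look like. This is not the script from the gist. It assumes page content is available as Wagtail-style raw stream data (a list of `{"type": ..., "value": ...}` dicts), and the `attachments` block name, the `has_non_empty_block` helper, and the sample pages are all hypothetical.

```python
def has_non_empty_block(stream_data, block_type):
    """Return True if any block of the given type has a non-empty value."""
    return any(
        block.get("type") == block_type and bool(block.get("value"))
        for block in stream_data
    )


# Hypothetical sample data: page title -> raw stream data.
pages = {
    "About": [
        {"type": "paragraph", "value": "<p>Hi</p>"},
        {"type": "attachments", "value": []},  # block present but empty
    ],
    "Grants": [
        {"type": "attachments", "value": [{"title": "CFP.pdf"}]},
    ],
}

# Titles of pages whose attachment block actually contains something.
report = [
    title for title, body in pages.items()
    if has_non_empty_block(body, "attachments")
]
print(report)
```

An empty block still appears in the stream data, so a check on the block's presence alone would over-report; testing the value in Python sidesteps that.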

Going to start adding the reported data as tabs in a Google spreadsheet.

rlskoeser commented 1 month ago

Used a Django queryset to identify duplicate Person objects (only checking last name, possibly collapsing); added the report to the spreadsheet.


```python
import csv

from cdhweb.people.models import Person
from django.db.models import Count

# values() before annotate() groups by the listed fields (first and last name)
duplicates = (
    Person.objects.order_by('last_name')
    .values('first_name', 'last_name')
    .annotate(count=Count('last_name'))
    .filter(count__gt=1)
)
# the context manager closes the file even if the query raises
with open('duplicate_persons.csv', 'w', newline='') as outfile:
    writer = csv.DictWriter(outfile, fieldnames=['first_name', 'last_name', 'count'])
    writer.writeheader()
    writer.writerows(duplicates)
```

rlskoeser commented 1 month ago

Adapted my attachment check to look for pages with the old 'migrated' content block; saved the script as a GitHub gist; added another tab to the spreadsheet.
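
A rough sketch of that kind of adaptation, producing CSV that could be pasted into a spreadsheet tab. Again, this is not the gist script: it assumes Wagtail-style raw stream data, and `pages_with_block`, the column names, and the sample pages are hypothetical. Only the `migrated` block name comes from the comment above.

```python
import csv
import io


def pages_with_block(pages, block_type):
    """Yield (title, count) for pages with at least one block of block_type."""
    for title, stream_data in pages:
        count = sum(1 for block in stream_data if block.get("type") == block_type)
        if count:
            yield title, count


# Hypothetical sample data: (page title, raw stream data) pairs.
sample_pages = [
    ("Old event", [{"type": "migrated", "value": "<p>legacy</p>"}]),
    ("New post", [{"type": "paragraph", "value": "<p>fresh</p>"}]),
    ("Mixed", [{"type": "migrated", "value": "a"},
               {"type": "migrated", "value": "b"}]),
]

# Write the report to an in-memory buffer; swap in open(..., newline="")
# to produce a file instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["title", "migrated_blocks"])
writer.writerows(pages_with_block(sample_pages, "migrated"))
csv_text = buffer.getvalue()
print(csv_text)
```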