freelawproject / courtlistener

A fully-searchable and accessible archive of court data including growing repositories of opinions, oral arguments, judges, judicial financial records, and federal filings.
https://www.courtlistener.com

feat(harvard_merger): add `harvard_id` field to `OpinionCluster` #4622

Open cweider opened 3 weeks ago

cweider commented 3 weeks ago

Harvard's Caselaw Access Project (CAP) has been sunset. Projects that hold existing references to CAP cases need a way to identify the CL opinion cluster corresponding to a given CAP case.

This patch adds an indexed `harvard_id` column to `OpinionCluster` and exposes the field in `OpinionClusterFilter`.

For migration, this patch builds on work done in #4284 and #4442 and extends `import_harvard_pdfs` to populate the `harvard_id` column using the CAP crosswalk file.
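The backfill idea can be sketched roughly as follows. This is not the actual `import_harvard_pdfs` code: the crosswalk key names (`cap_case_id`, `cl_cluster_id`), the sample data, and the helper `build_harvard_id_map` are assumptions made for illustration.

```python
import json
from typing import Iterable


def build_harvard_id_map(crosswalk_entries: Iterable[dict]) -> dict[int, int]:
    """Map CL cluster ID -> CAP case ID from crosswalk entries.

    Entries missing either ID are skipped. Key names are assumed.
    """
    mapping: dict[int, int] = {}
    for entry in crosswalk_entries:
        cap_id = entry.get("cap_case_id")
        cluster_id = entry.get("cl_cluster_id")
        if cap_id is None or cluster_id is None:
            continue
        mapping[int(cluster_id)] = int(cap_id)
    return mapping


# Hypothetical crosswalk content, standing in for a real crosswalk file.
sample_crosswalk = json.loads(
    """
    [
        {"cap_case_id": 1, "cl_cluster_id": 101},
        {"cap_case_id": 2, "cl_cluster_id": 102},
        {"cap_case_id": 3, "cl_cluster_id": null}
    ]
    """
)

harvard_ids = build_harvard_id_map(sample_crosswalk)
```

In the management command, a mapping like this would then drive the updates, setting `harvard_id` on each matching cluster.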

Fixes: #4313

quevon24 commented 3 weeks ago

@cweider The `import_harvard_pdfs` command has been updated, so you may have conflicts in this PR.