When any case has been updated since the last scrape, the script that updates rescraped cases writes the wrong data to the cases it touches, including cases that have not changed.
This section is causing the problem:
UPDATE cases.court_case
SET
    calendar = r.calendar,
    filing_date = r.filing_date,
    division = r.division,
    case_type = r.case_type,
    ad_damnum = r.ad_damnum,
    court = r.court,
    hash = r.hash,
    scraped_at = CURRENT_TIMESTAMP,
    updated_at = CURRENT_TIMESTAMP
FROM court_case as r
WHERE
    court_case.case_number IN (SELECT * FROM updated_case);
This results in court_case looking like this:
After the fix is in, we'll have to do a major manual rescrape of civil and chancery cases to fix the errors.
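One likely cause is that the statement never ties the target row to a row in r: with no join condition, every target row is paired with every row of r, and PostgreSQL uses one of those pairings unpredictably. A possible fix, as a sketch only, is to match on case_number. This assumes that the unqualified court_case in the FROM clause is the table holding the freshly scraped rows, that case_number identifies a case in both tables, and that updated_case has a single column of case numbers. The alias c is introduced here for illustration.

```sql
-- Sketch under the assumptions stated above, not a tested migration.
UPDATE cases.court_case AS c
SET
    calendar = r.calendar,
    filing_date = r.filing_date,
    division = r.division,
    case_type = r.case_type,
    ad_damnum = r.ad_damnum,
    court = r.court,
    hash = r.hash,
    scraped_at = CURRENT_TIMESTAMP,
    updated_at = CURRENT_TIMESTAMP
FROM court_case AS r
WHERE
    -- Pair each target row with its own rescraped row only.
    c.case_number = r.case_number
    -- Keep the original restriction to cases flagged as updated.
    AND c.case_number IN (SELECT * FROM updated_case);
```

With the join condition in place, each case listed in updated_case should receive only the values scraped for that same case number.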