Open flooie opened 15 hours ago
@grossir @quevon24
If either of you has suggestions on this I would be happy to hear them.
So far, after a cursory look at a couple hundred similar opinions from Public Resource, I am not seeing other opinions like this one.
I think we need more examples to better understand/unravel what happened here.
Importing the PRO content was difficult and I had some tricky solutions:
Some didn't have dates and only would say something like "Spring 1854," so I had to just do an estimate for these (we have a field in the DB that says something like date_is_estimated or something).
I don't think this factored into the case we're looking at.
I used date parser to pull dates out of cases. It works quite well, but can sometimes find something that looks like a date, like a docket number, say, and interpret that as a date.
I checked the case we're looking at and the date it has doesn't look like it could have come from a bad parse, so this theory doesn't make sense either.
Some cases couldn't be parsed for dates with the skills I had at the time, so I had a script I ran for months in my spare time. It would pop up a case in my browser and would allow me to input the date (or choose from several it found). I did this for about 100k cases, I think. It took months, but, well, it got the job done?
This could have been what happened here, but I'm doubtful of this too because the date looks pretty easy to parse from the text.
Is any of this helpful? Probably not, but it's history worth sharing, I think.
What to do now? An audit makes sense to me. We're skilled enough to make a very simple parser of the first 500 characters of the HTML, for example, and see if the date found in it lines up. There are probably another dozen ways to check this, so I'll duck out. But, yeah, let's get on this.
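A rough sketch of the audit idea above, in case it helps get started. Everything here is an assumption for illustration: the function names, the `(id, stored_date, html)` record shape, and the regex, which only handles dates written like "October 30, 2015" or "Oct. 30, 2015". It is not the original import code and would need adapting to the real models.

```python
import re
from datetime import date
from html import unescape

# Month prefixes mapped to month numbers (hypothetical helper table).
MONTHS = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12,
}

# Matches "October 30, 2015", "Oct. 30, 2015", "Sept. 5, 1854", etc.
DATE_RE = re.compile(
    r"\b(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.?\s+"
    r"(\d{1,2}),\s+(\d{4})\b",
    re.IGNORECASE,
)


def first_date_in_head(html_text, limit=500):
    """Return the first date found in the first `limit` characters, or None."""
    head = html_text[:limit]
    # Crude tag stripping; good enough for a sanity check, not a real parser.
    text = unescape(re.sub(r"<[^>]+>", " ", head))
    match = DATE_RE.search(text)
    if match is None:
        return None
    month, day, year = match.groups()
    try:
        return date(int(year), MONTHS[month.lower()], int(day))
    except ValueError:
        # Something date-shaped but invalid, e.g. "Feb. 31, 1900".
        return None


def audit(records):
    """Compare stored dates to parsed ones.

    `records` is an iterable of (id, stored_date, html) tuples (assumed shape).
    Returns (id, stored, found, day_delta) for every mismatch.
    """
    mismatches = []
    for record_id, stored, html_text in records:
        found = first_date_in_head(html_text)
        if found is not None and found != stored:
            mismatches.append((record_id, stored, found, (stored - found).days))
    return mismatches
```

Sorting the mismatches by `day_delta` would show quickly whether off-by-one cases cluster together or whether this case really is a one-off.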
One other note: This case was imported in 2011, so it's one of the oldest we have, a fact you can see from its ID being lower than 200k.
@mlissner - I don't mean to impugn your import. It is a difficult data set and the cause could be something else, but that is my best guess.
Also, the date created is Oct. 30, 2015, 2:57 p.m. Any idea why it is 2015? Was there a new import or some big database switch?
The opinion was 2011, the cluster was 2014, and the docket was 2015. Guessing these correspond to the creation of those objects, but I honestly don't recall.
Date Discrepancy in an Opinion Cluster (maybe more)
Description:
A user reported that certain dates in our opinion clusters appear to be off by a day. Upon investigation, this discrepancy is confirmed. All three original data sources appear to have the correct date - so this is a curious error.
Investigation Notes:
The fact that it is one day off suggests to me an error in some import or merge, likely around timezones, although I'm not sure how that would actually happen.
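For reference, here is one hypothetical way a timezone mix-up can move a date back by a day. This is only an illustration of the general mechanism, not a claim about what the import or merge did: the `roundtrip_date` helper is made up, and the fixed UTC-7 offset stands in for a US Pacific reader.

```python
from datetime import date, datetime, time, timedelta, timezone


def roundtrip_date(d, storage_tz, reading_tz):
    """Store a date as midnight in `storage_tz`, then read it back in `reading_tz`."""
    stored = datetime.combine(d, time.min, tzinfo=storage_tz)
    return stored.astimezone(reading_tz).date()


UTC = timezone.utc
# Fixed offset used purely for illustration (assumption, ignores DST rules).
PACIFIC = timezone(timedelta(hours=-7))

original = date(2015, 10, 30)

# Midnight UTC on Oct. 30 is still the evening of Oct. 29 at UTC-7,
# so taking .date() after the conversion lands on the previous day.
shifted = roundtrip_date(original, UTC, PACIFIC)

# Reading back in the same zone it was stored in leaves the date alone.
unchanged = roundtrip_date(original, UTC, UTC)
```

If something like this were the cause, the error would always be in the same direction (one day early for zones west of UTC), which is something an audit could confirm or rule out.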
Actual Changes found:
The following changes were recorded in pghistory, though they do not align directly with the date discrepancy:
No specific alterations to filing dates appear in the tracked snapshots.
Next Steps: