Open johnhawkinson opened 6 years ago
I just realized this is in your repo not one that I do PR reviews for. Did you actually want me to review this PR or did I just come up with that in my head?
> Did you actually want me to review this PR or did I just come up with that in my head?
Well, I said:
> Thoughts on that or other aspects welcome. Esp. @mlissner.
So yes, I sought your review :)
Seems like we'll need the full lookup table fairly soon for RSS parsing, and that it's the right way to do this. It probably isn't terribly hard to do...just have to think through 200 courts, I guess. We could split it up if you want?
You define the object you want to make — a dict? — and tell me which ones you want me to do, and I'll crank through 'em.
> just have to think through 200 courts, I guess. We could split it up if you want?
That's a bit silly. There are 204 ECF courts at https://www.pacer.gov/psco/cgi-bin/links.pl. With a few exceptions (appeals courts and `cit`, `cofc`, `jpml`), everything's a district or bankruptcy court with a State/Territory/District of Columbia identifier in the first few characters (the only non-state abbreviations: `dc`, `gu`, `nmi`, `pr`, `vi`). And with the exception of the Northern Mariana Islands (which is not New Mexico) [check me on this, please], the first 2 characters are determinative.
So this just reduces to the mapping of timezones by state. Unless any of the courts in question fall in any of the hokey nonstandard cities within a state (do they? I don't know of any...).
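A rough Python sketch of the prefix rule just described. The function names, the appeals-court ID pattern (`ca1` through `ca11`, `cadc`, `cafc`), and the tiny zone table are illustrative assumptions, not the contents of any existing module. A real table would need every state, plus a per-court answer wherever a state spans more than one zone.

```python
import re

# Courts with no state prefix, as listed above.
NON_STATE_COURTS = {"cit", "cofc", "jpml"}

# Assumed shape of appeals-court IDs: ca1..ca11, cadc, cafc.
APPEALS_RE = re.compile(r"^ca(\d{1,2}|dc|fc)$")

# Deliberately incomplete, for illustration only.
ZONE_BY_PREFIX = {
    "ca": "America/Los_Angeles",
    "ma": "America/New_York",
    "dc": "America/New_York",
    "gu": "Pacific/Guam",
    "pr": "America/Puerto_Rico",
}


def state_prefix(court_id):
    """Return the state/territory prefix of a court ID, or None."""
    court_id = court_id.lower()
    if court_id in NON_STATE_COURTS or APPEALS_RE.match(court_id):
        return None
    # Northern Mariana Islands is the one case where two characters
    # are not enough: "nmi..." must not be read as New Mexico ("nm").
    if court_id.startswith("nmi"):
        return "nmi"
    return court_id[:2]


def zone_for_court(court_id):
    """Look up a zone name by prefix; None means 'not in the table'."""
    return ZONE_BY_PREFIX.get(state_prefix(court_id))
```

Returning `None` for unknown prefixes keeps multi-zone states from being silently assigned a single zone.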
Curiously at least 1 court publishes its timezone information in the CourtInfo.pl page, e.g.:
<Locations>
<name>USDC Northern Indiana</name>
<address>1300 South Harrison Street, Room 1108, Fort Wayne, IN 46802</address>
<email>fwclerks&#064;innd.uscourts.gov</email>
<hours>9:00 a.m. to 4:00 p.m. Eastern time</hours>
<phone>260-423-3000 or 800-745-0265</phone>
</Locations>
(God knows what happens if different divisions of a court have offices in different timezones.)
Edit: Err, crap. Like, in fact, this one:
field | Court Locations |
---|---|
Court's Name | USDC Northern Indiana |
Court's Address | 1300 South Harrison Street, Room 1108, Fort Wayne, IN 46802 |
Court's Phone Number | 260-423-3000 or 800-745-0265 |
Court's Email Address | fwclerks@innd.uscourts.gov |
Court's Hours | 9:00 a.m. to 4:00 p.m. Eastern time |
Court's Name | USDC Northern Indiana |
Court's Address | 5400 Federal Plaza, Suite 2300, Hammond, IN 46320 |
Court's Phone Number | 219-852-6500 or 800-473-0293 |
Court's Email Address | hmdclerks@innd.uscourts.gov |
Court's Hours | 9:00 a.m. to 4:00 p.m. Central time |
Research question!
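One way to start on that research question would be to scan each court's CourtInfo output for zone words in the hours field and flag any court that reports more than one. This is only a sketch: the `<CourtInfo>` wrapper element, the helper name, and the XML form of the Hammond entry are assumptions reconstructed from the two listings above, and the regex only covers the four continental zone names.

```python
import re
import xml.etree.ElementTree as ET

# Zone words as they appear in the hours strings quoted above.
ZONE_WORDS = re.compile(r"\b(Eastern|Central|Mountain|Pacific)\b", re.IGNORECASE)

# Hypothetical wrapper around the two Northern Indiana locations.
SAMPLE = """
<CourtInfo>
  <Locations>
    <name>USDC Northern Indiana</name>
    <address>1300 South Harrison Street, Room 1108, Fort Wayne, IN 46802</address>
    <hours>9:00 a.m. to 4:00 p.m. Eastern time</hours>
  </Locations>
  <Locations>
    <name>USDC Northern Indiana</name>
    <address>5400 Federal Plaza, Suite 2300, Hammond, IN 46320</address>
    <hours>9:00 a.m. to 4:00 p.m. Central time</hours>
  </Locations>
</CourtInfo>
"""


def zones_in_court_info(xml_text):
    """Collect the distinct zone words mentioned in <hours> elements."""
    root = ET.fromstring(xml_text)
    zones = set()
    for hours in root.iter("hours"):
        match = ZONE_WORDS.search(hours.text or "")
        if match:
            zones.add(match.group(1).title())
    return zones


# A court reporting more than one zone needs per-division handling.
zones = zones_in_court_info(SAMPLE)
is_split = len(zones) > 1
```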
Anyhow, getting back to the initial question:
> The more I think about it, the more I am skeptical this is the right way to go and we should just bite the bullet and do the table lookup. Although that doesn't avoid the timezone code ugliness.
I read your answer as saying it's silly to apply this workaround, and that we should fix it "right" (with a table). I'm not 100% convinced, but I'm leaning in that direction.
I dunno if I should close this PR and open a new one for the table-based fix. But I was hoping for feedback on the `tzset`/`time.localtime(calendar.timegm())`/`tzset` dance.
Here's a lookup function (which is not exactly the same as a table): https://github.com/johnhawkinson/ecftimezone
I'm not really confident this was the right architecture. It seemed like a good idea at the time.
Don't report the date to CourtListener in UTC, convert to the local zone. Addresses #4.
Fortunately (today) we only care about dates, not times, so this isn't so serious. [But we should care about times!]
Unfortunately recapupload doesn't know what timezone each court is in, so converting to local time is hard. But as a compromise, convert to US/Pacific, which is less wrong (and a 10pm Pacific filing is much more likely than a 1am Eastern filing).
Also, unfortunately, the only way to control the local zone is to set the TZ environment variable and call tzset(). And then we have to put it back in case there are others in the same process who use time localization.
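For reference, the set-TZ, convert, restore sequence looks roughly like the following. This is a minimal sketch rather than the exact code in this PR: the helper names are made up for illustration, `time.tzset()` is Unix-only, and because `TZ` is process-wide state, the approach is not safe if other threads localize times concurrently.

```python
import calendar
import os
import time
from contextlib import contextmanager


@contextmanager
def temporary_tz(tz_name):
    """Temporarily switch the process-wide local zone, then put it back."""
    saved = os.environ.get("TZ")  # None means TZ was not set at all
    os.environ["TZ"] = tz_name
    time.tzset()  # Unix only
    try:
        yield
    finally:
        if saved is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = saved
        time.tzset()


def utc_struct_to_zone(utc_struct, tz_name="US/Pacific"):
    """Convert a UTC struct_time to a struct_time in tz_name."""
    epoch = calendar.timegm(utc_struct)  # UTC tuple -> seconds since epoch
    with temporary_tz(tz_name):
        return time.localtime(epoch)
```

The `try`/`finally` is there so the original zone is restored even if the conversion raises.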
The more I think about it, the more I am skeptical this is the right way to go and we should just bite the bullet and do the table lookup. Although that doesn't avoid the timezone code ugliness.
Thoughts on that or other aspects welcome. Esp. @mlissner.