johnhawkinson / recapupload

Upload documents to RECAP

recapupload: date tz workaround #5

Open johnhawkinson opened 6 years ago

johnhawkinson commented 6 years ago

Don't report the date to CourtListener in UTC, convert to the local zone. Addresses #4.

Fortunately (today) we only care about dates, not times, so this isn't so serious. [But we should care about times!]

Unfortunately recapupload doesn't know what timezone each court is in, so converting to local time is hard. But as a compromise, convert to US/Pacific, which is less wrong (and a 10pm Pacific filing is much more likely than a 1am Eastern filing).

Also, unfortunately, the only way to control the local zone is to set the TZ environment variable and call tzset(). And then we have to put it back in case there are others in the same process who use time localization.
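
For readers who have not seen this pattern, here is a minimal sketch of the save/set/restore dance, assuming a Unix platform (time.tzset() is not available on Windows). The names temporary_tz and utc_to_zone_date are hypothetical and are not taken from the recapupload source; the sketch only illustrates the idea of converting a UTC struct_time to a date in another zone and then putting TZ back.

```python
import calendar
import os
import time
from contextlib import contextmanager


@contextmanager
def temporary_tz(zone):
    """Temporarily switch the process-wide local zone (hypothetical helper)."""
    saved = os.environ.get("TZ")
    os.environ["TZ"] = zone
    time.tzset()
    try:
        yield
    finally:
        # Put TZ back exactly as we found it, including "unset".
        if saved is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = saved
        time.tzset()


def utc_to_zone_date(utc_struct, zone="US/Pacific"):
    """Return the YYYY-MM-DD date, in `zone`, of a UTC struct_time."""
    epoch = calendar.timegm(utc_struct)  # interpret the struct as UTC
    with temporary_tz(zone):
        local = time.localtime(epoch)    # convert using the temporary zone
    return time.strftime("%Y-%m-%d", local)
```

Note that the TZ change is process-wide while the block is active, so other threads that localize times during that window would see the temporary zone too.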


The more I think about it, the more I am skeptical this is the right way to go and we should just bite the bullet and do the table lookup. Although that doesn't avoid the timezone code ugliness.

Thoughts on that or other aspects welcome. Esp. @mlissner.

mlissner commented 6 years ago

I just realized this is in your repo not one that I do PR reviews for. Did you actually want me to review this PR or did I just come up with that in my head?

johnhawkinson commented 6 years ago

Did you actually want me to review this PR or did I just come up with that in my head?

Well, I said:

Thoughts on that or other aspects welcome. Esp. @mlissner.

So yes, I sought your review :)

mlissner commented 6 years ago

Seems like we'll need the full lookup table fairly soon for RSS parsing, and that it's the right way to do this. It probably isn't terribly hard to do...just have to think through 200 courts, I guess. We could split it up if you want?

mlissner commented 6 years ago

You define the object you want to make — a dict? — and tell me which ones you want me to do, and I'll crank through 'em.

johnhawkinson commented 6 years ago

just have to think through 200 courts, I guess. We could split it up if you want?

That's a bit silly. There are 204 ECF courts at https://www.pacer.gov/psco/cgi-bin/links.pl. With a few exceptions (appeals courts and cit, cofc, jpml), everything's a district or bankruptcy court with a State/Territory/District of Columbia identifier in the first few characters (the only non-state abbrevs: dc, gu, nmi, pr, vi). And with the exception of the Northern Mariana Islands (which is not New Mexico) [check me on this, please], the first 2 characters are determinative.

So this just reduces to the mapping of timezones by state. Unless any of the courts in question fall in any of the hokey nonstandard cities within a state (do they? I don't know of any...).
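
A rough sketch of what that prefix rule could look like, under stated assumptions: the table below is a deliberately tiny, illustrative subset, the IANA zone names are my choice, and the court IDs used in the usage lines (cand, nmd, nmid, ca9) are assumed to follow the usual PACER naming. The function name court_zone is hypothetical. Note that the exceptions have to be checked first, since cofc would otherwise match a "co" prefix and the appeals courts (ca1, ca9, ...) would match "ca".

```python
import re

# Illustrative subset only; a real table would cover every state and territory.
STATE_ZONES = {
    "ca": "America/Los_Angeles",
    "ny": "America/New_York",
    "nm": "America/Denver",
    "dc": "America/New_York",
    "gu": "Pacific/Guam",
    "pr": "America/Puerto_Rico",
    "vi": "America/St_Thomas",
}

# The one case where two characters are not enough: nmi is not New Mexico.
THREE_LETTER = {"nmi": "Pacific/Saipan"}

# Courts without a state prefix (assumed IDs); these need separate handling.
NON_GEOGRAPHIC = {"cit", "cofc", "jpml", "cadc", "cafc"}
APPEALS_RE = re.compile(r"^ca\d+$")


def court_zone(court_id):
    """Return an IANA zone name for a court ID, or None if unknown."""
    cid = court_id.lower()
    if cid in NON_GEOGRAPHIC or APPEALS_RE.match(cid):
        return None
    if cid[:3] in THREE_LETTER:
        return THREE_LETTER[cid[:3]]
    return STATE_ZONES.get(cid[:2])
```

With this subset, court_zone("nmid") gives Pacific/Saipan while court_zone("nmd") gives America/Denver, and court_zone("cofc") gives None. States that span more than one zone are not handled by a table like this.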

Curiously at least 1 court publishes its timezone information in the CourtInfo.pl page, e.g.:

  <Locations>
    <name>USDC Northern Indiana</name>
    <address>1300 South Harrison Street, Room 1108, Fort Wayne, IN  46802</address>
    <email>fwclerks&#064;innd.uscourts.gov</email>
    <hours>9:00 a.m. to 4:00 p.m. Eastern time</hours>
    <phone>260-423-3000 or 800-745-0265</phone>
  </Locations>

(God knows what happens if different divisions of a court have offices in different timezones.)

Edit: Err, crap. Like, in fact, this one:

Court Locations

Court's Name: USDC Northern Indiana
Court's Address: 1300 South Harrison Street, Room 1108, Fort Wayne, IN 46802
Court's Phone Number: 260-423-3000 or 800-745-0265
Court's Email Address: fwclerks@innd.uscourts.gov
Court's Hours: 9:00 a.m. to 4:00 p.m. Eastern time

Court's Name: USDC Northern Indiana
Court's Address: 5400 Federal Plaza, Suite 2300, Hammond, IN 46320
Court's Phone Number: 219-852-6500 or 800-473-0293
Court's Email Address: hmdclerks@innd.uscourts.gov
Court's Hours: 9:00 a.m. to 4:00 p.m. Central time

Research question!
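
One way to start on that research question might be to scan each court's CourtInfo.pl output for zone words in the hours field and flag courts that mention more than one. This is only a sketch: the Locations and hours element names come from the excerpt above, but the CourtInfo root element, the SAMPLE text, and the function name zones_in_court_info are assumptions made up for illustration, and free-text hours will not always name a zone.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical wrapper around the two innd locations quoted above.
SAMPLE = """<CourtInfo>
  <Locations>
    <name>USDC Northern Indiana</name>
    <hours>9:00 a.m. to 4:00 p.m. Eastern time</hours>
  </Locations>
  <Locations>
    <name>USDC Northern Indiana</name>
    <hours>9:00 a.m. to 4:00 p.m. Central time</hours>
  </Locations>
</CourtInfo>"""

ZONE_RE = re.compile(r"\b(Eastern|Central|Mountain|Pacific)\b", re.IGNORECASE)


def zones_in_court_info(xml_text):
    """Collect the distinct zone words mentioned in any <hours> element."""
    root = ET.fromstring(xml_text)
    zones = set()
    for hours in root.iter("hours"):
        match = ZONE_RE.search(hours.text or "")
        if match:
            zones.add(match.group(1).capitalize())
    return zones
```

A court whose result has more than one entry, as the sample does, is one where a single per-court zone cannot be right for every division.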


Anyhow, getting back to the initial question:

The more I think about it, the more I am skeptical this is the right way to go and we should just bite the bullet and do the table lookup. Although that doesn't avoid the timezone code ugliness.

I read your answer as saying it's silly to apply this workaround and that we should fix it "right" (with a table). I'm not 100% convinced, but I'm leaning in that direction.

I dunno if I should close this PR and open a new one for the table-based fix. But I was hoping for feedback on the tzset/time.localtime(calendar.timegm())/tzset dance.

johnhawkinson commented 6 years ago

Here's a lookup function (which is not exactly the same as a table): https://github.com/johnhawkinson/ecftimezone

I'm not really confident this was the right architecture. It seemed like a good idea at the time.