eyeonus / EDDBlink

(No longer maintained) A plugin for TradeDangerous to update market data using EDDB's api files.
GNU Lesser General Public License v3.0
2 stars 0 forks source link

DateTime parsing fails due to locales (ubuntu linux 16.04) #1

Closed JaggeV closed 6 years ago

JaggeV commented 6 years ago

The plugin seems to work fine, except that with finnish locales, the datetime parsing fails for the data in this line: timegm(datetime.datetime.strptime( \ urllib.request.urlopen(url).getheader("Last-Modified"),\ "%a, %d %b %Y %X GMT").timetuple()) problem is the strptime method which is locale dependant. Fix would be to either use some other parser (dateutil perhaps?) or temporarily set locale to some other which can understand the time format given in the headers. Problem with locale approach is that if user does not have the locale installed, then failure will occur. Also the .prices file has disappeared when strptime() throws, dunno if that is a bad thing, since no data is processed yet and still .prices file is gone.

Also as a side note, the parsing of ship vendors take quite a bit of time (at least on this poor virtual machine)

NOTE: Processing Stations, this may take a bit: Start time = 2018-05-02 14:59:04.922691 NOTE: Simultaneously processing ShipVendors. NOTE: Simultaneously processing UpgradeVendors, this will take quite a while. Progress: (60644/67840) 89.39%
Current time is 2018-05-03 10:19:xx.xx so 24h mark is rapidly approaching :) I don't want to repeat -O clean anytime soon.

Otherwise things are looking good, keep up the good work

eyeonus commented 6 years ago

Yeah, it's actually processing the UpgradeVendors that takes all the time. ShipVendors's time isn't anything to sneeze at, but Ho Chih Min UpgradeVendors is completely AWFUL.

I recommend always doing 'skipvend' if you're not doing 'clean', and it's not a bad idea even then.

The lack of a prices file has nothing to do with the error. IF this is your first run of the plugin, or a clean run, the plugin completely deletes the database, which includes all the csv files and the prices file, before doing the import. That's intended.

It's trying to get the timestamp from a file on the web. I did not know that was locale dependent. I'll look into it tomorrow after I've had some sleep.

eyeonus commented 6 years ago

Somebody on the TD forum said changing that line to read "%a, %d %b %Y %H:%M:%S GMT" fixed the issue for that person, would you see if it does for you as well?

When I have time I'll look into using something besides strptime, but I'm hoping that'll work for you.

JaggeV commented 6 years ago

Hi, that does not work, problem being still the locale and the fact that the date string contains localized dates, namely date name (Mon, Tue, ...) and month name (Jan, Feb...) and those will be tried to parse using local setting, in my case finnish, which expects date to be (Ma, Ti...), and month names (Tam, Hel...). Similar problems will arise with different locales, germany, swedish etc...

https://stackoverflow.com/questions/18593661/how-do-i-strftime-a-date-object-in-a-different-locale gives a way to define locale properly in the plugin, but it seems that generally speaking changing locale in program is bad practice. Lib babel is given as an alternative way to parse time strings.

aadler commented 6 years ago

Well, per Mozilla, the format of Last-Modified is:

<day-name>, <day> <month> <year> <hour>:<minute>:<second> GMT:

<day-name>
    One of "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", or "Sun" (case-sensitive).
<day>
    2 digit day number, e.g. "04" or "23".
<month>
    One of "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" (case sensitive).
<year>
    4 digit year number, e.g. "1990" or "2016".
<hour>
    2 digit hour number, e.g. "09" or "23".
<minute>
    2 digit minute number, e.g. "04" or "59".
<second>
    2 digit second number, e.g. "04" or "59".
GMT

    Greenwich Mean Time. HTTP dates are always expressed in GMT, never in local time.

Which should correspond to: "%a, %d %m %Y %H:%M:%S GMT"

which is what Owyhee suggested on the forum. I haven't tried that yet; hopefully tonight.

The bigger question is whether or not EDDB places something like an Etag on he files. If the file content/date is hashed then all you would have to do is add "current hash" to the table and only download/update if local hash <> remote hash.

aadler commented 6 years ago

@JaggeV If we could convert both localModded and dumpModded to formats which only require numbers that would help, although month-day ordering would still be an issue.

An even better option would be to convert both to a seconds (or 10 second window) since epoch date, which should be agnostic to locale.

eyeonus commented 6 years ago

@addler That's what I'm doing with the line that's causing the error- it converts the time to a Unix-epoch number.

aadler commented 6 years ago

But it isn’t matching the last modified format which requires HH:MM:SS GMT, it seems.

On Thu, May 3, 2018 at 5:47 PM Jonathan Jones notifications@github.com wrote:

@addler https://github.com/addler That's what I'm doing with the line that's causing the error- it converts the time to a Unix-epoch number.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/eyeonus/EDDBlink/issues/1#issuecomment-386447678, or mute the thread https://github.com/notifications/unsubscribe-auth/AFVk8SHgvN0MbJkfZTeg9v93W0zmHffdks5tu3sOgaJpZM4Twmns .

-- Sent from Gmail Mobile

eyeonus commented 6 years ago

%X The time, using the locale's time format.

http://pubs.opengroup.org/onlinepubs/009695399/functions/strptime.html

I literally copied the format from TD. I'm surprised no one had a problem before this plugin.

aadler commented 6 years ago

Unless you’re locale is GMT, wouldn’t that throw an error?

On Thu, May 3, 2018 at 5:55 PM Jonathan Jones notifications@github.com wrote:

%X The time, using the locale's time format.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/eyeonus/EDDBlink/issues/1#issuecomment-386449243, or mute the thread https://github.com/notifications/unsubscribe-auth/AFVk8f06W5sWpVJcM9Ow4jm6JMasLSvHks5tu3y4gaJpZM4Twmns .

-- Sent from Gmail Mobile

eyeonus commented 6 years ago

Nope. GMT isn't a locale, it's a timezone. en_US is a locale.

eyeonus commented 6 years ago

I think I'm going to have to write my own string parser. I don't want to use dateutil because that requires the users to download something.

Tromador commented 6 years ago

This is supposed to be a power users tool. It's command line, it's complicated and it doesn't "just work" without some technical ability.

I'm not sure why you can't include "pip install blah" as part of the install instructions with a reasonable expectation that users can deal with that.

As soon as you say the dreaded word timezone, I immediately think of this and recommend using the library someone else built.