zverok / wheretz

Fast and precise time zone by geo coordinates lookup
MIT License

Proposal: region simplification for data size reduction #3

Open zverok opened 8 years ago

zverok commented 8 years ago

This is a proposal awaiting someone's ideas/verification.

The idea:

So, if somebody could look at the simplify branch and offer some opinions, it would be really cool.

For example, some amount of simplification can be seen here -- it is the world map with a simplification factor of 0.1 -- you can experiment with the factor in script/simplify.rb.
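
For readers wondering what a simplification factor controls, here is a minimal sketch of one common approach, the Ramer-Douglas-Peucker algorithm, in plain Ruby. It is not necessarily what script/simplify.rb does. The names `distance_to_line`, `simplify` and `tolerance` are hypothetical, and the tolerance is assumed to be in the same units as the coordinates (degrees).

```ruby
# Perpendicular distance from point p to the line through a and b.
# Points are [x, y] pairs; falls back to point distance when a == b
# (which happens for closed polygon rings).
def distance_to_line(p, a, b)
  dx = b[0] - a[0]
  dy = b[1] - a[1]
  length = Math.hypot(dx, dy)
  return Math.hypot(p[0] - a[0], p[1] - a[1]) if length.zero?

  (dy * p[0] - dx * p[1] + b[0] * a[1] - b[1] * a[0]).abs / length
end

# Ramer-Douglas-Peucker: keep the endpoints, and recursively keep the point
# farthest from the chord whenever it lies more than `tolerance` away.
def simplify(points, tolerance)
  return points if points.size < 3

  first = points.first
  last = points.last
  index = nil
  max = 0.0
  points[1..-2].each_with_index do |pt, i|
    d = distance_to_line(pt, first, last)
    if d > max
      max = d
      index = i + 1
    end
  end

  # Everything in between is close enough to the chord: drop it.
  return [first, last] if max <= tolerance

  left = simplify(points[0..index], tolerance)
  right = simplify(points[index..-1], tolerance)
  left[0..-2] + right # avoid duplicating the shared split point
end
```

A larger tolerance drops more points, and past a certain size it flattens narrow features of a border entirely, so the factor trades data size against lookup accuracy.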

jotolo commented 5 years ago

@zverok What happened with the simplify branch? It's good to go, right?

bf4 commented 4 years ago

Another idea is to break the gem into a meta-gem that has all the logic and lets you specify in other ways which data to release and package. For example, only pull in the 'data' dir if there's no file in some known location, or tar.gz each file in the data folder when packaging for RubyGems and only untar a given file when needed.
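
The "only unpack a given file when needed" part could look roughly like the sketch below. It assumes one gzip-compressed JSON file per zone rather than a tar archive. The class `LazyZoneData` and its methods are hypothetical and not part of the gem.

```ruby
require 'json'
require 'zlib'

# Hypothetical lazy loader: each zone lives in its own gzip-compressed JSON
# file and is only decompressed the first time it is requested.
class LazyZoneData
  def initialize(dir)
    @dir = dir
    @cache = {}
  end

  # Write one zone as <dir>/<name>.json.gz ('/' in zone names becomes '__').
  def store(name, data)
    Zlib::GzipWriter.open(path_for(name)) { |gz| gz.write(JSON.generate(data)) }
  end

  # Decompress and parse on first access, then serve from memory.
  def fetch(name)
    @cache[name] ||= Zlib::GzipReader.open(path_for(name)) { |gz| JSON.parse(gz.read) }
  end

  # Names of the zones that have been decompressed so far.
  def loaded
    @cache.keys
  end

  private

  def path_for(name)
    File.join(@dir, "#{name.gsub('/', '__')}.json.gz")
  end
end
```

This keeps the files on disk small and memory proportional to the zones actually queried, at the cost of a decompression step on the first lookup for each zone.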

zverok commented 3 years ago

@jotolo (Sorry for the late response... Well, like, almost 2 years late, so I am not sure you are still interested.) It turns out that, say, for Europe (a lot of complicated borders) simplified data is absolutely unacceptable, as it might miss large cities and even whole countries.

@bf4 Actually, it seems to me that some much more efficient encoding (than just dumb JSON) is possible and necessary for any serious usage, but it seems almost nobody uses the gem, so... I don't actually have much incentive to work on its optimization.
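
As an illustration of what a denser encoding than JSON might mean, the sketch below stores coordinates as fixed-point 32-bit integers using `Array#pack`. This is one assumed option, not something the gem implements. `SCALE`, `pack_ring` and `unpack_ring` are hypothetical names.

```ruby
# Fixed-point scale: 6 decimal places, roughly 0.1 m at the equator.
# 180 * 1_000_000 still fits comfortably in a signed 32-bit integer.
SCALE = 1_000_000

# Encode [[lng, lat], ...] as a flat string of little-endian signed 32-bit ints.
def pack_ring(points)
  points.flatten.map { |deg| (deg * SCALE).round }.pack('l<*')
end

# Decode the byte string back into [[lng, lat], ...] pairs of floats.
def unpack_ring(bytes)
  bytes.unpack('l<*').map { |n| n.to_f / SCALE }.each_slice(2).to_a
end
```

Each point takes exactly 8 bytes this way, compared with 20 or more bytes for the same pair written out as JSON text with six decimal places, and decoding avoids a JSON parse.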

trevorturk commented 3 years ago

@zverok I've been using the gem happily in the past, but switched to the geonames.org API to keep my memory use on Heroku down. I came back to look at the options and I think yours is still a good option -- thank you! I noticed no Ruby option is listed here: https://github.com/evansiroky/timezone-boundary-builder#lookup-libraries -- do you think your gem is still reasonable to use, or perhaps there's a different recommended way now? If you're interested, please drop me a line and I'd be happy to sponsor some development, as I did have good luck using your gem in the past!

zverok commented 3 years ago

@trevorturk Honestly, I don't know the proper answer to this. I am trying to maintain the library (it is simple enough for that), and, like, today I pushed 0.0.6 (updating the data to 2020d... released last November), and it works properly under all supported Ruby versions.

Other than that -- I don't know. The task of making it more efficient (in both performance and memory) is interesting enough, but I have a long list of projects I am interested in, and typically I work on projects/ideas related to each other. wheretz was created during the work on reality, which is now dormant (because I became disenchanted with the idea of a set of open real-world data libraries in Ruby).

So, as you might see, it doesn't come down to sponsorship (I have a pretty well-paying job, and all of my side projects are just for the fun/the cause I believe in at the time).

Honestly, if somebody finds it useful and this is still the best option for Ruby, it could be a good enough cause to try to make it better... But I am not sure how much effort it will require.

trevorturk commented 3 years ago

No worries at all, everything makes sense. Do you mind if I submit a PR to add yours to the list here? https://github.com/evansiroky/timezone-boundary-builder#lookup-libraries -- I just wasn't sure if you decided not to maintain the library anymore. In any case, thank you, it's been a great resource! 😎

terryyin commented 3 years ago

BTW, since 0.0.6 it became much slower.

zverok commented 3 years ago

@terryyin I'll try to look into it on the weekend. But the code hasn't changed at all, only the data was updated, so it could be something to do with data quality (more detailed borders with more points). If you can provide some particular request examples that became slower, it would be very helpful.
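
One way to collect such examples is a small timing harness like the sketch below. `slowest_lookups` is a hypothetical helper. The lookup is passed in as any callable, so the assumption about the gem's API stays outside the code: with wheretz installed it would presumably be `WhereTZ.method(:lookup)`.

```ruby
require 'benchmark'

# Time a lookup callable for each [lat, lng] pair and return the results
# sorted from slowest to fastest as [[lat, lng], seconds] pairs.
def slowest_lookups(coordinates, lookup)
  timings = coordinates.map do |lat, lng|
    [[lat, lng], Benchmark.realtime { lookup.call(lat, lng) }]
  end
  timings.sort_by { |_, seconds| -seconds }
end
```

Running the same coordinate list against 0.0.5 and 0.0.6 and comparing the top entries would show whether the slowdown is concentrated in zones with particularly detailed borders.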