tomayac / local-reverse-geocoder

Local reverse geocoder for Node.js based on GeoNames data
Apache License 2.0

GeoNames Dump (auto-)update issues #24

Open tavinus opened 6 years ago

tavinus commented 6 years ago

Hi there! Thanks for this app!
I am trying to understand how the Geonames dump config works.

You say:

By default, the local GeoNames dump data gets refreshed each day. You can override this behavior by removing the timestamp from the files in the ./geonames_dump download folder.

But that does not seem to apply.
I say that because I didn't remove the timestamp from the files in geonames_dump, but they also do not seem to have received any updates in the past 2 months.

This is the geonames_dump structure (I am disabling some admin codes on init):

$ tree geonames_dump/
geonames_dump/
|-- admin1_codes
|   `-- admin1CodesASCII_2018-06-08.txt
`-- cities
    `-- cities1000_2018-06-08.txt

The files and folders in the dump directory also have the same date/time of 2018-06-08, as their names imply.

So my main questions are:

  1. Is the described functionality of auto-update disabled or broken?
  2. Am I right to suppose it is not updating?
  3. How would I manually update the dump? (just move/remove the geonames_dump folder and run?)
  4. Is there any way to configure the updates? (e.g., make it run monthly)

I am a bit new to Node.js and could not really find many hints in the code at this point.
It would be nice to have a way to initialize a new dump while still using the old one (no downtime on the service). It should be easy to just do it all in a temporary folder and then move it when done, right?
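
The temporary-folder idea could be sketched roughly as follows. This is not part of local-reverse-geocoder: `swapDump` and the `_old` backup name are hypothetical, and the sketch assumes both directories sit on the same filesystem so that the rename is a cheap move. It only covers moving the folders; how a running server would load the new data is a separate question.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: replaces liveDir with freshDir and keeps the
// previous contents in "<liveDir>_old" as a fallback.
function swapDump(liveDir, freshDir) {
  const backupDir = `${liveDir}_old`;
  // Drop any backup left over from an earlier swap.
  fs.rmSync(backupDir, { recursive: true, force: true });
  // Move the current dump aside, if there is one.
  if (fs.existsSync(liveDir)) {
    fs.renameSync(liveDir, backupDir);
  }
  // Move the freshly built dump into place.
  fs.renameSync(freshDir, liveDir);
  return backupDir;
}

// Demo in a throwaway directory.
const base = fs.mkdtempSync(path.join(os.tmpdir(), 'swap-demo-'));
const live = path.join(base, 'geonames_dump');
const fresh = path.join(base, 'geonames_dump_tmp');
fs.mkdirSync(live);
fs.mkdirSync(fresh);
fs.writeFileSync(path.join(live, 'old.txt'), 'old');
fs.writeFileSync(path.join(fresh, 'new.txt'), 'new');

swapDump(live, fresh);
console.log(fs.readdirSync(live)); // the live folder now holds only the fresh file
```

There is still a brief moment between the two renames in which the live folder does not exist, so this reduces the window rather than eliminating it.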

Edit: I would just like to make sure it does not update every day, but does update every 3 months (4 times per year seems more than fine for city limits; having a force-update feature would also be useful).

Cheers!
Gus

tavinus commented 6 years ago

Hi again.
I was able to test now during the night and can already answer some of the questions.

I have a system service running for this, so first I stopped the service.
Then I moved the current geonames_dump to geonames_dump_old.
Then I restarted the service and all was rebuilt.

It was also quite fast, but maybe that is because I only use 2 fields.
The new files have today's date:

$ tree geonames_dump
geonames_dump
|-- admin1_codes
|   `-- admin1CodesASCII_2018-08-02.txt
`-- cities
    `-- cities1000_2018-08-02.txt

So the data does seem to have been updated, and it also seems that auto-update was indeed not working.
Moving the folder is one way to force updates, but it would be better to have something more elaborate.
It is also still a bit unclear whether the automatic updates are running or not (and when).

There is no command/script to perform an update, right?

Cheers! Gus

tomayac commented 6 years ago

Hi Gus,

The updates currently only happen when you restart the server, not while the server is kept running. I agree this is not clear from reading the docs.

There might be a way to make zero-downtime updates happen, but it would probably require a bit of restructuring work. I am more than happy to merge Pull Requests if you have the bandwidth to make this happen, but unfortunately can't work on this myself at the moment.

Cheers, Tom

tavinus commented 6 years ago

I see.
Is it when I restart the machine or the geoname service that the dump is updated? (init?)
How is that checked in the code? (I tried to find it without much success.)
I found some checks for files with the timestamps but could not nail down how that works yet.

I am not used to Node.js, even though JavaScript is quite easy for me.
I had some trouble when trying to read the code, so I am not sure I can do it.
The main problem for me at this point would be understanding the code properly.

Anyway, thanks for the response!
Cheers!
Gus

tomayac commented 6 years ago

"Is it when I restart the machine or the geoname service that the dump is updated?"

When you restart the Node.js server the library checks if it needs to update the data.

The check is currently pretty simple and not flexibly implemented: each day a new file name suffix is used ((new Date()).toISOString().substr(0, 10), which returns, e.g., "2018-08-09"), and a check is run for whether a file with that name exists. You can see this in action in various locations of the code. Hope this helps you understand the logic.
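
As an illustration of the check described above, here is a minimal sketch. It is not the library's actual code: `datedFileName` and `needsDownload` are hypothetical names, and the `<name>_<date>.txt` pattern is taken from the file listings earlier in this thread. It uses `slice(0, 10)`, which gives the same result as `substr(0, 10)` here.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: builds a dated file name such as
// "cities1000_2018-08-09.txt" from a base name and a date.
function datedFileName(baseName, date = new Date()) {
  const suffix = date.toISOString().slice(0, 10); // "YYYY-MM-DD" in UTC
  return `${baseName}_${suffix}.txt`;
}

// Hypothetical helper: true when no file carrying the suffix for `date`
// exists in `dir`, which is the case in which a download would be needed.
function needsDownload(dir, baseName, date = new Date()) {
  return !fs.existsSync(path.join(dir, datedFileName(baseName, date)));
}

// Demo: a file written on 2018-06-08 satisfies the check only on that day.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'suffix-demo-'));
const june8 = new Date('2018-06-08T10:00:00Z');
fs.writeFileSync(path.join(demoDir, datedFileName('cities1000', june8)), '');

console.log(needsDownload(demoDir, 'cities1000', june8)); // false
console.log(needsDownload(demoDir, 'cities1000', new Date('2018-08-02T10:00:00Z'))); // true
```

Because the check only runs at startup, a server that keeps running never reaches the second case.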

If I were to implement this today, I would make it flexible, i.e., check if the data is older than a threshold I could (per file) individually set. Sorry for not properly implementing it in the first place… 🙁
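
A threshold-based check along those lines might look like the sketch below. It is an assumption about how such a check could be written, not something the library provides: `isStale`, `maxAgeDays`, and the per-file values are hypothetical, and the file's modification time stands in for the date suffix.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: true when the file is missing or its modification
// time is more than maxAgeDays in the past.
function isStale(filePath, maxAgeDays, now = Date.now()) {
  let stats;
  try {
    stats = fs.statSync(filePath);
  } catch (err) {
    if (err.code === 'ENOENT') return true; // a missing file counts as stale
    throw err;
  }
  return now - stats.mtimeMs > maxAgeDays * DAY_MS;
}

// Hypothetical per-file thresholds, in days.
const maxAgeDays = { 'cities1000.txt': 90, 'admin1CodesASCII.txt': 120 };

// Demo: back-date two files by 100 days and check each against its threshold.
const staleDir = fs.mkdtempSync(path.join(os.tmpdir(), 'stale-demo-'));
const hundredDaysAgo = new Date(Date.now() - 100 * DAY_MS);
for (const name of Object.keys(maxAgeDays)) {
  const file = path.join(staleDir, name);
  fs.writeFileSync(file, '');
  fs.utimesSync(file, hundredDaysAgo, hundredDaysAgo);
  console.log(name, isStale(file, maxAgeDays[name]));
}
```

With these values, the 100-day-old cities file exceeds its 90-day threshold while the admin1 file is still within its 120 days.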

Garamani commented 2 years ago

Greetings,

This is local_reverse_geocoder debug info:

[Image: debug_info]

This is the image to see the files in the server:

[Image: files_in_server]

The cities15000.txt file is available with no timestamp, but it seems that cities15000 still gets downloaded when the server is started.

tomayac commented 2 years ago

For https://github.com/tomayac/local-reverse-geocoder/issues/24#issuecomment-1251062343, I have just published v0.12.2 that makes sure the overridden cities name is reflected properly.

Garamani commented 2 years ago

@tomayac Thank you. I updated to the latest version, but the problem for me was the name of the "cities" folder. In the latest local_reverse_geocoder code, the "cities" folder inside geonames_dump should be named "cities15000" or "cities1000".

Now it works. The file geonames_dump/cities15000/cities15000.txt exists, and local_reverse_geocoder doesn't try to download or replace it.
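
For anyone pre-seeding the dump folder, the layout described above can be checked with a small sketch like this one. `seededCitiesPath` is a hypothetical helper, not part of the package, and the path pattern is only the one reported in this comment.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: path of an undated, pre-seeded cities file,
// following the <dumpDir>/<citiesName>/<citiesName>.txt layout above.
function seededCitiesPath(dumpDir, citiesName) {
  return path.join(dumpDir, citiesName, `${citiesName}.txt`);
}

// Demo: create the layout in a throwaway directory and verify it.
const dumpDir = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'seed-demo-')), 'geonames_dump');
const seeded = seededCitiesPath(dumpDir, 'cities15000');
fs.mkdirSync(path.dirname(seeded), { recursive: true });
fs.writeFileSync(seeded, '');

console.log(fs.existsSync(seeded)); // true
```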

Thanks again for this great package 🙏