openbmap / radiocells-scanner-android

WLAN and cell tower scanner for Radiocells.org
https://www.radiocells.org

Re-add download of full cell/wifi catalog #170

Open mvglasow opened 7 years ago

mvglasow commented 7 years ago

What steps will reproduce the problem? Go to the web site, navigate to Downloads > Cell and WiFi data and choose Worldwide dataset.

What is the expected output? What do you see instead? I'd expect the full cell/wifi catalog as it used to be. Instead, I get a 404 Not Found.

What version are you using? On what operating system? Does not apply.

Please provide any additional information below. Similar issues arise when trying to download the wifi catalog from Radiobeacon or the NLP backend – only individual countries are selectable but there's no way to download the full set.

I understand that this change was probably introduced because of bandwidth issues caused by many users downloading the whole thing over and over again, but limiting downloads to one country comes with other side effects. For example, I need to download multiple catalogs and switch manually whenever I cross a border.

I see three possible approaches to tackle this:

1. Simply restore the option to download the full catalog. What probably contributed to the bandwidth issues was that originally one needed to download the whole catalog or nothing at all, because nothing else was available. Now that users can limit their downloads to particular areas of interest, this may be less of an issue.

2. Host the catalogs on a public cloud service. Maybe Google Drive, maybe even GitHub, if their terms of service permit that. Anything that permits sharing (somewhat) arbitrary files over 1 GB in size and allows downloads by anonymous users would work.

3. Implement differential updates. Abandon the idea of the catalog being a file of which clients periodically download a new version. Rather, start with a blank database on the client into which the client loads the data it needs. (This can still be in the form of .sqlite databases, of course.) Downloading the whole set of data is only necessary the first time. After that, clients/users just fetch those records which were added or changed since the last update. And since the catalog already has a "last updated" timestamp column, clients have all the information they need to merge two databases; it would just need to be implemented.

On the server side, this could be implemented in the following way:

With client-side support, databases (full and diffs) could then be imported in any order. It comes with some coding but is by far the most elegant solution – and the most efficient one in terms of bandwidth.
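The client-side merge described above could look roughly like the following sketch. The table name `wifi_zone` and the columns `bssid`, `latitude`, `longitude` and `last_updated` are assumptions made for illustration, not the actual Radiocells schema, and the timestamps are assumed to be ISO 8601 strings so that plain string comparison orders them correctly. Because the newer record always wins, the result does not depend on the order in which full catalogs and diffs are imported.

```python
import sqlite3

# Hypothetical schema: the real catalog's table and column names may differ.
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS wifi_zone ("
    "bssid TEXT PRIMARY KEY, latitude REAL, longitude REAL, last_updated TEXT)"
)


def merge_catalog(local: sqlite3.Connection, incoming: sqlite3.Connection) -> int:
    """Merge rows from `incoming` into `local`, keeping the newer record per BSSID.

    Returns the number of rows that were inserted or replaced.
    """
    local.execute(SCHEMA)
    changed = 0
    rows = incoming.execute(
        "SELECT bssid, latitude, longitude, last_updated FROM wifi_zone"
    )
    for bssid, lat, lon, updated in rows:
        existing = local.execute(
            "SELECT last_updated FROM wifi_zone WHERE bssid = ?", (bssid,)
        ).fetchone()
        # Skip records the client already holds in the same or a newer version.
        # ISO 8601 timestamps (an assumption) compare correctly as strings.
        if existing is not None and existing[0] >= updated:
            continue
        local.execute(
            "INSERT OR REPLACE INTO wifi_zone "
            "(bssid, latitude, longitude, last_updated) VALUES (?, ?, ?, ?)",
            (bssid, lat, lon, updated),
        )
        changed += 1
    local.commit()
    return changed
```

A client would call `merge_catalog` once for the initial full download and then once per diff file. Deleted records are not handled here; that would need an extra marker in the diff.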

wish7code commented 7 years ago

Hey Michael,

thanks for reporting. Quick fix: the 404 was due to a typo only; the worldwide data has been re-enabled.

> Host the catalogs on a public cloud service

Google Drive cancelled our free plan and GitHub is limited to 25 MB per file :-( Nevertheless, bandwidth is currently not an issue, so we might continue to self-host for a while.

> Implement differential updates

Database design is currently under investigation: mapsforge recently introduced a blazing fast, SpatiaLite-enabled POI database extension (https://github.com/mapsforge/mapsforge/blob/master/docs/POI.md). Experiments with the mapsforge SpatiaLite format in Radiobeacon look very promising. With their format we might be able to solve the performance issues on long tracks (#92).

mvglasow commented 7 years ago

Thanks, looks like the DB is up again. There used to be a JSON with version information at http://radiocells.org/default/database_version.json – where did that end up?

wish7code commented 7 years ago

It is now available at 'country' level at https://radiocells.org/downloads/catalog_downloads.json. Maybe I can somehow hack 'global' version info into that list too.

mvglasow commented 7 years ago

Ah... my use case was to check if a new version of the catalog is available as part of a shell script, which I do via a simple diff (if the remote JSON differs from the last one I got, it means there are changes). For this I could simply use the new JSON, since changes to the full catalog would mean that at least one country extract got an update. So in fact I can work around the missing "old" JSON.
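The diff-based check described above can be sketched as follows, here in Python rather than shell. The URL is the one given earlier in this thread. The function names and the cache file are illustrative assumptions, and the comparison is byte-for-byte, so any change in the served JSON (even reordering) counts as an update.

```python
import urllib.request
from pathlib import Path

CATALOG_URL = "https://radiocells.org/downloads/catalog_downloads.json"


def fetch(url: str = CATALOG_URL) -> bytes:
    """Download the version JSON and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def has_changed(remote: bytes, cache: Path) -> bool:
    """Return True and refresh the cache if `remote` differs from the cached copy.

    A missing cache file counts as a change, so the first run always reports True.
    """
    if cache.exists() and cache.read_bytes() == remote:
        return False
    cache.write_bytes(remote)
    return True
```

A script would then call `has_changed(fetch(), Path("catalog_downloads.json"))` and start a catalog download only when it returns `True`.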

Though, on the other hand, being able to download the full catalog from the backend would be useful...