sobrus / FastLacellsGenerator

Fast lacells.db database generator script for LocalGSMBackend by n76
GNU General Public License v3.0

[Feature request] Allow saving the downloaded files and reusing them #16

Open Iey4iej3 opened 3 years ago

Iey4iej3 commented 3 years ago

The downloaded file is large. To avoid redownloading it, I propose adding an option to save the downloaded file instead of piping it straight into processing, and an option to reuse an existing downloaded file. This is particularly useful when there are errors in the config file (such as in the RADIO and MCC settings), and also because downloads of the OCID database are restricted.
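Something like the following is what I have in mind (a minimal sketch only; the URL, the token, and the `process_csv` step are placeholders, not flg's actual code):

```sh
# Sketch only: cache the raw dump with tee while still streaming it into
# processing. The URL, token, and process_csv step are placeholders.
RAW_CACHE="cell_towers.csv.gz"
OCID_URL="https://opencellid.org/ocid/downloads?token=YOUR_TOKEN&type=full&file=cell_towers.csv.gz"

if [ -f "$RAW_CACHE" ]; then
    # reuse the previously downloaded file
    gunzip -c "$RAW_CACHE" | ./process_csv
else
    # save a copy of the compressed stream while processing it on the fly
    wget -qO- "$OCID_URL" | tee "$RAW_CACHE" | gunzip -c | ./process_csv
fi
```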

sobrus commented 3 years ago

Hi, there is already an option to make backup files (which are used in case of a download error), and also an option to keep the source CSV files after processing. Since the files are updated daily and this script's whole purpose is to refresh your database with new data, I don't think reusing old files would be that useful (except after a download error).
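For illustration, the two knobs look roughly like this (the variable names below are made up; the real option names are defined at the top of the flg script):

```sh
# Variable names here are hypothetical; the real option names are
# defined in the configuration section at the top of the flg script.
ENABLE_BACKUP=1    # keep a copy of the previous lacells.db, restored on download failure
KEEP_CSV_FILES=1   # keep the processed source CSV files after the run
```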

Iey4iej3 commented 3 years ago

Are you referring to https://github.com/sobrus/FastLacellsGenerator/blob/46ca9df8c77d7460a7e987e968ac3161284c383c/flg#L46 for the backup? That backup is made after processing (not before): https://github.com/sobrus/FastLacellsGenerator/blob/46ca9df8c77d7460a7e987e968ac3161284c383c/flg#L63 and https://github.com/sobrus/FastLacellsGenerator/blob/46ca9df8c77d7460a7e987e968ac3161284c383c/flg#L74. I don't see where the downloaded files (i.e. the original database dumps) are backed up.

sobrus commented 3 years ago

OK, I see what you mean. Indeed, this script does not save the raw downloaded files. While I agree that some use cases could benefit from this feature, the script is focused on speed (as the name suggests) and simplicity (so that anyone can easily modify it for their own purposes).

It was intended to perform well on an underpowered Atom device (or a low-end Android phone), where the original n76 lacells-creator script was simply too slow. To save time, I/O, and memory (especially if processing is done on tmpfs), it preprocesses the data on the fly as the files are downloaded. Adding this feature would slow it down considerably (processing would then have to run over fairly large local files) and would also reduce its parallelism (nothing else would happen while the raw files were being downloaded).
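Roughly, the pipeline looks like this (a simplified sketch with placeholder URLs and filtering, not the actual flg code):

```sh
# Simplified sketch of the streaming approach, not the actual flg code.
# URLs are placeholders; in the public OCID CSV dump, column 2 is the MCC.
MCC="260"   # example: keep only one country's records

filter() {
    # keep only rows for the selected MCC, on the fly
    awk -F, -v mcc="$MCC" '$2 == mcc'
}

# both transfers run in parallel and are filtered as the data arrives;
# the raw dumps never touch the disk
wget -qO- "$OCID_URL" | gunzip -c | filter > ocid.csv &
wget -qO- "$MLS_URL"  | gunzip -c | filter > mls.csv &
wait
```

Because each download is consumed as a stream, the transfers and the filtering overlap; saving raw files first would serialize the download and processing phases.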

Moreover, most users don't change region often and use only a small subset of the data, so saving a snapshot of the whole-world coverage database usually makes little sense anyway.

Adding it as an option, on the other hand, would compromise the code's simplicity.

That's why I don't want to implement this feature here. Please treat this as a simple "base" script, and feel free to fork or modify it to match your own purposes and use cases. In fact, I don't use this version myself; I have a heavily modified version that runs as a cron job. It can automatically retry failed downloads and has extended logging, neither of which you will find here either.
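As a rough idea of what that can look like (a sketch only, not my actual modified script; names and paths are illustrative):

```sh
# A sketch only, not the actual modified script: retry failed downloads
# and append simple log lines, suitable for a cron-driven wrapper.
LOG="$HOME/flg.log"

fetch_with_retry() {
    url="$1"; out="$2"; tries=3
    while [ "$tries" -gt 0 ]; do
        if wget -qO "$out" "$url"; then
            echo "$(date) OK: $url" >> "$LOG"
            return 0
        fi
        tries=$((tries - 1))
        echo "$(date) FAIL, $tries tries left: $url" >> "$LOG"
        sleep 60
    done
    return 1
}

# usage: fetch_with_retry "$OCID_URL" /tmp/cell_towers.csv.gz
```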