openAIP / openaip

Public issue tracker of www.openaip.net.

Add world files to the Google Cloud Storage bucket #317

Open kebekus opened 10 months ago

kebekus commented 10 months ago

Dear Stephan,

As far as I can see, the GCS bucket contains nation-specific files with aviation data. Would it be possible to generate similar files that contain the data for the whole world?

I need this data daily to generate aviation maps for Enroute Flight Navigation. Currently, I use the REST JSON API, which is cumbersome. Because of the limit on the maximum item count retrieved per page, I have to split my requests into zillions of smaller ones that download one page each. It would be much easier for me (and produce less load on your server) if I could download GeoJSON files with all the relevant data from GCS.

Best wishes, and thanks again,

Stefan.

reskume commented 10 months ago

Hi Stefan,

Do you use the openAIP Core API to download content for the Enroute app, or do you use GCS with a dedicated library to download the GeoJSON files from there? If you use the openAIP Core API, it may be beneficial to get all the GeoJSON files via GCS instead. You can also create a filter for the files you want and then simply loop over all files and download them in parallel. Meanwhile, I will think a bit about adding world files, but this isn't an easy one: adding world exports will, to some extent, degrade the resilience of the system against unexpected outages and service disruptions. Currently, each task is rescheduled until it completes successfully. Since each export (except the US one) doesn't take much time, this approach scales very well. But if we create world exports, the probability of problems with the exports will increase drastically, and it will require a more complex solution to reach the current level of reliability.

Cheers,

Stephan
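
The filter-and-loop approach suggested above could look roughly like this in Python. It is a minimal sketch under stated assumptions, not openAIP tooling: the bucket name is a placeholder, the list of object names is assumed to be already known, and the objects are assumed to be publicly readable under the usual `https://storage.googleapis.com/<bucket>/<object>` URL scheme.

```python
import fnmatch
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import quote
from urllib.request import urlopen

# Hypothetical placeholder: substitute the real openAIP bucket name.
BUCKET = "example-openaip-bucket"


def select_files(names, pattern="*.geojson"):
    """Keep only the object names that match a shell-style pattern."""
    return [n for n in names if fnmatch.fnmatch(n, pattern)]


def public_url(bucket, name):
    """Build the public download URL of an object in a GCS bucket."""
    return f"https://storage.googleapis.com/{bucket}/{quote(name)}"


def fetch(url):
    """Download one object and return its raw bytes."""
    with urlopen(url, timeout=60) as response:
        return response.read()


def download_all(names, bucket=BUCKET, fetch=fetch, workers=8):
    """Download the named objects in parallel; returns {name: bytes}."""
    urls = [public_url(bucket, n) for n in names]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(names, pool.map(fetch, urls)))
```

Passing `fetch` as a parameter keeps the parallel loop separate from the HTTP call, so retries or authentication can be added without touching the rest.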

kebekus commented 10 months ago

Dear Stephan,

I do use the core API, and I will experiment with downloads from GCS, as you suggested. Thanks!

Best wishes,

Stefan.

BlackFlash5 commented 9 months ago

I'd be interested in world files as well.

> You can also create a filter for the files you want and then simply loop over all files and download them in parallel.

Might be a dumb question, but wouldn't this be a solution for you as well if you are unsure about the resilience of your own service? After every export task is completed, you could pull everything, merge it and export the merged world files. This way, x developers wouldn't have to work around this by pulling a file for each country and merging them themselves.
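
On the client side, the merge step described above can be quite small. The following is a minimal sketch that assumes each country file is a GeoJSON `FeatureCollection`; the function name is illustrative and nothing here reflects how openAIP builds its exports.

```python
import json


def merge_feature_collections(collections):
    """Concatenate the features of several GeoJSON FeatureCollections."""
    features = []
    for collection in collections:
        if collection.get("type") != "FeatureCollection":
            raise ValueError("expected a GeoJSON FeatureCollection")
        features.extend(collection.get("features", []))
    return {"type": "FeatureCollection", "features": features}


def merge_geojson_texts(texts):
    """Parse several GeoJSON documents and return the merged document as text."""
    merged = merge_feature_collections(json.loads(t) for t in texts)
    return json.dumps(merged)
```

This only concatenates features. It does not remove duplicates, so a feature that appears in more than one country file would appear more than once in the result.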

reskume commented 9 months ago

The current situation is that we are not able to merge data within our current logic. We could only create the complete list of country files and then combine them into world files, but one export run contains more than 1000 export files. The exports happen in parallel, and each job is decoupled from the others to give the best resilience against possible service outages etc. and not lose anything on the way. This makes it currently impossible to know when every export is finished without adding additional logic to control that.

I'm not saying that this is impossible to implement, but the effort to get it right and make it reliable is simply not worth the benefit. We have a 1000-item limit per page on the API endpoints. For example, if you want to download the complete airports data, a client has to loop over all pages, which comes to a total of 46 requests that can be sent in parallel. At least for me, when run in parallel, this happens quite fast and requires 6 lines of code. The same code can be used to request the whole data set for each provided endpoint by just swapping out the requested URL.

Also, please be aware that the bucket files are created only once per day. So, if you want the newest data, you always have to use the API. In the worst case, the exports are more than one day behind the API: if there have been any problems with the exports, the system holds them back until it sees fit to try them again.
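
The page loop described above could be sketched as follows. This is an illustration under assumptions rather than the code referred to in the comment: `fetch_page` is a caller-supplied function that performs the HTTP request for one page, and the response field names `items` and `totalPages` are assumed here, so check them against the API documentation.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def pages_needed(total_count, limit=1000):
    """Number of requests needed to page through total_count items."""
    return math.ceil(total_count / limit)


def fetch_all_items(fetch_page, workers=8):
    """Fetch page 1 to learn the page count, then fetch the rest in parallel.

    fetch_page(page_number) must return a dict with the (assumed) keys
    "items" and "totalPages".
    """
    first = fetch_page(1)
    total_pages = first["totalPages"]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        rest = list(pool.map(fetch_page, range(2, total_pages + 1)))
    items = list(first["items"])
    for page in rest:
        items.extend(page["items"])
    return items
```

With a limit of 1000 items per page, 46 requests correspond to a total between 45,001 and 46,000 items. Swapping the endpoint only changes what `fetch_page` requests; the loop itself stays the same.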

BlackFlash5 commented 9 months ago

My main reason for world files is tile data. Getting all tiles would take ages through the API, create avoidable load for your service and you can't get just one layer. Currently I only need the airport layer, so running my own tile server to reduce load times/bandwidth for users becomes an interesting topic. When running my own "tile server" I wouldn't need the latest data and I'd be perfectly happy with a daily update.

But since this feature seems to be out of scope for now, might I ask one off-topic question? I couldn't find any information on whether tiles can overlap at country borders when downloading all mbtiles files and merging them into one dataset. Do you have to detect and merge overlapping tiles, or are they sliced in a way so you don't have to worry about that?

reskume commented 9 months ago

The original request in the ticket is about GeoJSON files. When I described the export logic, it was about all text-based files. The mbtiles files are created separately using Tippecanoe. I can have a look into that and check if it can be refactored so that a world mbtiles file is also created. Personally, I would not use the export data, as it is not flexible enough. I would go with downloading the data from the API, then removing or adding the properties required for styling the map, and then creating the mbtiles. The production openAIP map contains several additional geometries internally to be able to style the map as it is. Styling it like this with the original data is not possible, for example.

Regarding the tiles, I suppose that the tiles are ok for the specific country. I have not looked into the possibility to merge mbtiles files. If this is possible, I guess the library has to be very smart to actually stitch the areas together...
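
Whether the per-country files actually share tiles can be checked before attempting a merge. An MBTiles file is an SQLite database whose `tiles` table (or view) has the columns `zoom_level`, `tile_column` and `tile_row`, and tile coordinates refer to one global grid, so a tile that straddles a border may be present in more than one country file. The sketch below only reports such shared coordinates; it does not merge anything, and the function names are illustrative.

```python
import sqlite3
from collections import defaultdict


def tile_keys(path):
    """Return the set of (zoom, column, row) tile coordinates in one file."""
    con = sqlite3.connect(path)
    try:
        query = "SELECT zoom_level, tile_column, tile_row FROM tiles"
        return set(con.execute(query))
    finally:
        con.close()


def overlapping_tiles(paths):
    """Map each tile coordinate found in more than one file to those files."""
    owners = defaultdict(list)
    for path in paths:
        for key in tile_keys(path):
            owners[key].append(path)
    return {key: files for key, files in owners.items() if len(files) > 1}
```

An empty result means the files can be combined by copying rows. A non-empty result lists the tiles whose contents would have to be combined at the vector-tile level.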
