mozilla / ichnaea

Mozilla Ichnaea
http://location.services.mozilla.com
Apache License 2.0

Automate map tiles generation #67

Closed: hannosch closed this issue 10 years ago

hannosch commented 10 years ago

As part of the new map (#33) we create png map tiles. This is currently a manual process based on the steps at https://gist.github.com/hannosch/0b80b01d260cc2d41957 or the map.sh script that currently exists only on the server (the last step in the gist is optional, as we don't use a proper tile server yet, but just let nginx serve a nested directory tree).

This process is not documented, not checked into source control and generally an ad-hoc hack. We should make it better (tm).

Things to investigate:

hannosch commented 10 years ago

For hosting the tiles: one other option is to upload mbtiles files to mapbox (I've only found a manual option so far and asked our mapbox contact about an actual API -> https://www.mapbox.com/help/#upload-mbtiles-file). Prices are at https://www.mapbox.com/plans/ - we have an enterprise plan. One advantage might be serving all three layers (base, blue dots, labels) from one source, and there's apparently some magic on the mapbox side which means they will merge the three layers into a single image file before serving it to the user.

hannosch commented 10 years ago

One more feature request: It would be nice to get the map generation time written into a file at a known URL or written back to a database table, so the website code could pick it up and display a prominent text with the date on the /map page.
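A minimal sketch of the "file at a known URL" variant, assuming the tile render script can call a bit of Python when it finishes. The function name `write_map_timestamp`, the JSON key `generated` and the output path are all hypothetical and not part of ichnaea:

```python
import json
import os
import tempfile
from datetime import datetime, timezone


def write_map_timestamp(path, now=None):
    """Atomically write the map generation time (UTC) as JSON to `path`."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    payload = {"generated": now.strftime("%Y-%m-%dT%H:%M:%SZ")}
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory, then rename it into place,
    # so the web server never serves a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as fp:
        json.dump(payload, fp)
    # mkstemp creates the file as 0600; make it readable for the web server.
    os.chmod(tmp_path, 0o644)
    os.replace(tmp_path, path)
    return payload
```

If the file lands inside the directory tree that nginx already serves, the /map page could fetch it and show the date without any database access.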

mhrivnak commented 10 years ago

Hi, I'm a new user who has two thoughts to contribute to this topic.

First, I think you can get more participation by rebuilding the map more often. This comment is really just encouraging you to follow through on improving your build process. :)

I got lucky that within a few hours of my first "stumbling" drive, the data appeared on the map. Such quick gratification felt very rewarding, as I could literally see that I had made a difference in the small part of the world that I care about most. Reading about the build process however, it seems that a couple of days is a more common time frame for seeing new data hit the live map.

Fresh data encourages me in two ways:

  1. Before going out, I can see that the route I'm going to take is not already covered. I can blaze a new trail! Exciting! The fresher the data, the more confident I can be that I'm adding new data, and thus the more important I will feel.
  2. Seeing the trail I blazed in step 1 go live makes me feel like I accomplished something. People love that kind of thing. For most users, especially newbies, this will be far more motivational than a rank on the leader board.

The suggestion above from @hannosch about adding a build timestamp to the map page is a great idea.

mhrivnak commented 10 years ago

Second, I think you'll benefit a lot from setting up a proper build system like http://jenkins-ci.org/ or http://buildbot.net/.

There's an up-front cost for getting it deployed and set up. But once it's operational, you can configure it to build on any schedule you like (every few hours perhaps?). And you can kick off additional manual builds with the push of a button. Using proper CI could also help you scale the build process to multiple machines if that ever becomes necessary.

Being completely new to this project, I have no idea what's been discussed or what's available resource-wise. Maybe I'm preaching to the choir, but suffice it to say, I think you'll be VERY happy if you take the time to set up a proper build system to handle tile generation.

This is a cool project. Thanks for your work!

cpeterso commented 10 years ago

Thanks for your suggestions, @mhrivnak! We are brainstorming some map features in the MozStumbler app itself to guide people to places that no one has visited yet, so you could see your contributions mapping new territory in real-time. :)

hannosch commented 10 years ago

@mhrivnak thx for your suggestions! We looked at a couple of different options for automation. It looks like we'll go back to the age-old proven cron job for this. How often the map is regenerated is a question of resource usage. On an 8-core machine it currently takes a bit more than an hour to render out all the map tiles (about 7 GB) and that time will increase with more incoming data.

So our current thinking is to render out the full map once a day, the same way the other stats are gathered. Once we have that we can look at smarter ways to do partial updates of the map and schedule those more often.

hannosch commented 10 years ago

I added a cron job to the current server and it regenerates the coverage map daily after midnight UTC - with a runtime of about an hour the map should be updated by 1:30am UTC.

Djfe commented 10 years ago

Nice, thanks a lot, but the text above the coverage map should get updated, too. It still says that the map only gets updated manually. (EDIT: the announcement above the map)

Djfe commented 10 years ago

I think you could save a lot of time and resources by rendering only the tiles of places that got updated recently (in the last 24 hours). Dunno how the data is stored, but if it's a database then it should be pretty simple: just a query for all locations which were uploaded in the last 24h (where date >= timestamp - 24h or something similar). That should speed up the rendering a lot, because it wouldn't need to rerender big parts of Africa (for example).
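To make the idea concrete, here is a rough sketch of how recently updated positions could be mapped to the tiles that need re-rendering. It assumes the standard z/x/y (slippy map, Web Mercator) tile scheme and that some query, not shown here, yields (lat, lon) pairs for the last 24 hours; `tile_for` and `dirty_tiles` are made-up names and not ichnaea code:

```python
import math


def tile_for(lat, lon, zoom):
    """Return the (x, y) slippy-map tile containing a WGS84 point."""
    n = 2 ** zoom
    # Web Mercator is only defined up to about +/-85.0511 degrees latitude.
    lat = max(min(lat, 85.0511), -85.0511)
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that e.g. lon == 180 does not fall off the right edge.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def dirty_tiles(points, zooms):
    """Collect the (zoom, x, y) tiles touched by recently updated points."""
    tiles = set()
    for lat, lon in points:
        for zoom in zooms:
            tiles.add((zoom, *tile_for(lat, lon, zoom)))
    return tiles
```

Whether this actually saves time depends on how the renderer works: if dots are drawn larger than a pixel they can bleed into neighbouring tiles, so the neighbours of each dirty tile might need to be re-rendered as well.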

Djfe commented 10 years ago

@hannosch what do you think about my last post? Will it save time or does it take more time to estimate which map tiles need to be replaced?

hannosch commented 10 years ago

@rtilder has been looking into this, forgot to assign him.

Stuff that's still missing:

hannosch commented 10 years ago

We are going with S3 for map tiles for now.

Talked with Dean and we came up with the following S3 config (in ichnaea.ini, ichnaea section):

assets_url = https://location-assets.services.mozilla.com
s3_assets_bucket = com.mozilla.services.location-assets/

In dev/stage the url / bucket name would include net.mozaws.dev or net.mozaws.stage.

Our map.js would then load the image tiles from https://location-assets.services.mozilla.com/tiles/{z}/{x}/{y}.png and we'd need to adjust the CSP to allow loading images from the assets domain.
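As an illustration only, the URL scheme and the CSP change could look roughly like this on the Python side. The helper names are hypothetical, and the `'self'` source in the `img-src` directive is an assumption about the existing policy rather than something taken from the ichnaea config:

```python
def tile_url(assets_url, z, x, y):
    """Build the URL of one PNG tile below the configured assets_url."""
    return "{}/tiles/{}/{}/{}.png".format(assets_url.rstrip("/"), z, x, y)


def csp_img_src(assets_url):
    """Return an img-src directive that also allows the assets domain.

    Assumes the site otherwise only loads images from its own origin.
    """
    return "img-src 'self' {}".format(assets_url.rstrip("/"))
```

For example, `tile_url("https://location-assets.services.mozilla.com", 3, 4, 2)` gives `https://location-assets.services.mozilla.com/tiles/3/4/2.png`, which matches the `{z}/{x}/{y}` pattern above.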

hannosch commented 10 years ago

Started the implementation of this at hannosch/ichnaea@903d96f7a7fd8ab5fd444057e2eead36e5153dca (the 67-map-tile-generation branch).