onthegomap / planetiler

Flexible tool to build planet-scale vector tilesets from OpenStreetMap data fast
Apache License 2.0
1.46k stars 116 forks source link

[BUG] Download error when regenerating OpenMapTiles #135

Open lokkelvin2 opened 2 years ago

lokkelvin2 commented 2 years ago

Describe the bug On Windows 10, when generating using scripts/regenerate-openmaptiles, an error is thrown during the download of omt yaml files.

Exception in thread "main" java.io.FileNotFoundException: https://raw.githubusercontent.com/openmaptiles/openmaptiles/v3.12.2/layers\water\mapping.yaml

image Pasting the suspect URL into Chrome works fine. Turns out that accessing this URL through Java returns a 404 for this particular .yaml file.

This bug results from the forward-slash and backslash in the URL (probably due to windows file separator).

To fix this, I added url = url.replaceAll("\\\\", "/"); above line 108 of Generate.java .

https://github.com/lokkelvin2/planetiler/blob/0214605799655c28f0a33a6714ebb0b19eaf38da/planetiler-basemap/src/main/java/com/onthegomap/planetiler/basemap/Generate.java#L108

Note that I have not tested the impact of this patch on non-windows machines.

wipfli commented 2 years ago

Thanks for taking the time to submit this bug report. Probably it is best to solve this problem at the source, i.e., where mappingPath is generated. Would you like to submit a pull request with the following change?

         if ("imposm3".equals(datasource.type)) {
-          String mappingPath = Path.of(layerFile).resolveSibling(datasource.mapping_file).normalize().toString();
+          String mappingPath = Path.of(layerFile).resolveSibling(datasource.mapping_file).normalize().toString().replaceAll("\\\\", "/");
           imposm3MappingFiles.add(base + mappingPath);
         } else {
wipfli commented 2 years ago

This works on my linux machine...

lokkelvin2 commented 2 years ago

Before I make submit a PR, I am trying to fix another related bug.

It appears that the requests for .yaml files are getting cached, so the Generate script does not always pull the latest commits from github. This issue had me questioning why certain OMT changes were not being reflected on my planetiler output.

One workaround is to change the --"tag" every generation so that a new url is always used.

wipfli commented 2 years ago

I also had the impression once that the requests get cached, but then later on this was not the case anymore...

msbarry commented 2 years ago

Thanks for reaching out, I agree it's best to fix where @wipfli points out the local disk path gets converted to a URL. The Path is just a list of parts that get joined by the platform-specific path separator (\ on windows) so I think we just want to change that code to explicitly join path parts with /.

Also, it would be good to run the "regenerate" workflow on windows too to catch this kind of issue. - or we could just add the steps it runs to the end of the build job.

msbarry commented 2 years ago

For the cache issue, the connection gets created in this method:

https://github.com/onthegomap/planetiler/blob/cce51668f296745dea264cd1a3f4caea1ccc27b7/planetiler-core/src/main/java/com/onthegomap/planetiler/util/Downloader.java#L100-L107

You can also add connection.setUseCaches(false); to that to tell it not to cache anything. That method gets used to download input files, the geofabrik index, AWS index to get latest planet file, and this generator so I'm not sure if would make sense to bypass the cache for all of those, add a new config parameter, Or if it should be a parameter that the caller sets...?