mathuin / TopoMC

Building Minecraft worlds from USGS NED and NLCD topographical data
http://www.mathuin.org/TopoMC/

Adding Orthoimagery Support #32

Open KermMartian opened 11 years ago

KermMartian commented 11 years ago

I'm looking to patch in support for fetching orthoimagery along with the landcover and elevation data, and to use the combined landcover and orthoimagery data to pick the blocks used for the surface terrain on the final MC map. Do you have any suggestions or comments on whether the USGS services that region.py pulls from offer such data? From looking through their websites, the answer looks like a qualified "yes", but I'm having a great deal of difficulty finding documentation on the API(s) they offer. Thanks in advance; I'd be more than happy to send you a pull request if and when I get this working, assuming it's something you'd be interested in having in the TopoMC trunk.

mathuin commented 11 years ago

This is an interesting idea!

The best way to see what the USGS has to offer is to use http://viewer.nationalmap.gov/example/eros_inventory_proxy_json_v2.html. For example, the Provincetown region has a product with the key "P19" and the title "NAIP (4 Band) UTM Zone 19N". Select the product, and you can choose one of several slices of the Provincetown area to download. Once you've downloaded the file, you will have to mess about with it to figure out exactly how it is formatted. The metadata (available at http://extract.cr.usgs.gov/distmeta/servlet/gov.usgs.edc.MetaBuilder?TYPE=HTML&DATASET=NAIP_Z19_G4 for that type of file) will give you some information, but you'll need to use some GIS software or the GDAL binaries to really understand the file: things like what projection and scale are being used, what the pixel values mean for each band, all that stuff. Once you've done that, you can reproject the file to match the elevation and landcover files and figure out what block you want each pixel to represent.
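For reference, a minimal sketch of that inspection step using the GDAL Python bindings; the filename below is a placeholder for whichever NAIP GeoTIFF you downloaded:

```python
# Minimal sketch: inspect a downloaded NAIP GeoTIFF with the GDAL Python
# bindings to see its projection, pixel size, and band layout.
# "naip_z19.tif" is a placeholder filename.
from osgeo import gdal

ds = gdal.Open("naip_z19.tif")
print("Projection:", ds.GetProjection())           # WKT of the source SRS (likely a UTM zone)
gt = ds.GetGeoTransform()
print("Origin:", (gt[0], gt[3]), "Pixel size:", (gt[1], gt[5]))
print("Bands:", ds.RasterCount)                    # NAIP 4-band: R, G, B, NIR
for i in range(1, ds.RasterCount + 1):
    print("Band", i, "min/max:", ds.GetRasterBand(i).ComputeRasterMinMax())
ds = None                                           # close the dataset
```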

After you've got that figured out, and the map turns out the way you want it to, you'll need to sample other parts of the country to determine all the product keys that return your imagery of choice (I imagine there's one per UTM zone, whatever that means), and start hacking region.py. Start by adding a new key-value pair to productIDs in region to support imagery -- the list of product keys you just generated goes here. If there are different types of imagery based on scale, you'll want to write some code to handle that -- use the elevation stuff as a template. Add another line to the end of the init routine like the ones for lclayer and ellayer, and go through the code adding your third layer to all the stuff that calls those two. Translate the pixels if necessary to a desired format, and add another layer to the existing Map.tif (there are currently four -- elevation, landcover, crust, and bathy). Then you'll have to modify terrain.py and friends to place the proper block on the surface to match the pixel of imagery.
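In outline, the region.py change looks something like the following. This is a hypothetical sketch only, not the actual code -- the attribute names and the exact shape of productIDs may differ:

```python
# Hypothetical sketch only: register the imagery product keys alongside the
# existing elevation and landcover entries, then carry a third layer through
# the same code paths that already handle lclayer and ellayer.
productIDs = {
    'elevation': ['...'],          # existing keys, elided here
    'landcover': ['...'],          # existing keys, elided here
    'imagery':   ['P19', '...'],   # one NAIP product key per UTM zone; P19 covers Zone 19N
}

# ...and in the init routine, alongside lclayer and ellayer, something like:
# self.imlayer = ...   # hypothetical third layer, threaded through the rest of the code
```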

It'll be a lot of work, but if you do it right, it'll make a big difference. Let me know if you need more help!

Jack.

KermMartian commented 11 years ago

Thanks for the very thorough feedback! I have gotten started trying to understand the data that EROS provides. I understand now that the TIFFs provide (R, G, B, IR), which should be interesting for things like figuring out where to place trees, but one thing I'm stuck on is figuring out how you got a product key of P19. I see various Zone IDs ranging from Zone 12 to Zone 19; are those the product keys? Or is there somewhere else I should be looking?
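As an aside on that fourth band: one common way to exploit it is a simple NDVI computed from the red and near-infrared bands, which flags likely vegetation (e.g. candidate tree locations). A rough sketch, assuming the (R, G, B, IR) band order above; the filename and threshold are placeholders:

```python
# Rough sketch: compute NDVI from the red and near-infrared bands to flag
# likely vegetation.  "naip_tile.tif" and the 0.3 threshold are placeholders.
import numpy as np
from osgeo import gdal

ds = gdal.Open("naip_tile.tif")
red = ds.GetRasterBand(1).ReadAsArray().astype(float)    # band 1 = R
nir = ds.GetRasterBand(4).ReadAsArray().astype(float)    # band 4 = IR

ndvi = (nir - red) / np.maximum(nir + red, 1.0)          # avoid divide-by-zero
likely_vegetation = ndvi > 0.3                           # boolean mask per pixel
print("vegetated pixels:", int(likely_vegetation.sum()))
```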

mathuin commented 11 years ago

The product key came from hacking region.py actually -- I knew I forgot something! Add 'SEAMTITLE' to the desiredAttributes value in region.py and then run ./GetRegion.py on the region of your choice with --debug enabled. You will get a ton of XML back and forth between you and the USGS, and one of those big XML blobs will contain the title and product key for every product offered by the USGS for that region. P19 is the product key for the product mentioned above.

Sorry about that! For what it's worth, the reason I had such a detailed response is that I'm looking into adding the 'terrestrial ecosystems' layer to properly set the biomes, and the process is the same. I'm not yet sure if that's better or worse than the landcover values, but I'm experimenting. :-)

Jack.

mathuin commented 11 years ago

http://en.wikipedia.org/wiki/Universal_Transverse_Mercator_coordinate_system

Fun fact: there are ten UTM zones across the continental US. No idea about Puerto Rico, Hawaii, or Alaska. The GDAL transformations should be pretty straightforward from the UTM zone projection to the one used by TopoMC.

KermMartian commented 11 years ago

Thanks for your continued help. I got as far as pulling the orthoimagery, joining and warping it, and applying that data onto the terrain. The colors/materials that appear on the terrain match the orthoimagery TIF, which is great, but unfortunately that imagery is missing several large areas (and also seems to be scaled incorrectly). For the scaling, I may have made coding errors; for the missing areas, is it possible that some data is available in some sub-regions and not in others, and needs to be overlapped with other products? I notice that if I decrease the tile size and the scaling, the process fails entirely, as it seems none of my imagery (P10-P18, OF9, and ONF) is available over the entire shrunken region. Thanks in advance! I'd be happy to provide code if it helps.

mathuin commented 11 years ago

Can you attach some images? Specifically, the TIF and the test terrain would help.

If you hack region.py and add SEAMTITLE, any product you find that way will cover the entire region -- not always in one file of course -- but it may need tweaking with projections and such.

Also, have you looked at the TIF and the elevation/landcover files with something like qgis?

KermMartian commented 11 years ago

I have looked at all the files with MapWindow GIS on Windows, and I'm starting to understand how all the data (and metadata) fits together. I found the reason that the data appeared to be missing (P10-P19 have STATUS=Seamless, not STATUS=Tiled), so I must modify my question slightly: How can I use Seamless-type products, if you know? With the current code, the file list request for P18 (the elided coordinates are entirely within P18):

<REQUEST_SERVICE_INPUT><AOI_GEOMETRY><EXTENT><TOP>xxxx</TOP><BOTTOM>yyyy</BOTTOM><LEFT>zzzz</LEFT><RIGHT>aaaa</RIGHT></EXTENT><SPATIALREFERENCE_WKID/></AOI_GEOMETRY><LAYER_INFORMATION><LAYER_IDS>P18</LAYER_IDS></LAYER_INFORMATION><CHUNK_SIZE>250</CHUNK_SIZE><JSON></JSON></REQUEST_SERVICE_INPUT>

Returns an error:

<?xml version="1.0" encoding="UTF-8" standalone="no"?><REQUEST_SERVICE_RESPONSE><ORIGINATOR>RVS</ORIGINATOR><ERROR><FIELD_NAME>Ipc</FIELD_NAME><ERROR_MSG>Invalid product code: </ERROR_MSG></ERROR></REQUEST_SERVICE_RESPONSE>

TIA!

mathuin commented 11 years ago

Seamless content is a little different than tiled content. Before last December, TopoMC used seamless content, so you can look at the history of region.py to see what I was doing there. I had to change because elevation and landcover were no longer available as seamless content. That will make your work a little more frustrating, I fear.

How exactly did you retrieve the TIF? Any chance you can attach it to the issue, or give me a URL so I can get it?

KermMartian commented 11 years ago

The TIF I had was from layer OF9, which is shady. P10-P19 are much better, and thanks to pulling in a great deal of download code from your pre-March region.py, I was able to get seamless data downloaded and cached. The code is ugly and needs to be cleaned up, but it works. Here's a 50%-scaled terrain using just R/G/B data, without bothering with landcover types or the IR channel (screenshot attached: 2013-06-28_20 24 45).

mathuin commented 11 years ago

That is really, really neat! I wonder how it'll look if you add in elevation?

Jack.

KermMartian commented 11 years ago

Actually, that's with all the layers including elevation already there. Here's another pair of renderings (from DynMap this time): http://www.cemetech.net/forum/viewtopic.php?p=206390#206390 . The water in the reservoir got replaced accidentally, but that's a very easy fix. My next challenge is that when I try to fetch the orthoimagery for the entire island of Manhattan, the USGS API tells me that my AOI is too large by roughly a factor of two. Do you know of a cleaner/easier way to resolve this than splitting the rectangular region in two (or N) pieces and fetching two (or N) TIFs using the USGS service and then passing them all to gdalwarp? TIA.

mathuin commented 11 years ago

No, that's pretty much how I'd do it. There's code for the gdalwarp merging in region.py -- it might be in the seamless stuff but it's definitely in the tiled stuff.
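For reference, a rough sketch of that split-then-merge flow; the split_extent helper and the coordinates are illustrative only, not the region.py code:

```python
# Rough sketch: split the bounding box into N strips so each request stays
# under the AOI size limit, then mosaic the downloaded pieces with gdalwarp.
import subprocess

def split_extent(left, bottom, right, top, n):
    """Split an extent into n equal-width vertical strips."""
    step = (right - left) / n
    return [(left + i * step, bottom, left + (i + 1) * step, top)
            for i in range(n)]

# Illustrative Manhattan-ish extent, split in two; each strip becomes one
# request to the USGS service and one downloaded TIF.
for extent in split_extent(-74.03, 40.68, -73.90, 40.88, 2):
    print("request this sub-AOI:", extent)

# gdalwarp mosaics (and reprojects, if asked) all inputs into one output.
subprocess.check_call(["gdalwarp", "ortho_0.tif", "ortho_1.tif", "ortho_merged.tif"])
```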

Have you considered only using the orthographic data when it corresponds to built-up areas (21-24) and letting landcover do the rest? That would let you get water and grass and beach while focusing the "interesting" coloration on the hardest-to-simulate-by-hand terrain...

Oh, and the dynmap stuff is awesome. Very nice work!

KermMartian commented 11 years ago

What I currently do is apply a different "palette" of blocks to landcover types 22 through 25 than to all the others, so that things like roads and paths in wooded areas (like parks) will still show up properly. I still have a bit of tweaking to do, and I might tweak it further to only include grass, sand, gravel, and a few other entities, in which case I would use the OpenStreetMap API to try to pull in and render streets and paths onto wooded/rural areas. For trees and buildings, I'm hoping to get permission from the Google Earth team to use their building and tree layers, but that's going to be difficult. I'm currently testing my building converter on single models, as you might have seen from that topic, for example with the Chrysler Building. My newest issue, from my work placing models onto the generated terrain today, is that the terrain seems to be rotated from "true" north by slightly under 15 degrees clockwise. Any insight into what might be causing that?
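A tiny sketch of that per-class palette idea, combined with the suggestion above of restricting the orthoimagery to the developed landcover classes -- the class set, palette, and block names here are illustrative, not the actual mapping:

```python
# Illustrative sketch: columns in the developed landcover classes pick a
# block by nearest orthoimagery color, everything else keeps its normal
# landcover block.  The class set, palette, and block names are placeholders.
DEVELOPED = {21, 22, 23, 24}            # NLCD developed classes

PALETTE = {                              # (R, G, B) -> block name
    (125, 125, 125): 'stone',
    (110, 110, 110): 'gravel',
    (60, 60, 60):    'black wool',       # dark pavement
    (220, 210, 160): 'sandstone',
}

def nearest_color(rgb):
    return min(PALETTE, key=lambda c: sum((a - b) ** 2 for a, b in zip(c, rgb)))

def surface_block(lc_class, rgb, landcover_block):
    """Pick the surface block for one column."""
    if lc_class in DEVELOPED:
        return PALETTE[nearest_color(rgb)]
    return landcover_block

print(surface_block(23, (118, 115, 112), 'grass'))   # -> 'gravel'
```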

mathuin commented 11 years ago

http://www.charlespetzold.com/etc/avenuesofmanhattan/index.html

I would have expected you to see a rotation of about thirty degrees, based on that post. I wonder if there's some error due to the projection -- it's centered on Kansas, I think. Interesting question.

KermMartian commented 11 years ago

Based on reading up on the North American Datum, it seems to me that all points within the region I'm converting would be offset by roughly an equal amount if the issue were (for example) a NAD27/NAD83 mismatch. I see a rotation instead, so I suspect that something within my process of creating or warping the original input files into the VRT in the GetRegion step, or warping the data in the PrepRegion step, is causing the error. I note that, to the extent the landcover array is correct, the bathy, orthoimagery, and elevation information all match up in the reservoir in the middle of the park in my sample data, so whatever error is occurring applies to all of the layers. I note the rotations indicated below (attached image: rotation).

mathuin commented 11 years ago

Interesting! I spent some time last night trying to find a good test region, and the best I could come up with was Washington DC -- it was planned with lots of north-south and east-west lines, so I figured it would help me determine whether the map was skewed funny -- and it is. Argh.

Any chance you can do your Collada thing with a building from Washington DC and plant it on a region there? Another nice thing about that region: you can make regions that are 1:1 scale both vertical and horizontal.

Jack.

mathuin commented 11 years ago

The skew on the Washington DC map I made was approximately 11.2 degrees counterclockwise, using what I think is North Capitol Street as a reference. Interesting -- it differs in magnitude and is in the opposite direction from your observed skew for both Collada and the existing code.

KermMartian commented 11 years ago

I misspoke in my comment: I actually observe a rotation of about 15.2 degrees counter-clockwise. It's the same direction as yours, but larger in magnitude. Yes, I'll definitely try the Collada conversion this evening (I also have more-or-less up-to-date code here on GitHub if you want to play with it, but it requires my fork of your TopoMC code, which is currently in a broken state while I break out the buildmap() function into parallelizable tiles).

Edit: I should clarify that I'm considering the Collada as ground truth here, since it matches the angle of NYC streets to due North documented in places like that webpage you linked.

mathuin commented 11 years ago

I am very interested in seeing your fork once it's not broken! :-)

KermMartian commented 11 years ago

Hmm, my comment seems to have not gone through on Friday. I saw an error of about 13.5 degrees counterclockwise for Union Station in Washington, DC, so the problem is definitely replicable in other cities. On a hunch, I just set the t_srs to UTM Zone 18 instead of the Albers projection (i.e., t_srs = "+proj=utm +datum=WGS84 +zone=18 +units=m" for NYC), and the streets are now at their proper 28.9-ish degree angle to the vertical. If you think that's a change worth pursuing, I'll have to modify all the instances of AEA/Albers usage to UTM. I deduced that the target projection might be the problem when I noticed that the pre-VRT images were rotated correctly, while the post-VRT images were rotated incorrectly.
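For anyone following along, the warp with that target projection looks roughly like this; the file names are placeholders and the proj4 string is the one quoted above:

```python
# Sketch: warp a source raster into UTM Zone 18 instead of the Albers target.
import subprocess

t_srs = "+proj=utm +datum=WGS84 +zone=18 +units=m"   # NYC falls in UTM zone 18
subprocess.check_call([
    "gdalwarp", "-t_srs", t_srs,
    "input_albers.tif",     # placeholder source (e.g. the pre-VRT imagery)
    "output_utm18.tif",     # placeholder output
])
```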

In other news, my fork of your repo is now relatively stable; I'm using it to try to do a 1:1 (vertical and horizontal) model of the terrain of Manhattan. I may need to investigate parallelizing the final world stitching in some way, as that is taking 6 hours and counting.

mathuin commented 11 years ago

That's great news about the fork! I look forward to giving it a try.

When you say you changed the t_srs to UTM Zone 18, was that for the ortho or for the elevation/landcover/etc stuff? I imagine you will find that the UTM Zone 18 is more faithful to local conditions and has less distortion than a projection based on the center of the country as a whole, but I use that projection a) because it's the default for landcover and b) because eventually someone's going to make multiserver Minecraft work and I would very much like to have contiguous worlds with the same settings "just work". That being said, I have been toying with different projections for different continents and it might be interesting to see how the UTM choices work out.

As for parallelizing the final world stitching, I had several ideas on that but didn't get a chance to try any of them. The first idea was to set up a second pool for stitching and have a callback method on the first pool queue up the just-finished world to the second pool. I'm not sure how much that will slow down the world generation process, and I don't think pymclevel is thread-safe so the second pool would be limited to a single thread. The second idea was to replace the intermediate world generation with the creation of numpy arrays which would be merged before the final world is built. This is actually how I used to do things before Anvil, but the serialization code was not thread-safe (see a trend here?) so I wasn't able to dump them to file simultaneously. Feel free to take either of these ideas or both of them or (most likely) one or two of your own and give them a try!
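A rough sketch of the first idea, assuming the per-tile build and the stitch step are plain functions (build_tile and stitch_tile are hypothetical stand-ins for the real work); the stitch pool gets a single worker since pymclevel is not believed to be thread-safe:

```python
# Rough sketch of the two-pool idea: the build pool generates tiles, and a
# callback hands each finished tile to a one-worker stitch pool.
# build_tile and stitch_tile are hypothetical stand-ins for the real work.
import multiprocessing

def build_tile(tile_id):
    # ... generate the intermediate world for this tile ...
    return tile_id

def stitch_tile(tile_id):
    # ... merge the finished tile into the final world (pymclevel would go here) ...
    pass

if __name__ == '__main__':
    stitch_pool = multiprocessing.Pool(processes=1)   # single worker: not thread-safe
    build_pool = multiprocessing.Pool()

    def hand_off(tile_id):
        # runs in the main process as each build finishes
        stitch_pool.apply_async(stitch_tile, (tile_id,))

    for tile_id in range(16):                         # placeholder tile count
        build_pool.apply_async(build_tile, (tile_id,), callback=hand_off)

    build_pool.close()
    build_pool.join()
    stitch_pool.close()
    stitch_pool.join()
```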

KermMartian commented 11 years ago

I have the orthoimagery matched precisely to the landcover/elevation, so all three layers were rotated incorrectly. I therefore applied the UTM "fix" to all three of them, and it looks like it did the trick on a small swatch of terrain. I'm doing a much larger-scale test now to see if it continues to work. I'm happy sticking with AEA if it can be adjusted to not rotate the terrain, but it seems like the culprit here. If I decide to keep UTM in my fork, I'll have to make it pick an appropriate UTM zone based on the Region extents.
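Picking the zone from the extents is just the standard 6-degree rule; a small sketch (the extent values below are illustrative):

```python
# Small sketch: derive the UTM zone from the region's center longitude
# using the standard 6-degree-wide zones starting at 180 degrees west.
def utm_zone(lon):
    """Return the UTM zone number (1-60) for a longitude in degrees."""
    return int((lon + 180) // 6) + 1

center_lon = (-74.03 + -73.90) / 2.0                  # illustrative extent (NYC-ish)
t_srs = "+proj=utm +datum=WGS84 +zone=%d +units=m" % utm_zone(center_lon)
print(t_srs)                                          # zone 18 for New York City
```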

For the world-stitching, I was thinking of threading it in such a way that each thread would dump mca region files directly to the output directory, which would sidestep the thread-safety issues. Unfortunately, that circumvents some of the pymclevel abstractions, which may not be the best idea.

KermMartian commented 10 years ago

Jack, it's been a while; I just wanted to let you know that I'm still interested in pursuing this project, and am just stuck on getting building models from Google, Bing Maps, or Here.com. I recently wrote an article about my part of the project, and of course mentioned that I'm building on your work: http://technophilicmag.com/2014/01/29/building-virtual-city-real-world/

mathuin commented 10 years ago

Your work looks fantastic! I read through the article and I think you're headed in the right direction. I do not know whether you can benefit from the OpenCL work that I've done to accelerate TopoMC -- it may or may not be helpful for you when scaling buildings to fit -- but you are welcome to give it a try. Sweet project. I look forward to seeing what happens next!