allenai / satlas-super-resolution

Apache License 2.0
Meaning behind the Training data's folder structure #22

Closed VaasuDevanS closed 4 months ago

VaasuDevanS commented 4 months ago

Thanks very much to the authors for making this work publicly available. I have a few questions regarding the folder structure.

Currently the folder structure for an image looks something like this:

├───naip
│   └───m_2608053_sw_17_060_20191203
│       └───36226_55652
│               rgb.png
│
└───sentinel2
    └───36226_55652
            b01.png
            b05.png
            b06.png
            b07.png
            b08.png
            b09.png
            b10.png
            b11.png
            b12.png
            tci.png

I would like to know the meaning behind the naip folder name. I think it's safe to assume sw stands for south-west and 20191203 is the date of the NAIP image (Dec 03, 2019). What about m_2608053, 17_060 and 36226_55652?

I am trying to locate the original NAIP image geographically for further analysis.

piperwolters commented 4 months ago

Hi, thanks for your interest in this project!

The 36226_55652 part of the filepath is the Web Mercator tile at zoom 17. To convert that to geographic coordinates (longitude, latitude), you could use a function like this:

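import math

def mercator_to_geo(p, zoom=17, pixels=1):
    # p is (x, y) on the Web Mercator grid: a tile index when pixels=1,
    # or a pixel coordinate when pixels is the number of pixels per tile.
    n = 2**zoom
    x = p[0] / pixels
    y = p[1] / pixels
    # Longitude is linear in x; latitude inverts the Mercator projection.
    x = x * 360.0 / n - 180
    y = math.atan(math.sinh(math.pi * (1 - 2.0 * y / n)))
    y = y * 180 / math.pi
    # Returns (longitude, latitude) in degrees; for a tile index this is
    # the tile's north-west corner.
    return (x, y)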
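As a usage sketch, the same formula can turn a tile folder name into a bounding box by converting the tile's north-west corner and the north-west corner of the tile diagonally below it. The helper `tile_bounds` is a hypothetical name, and the sketch assumes the folder name is ordered column_row (x first), which the thread implies but does not state.

```python
import math

def mercator_to_geo(p, zoom=17, pixels=1):
    # Same conversion as above: (x, y) on the tile grid -> (lon, lat) in degrees.
    n = 2 ** zoom
    x = p[0] / pixels
    y = p[1] / pixels
    lon = x * 360.0 / n - 180
    lat = math.degrees(math.atan(math.sinh(math.pi * (1 - 2.0 * y / n))))
    return (lon, lat)

def tile_bounds(tile_name, zoom=17):
    # Hypothetical helper: assumes the folder name is "<column>_<row>".
    col, row = (int(v) for v in tile_name.split("_"))
    west, north = mercator_to_geo((col, row), zoom)          # north-west corner
    east, south = mercator_to_geo((col + 1, row + 1), zoom)  # south-east corner
    return (west, south, east, north)

west, south, east, north = tile_bounds("36226_55652")
print(west, south, east, north)
```

For 36226_55652 the western edge comes out near 80.5 degrees west and the northern edge near 26.2 degrees north.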

VaasuDevanS commented 4 months ago

Thanks very much @piperwolters. Could you also please explain what m_2608053 and 17_060 stand for?

piperwolters commented 4 months ago

Of course. You are correct about the date and the quarter-quadrangle indicator (e.g., sw).

m: This stands for "mosaic," indicating that the image is a part of a mosaic of tiles stitched together to cover a larger area.

2608053: This identifies the USGS quadrangle the image belongs to. The first two digits (26) and the next three (080) are the latitude and longitude, in degrees, of the south-east corner of the one-degree block containing the image. The last two digits (53) are the number of the 7.5-minute quadrangle within that block (01 to 64). The sw that follows selects the south-west quarter of that quadrangle.

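17: This number is the UTM zone in which the image is projected (zone 17 spans 84°W to 78°W). It is not a zoom level, although the tiles in this dataset, such as 36226_55652, do happen to sit on a Web Mercator grid at zoom level 17 (2^17 x 2^17 tiles).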

060: This part of the filename is the ground resolution of the image, here 0.60 m per pixel. Older NAIP files at 1 m resolution carry a 1 in this position.
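To locate the source image from its name alone, the fields can be split out programmatically. The sketch below assumes the usual NAIP quarter-quadrangle naming convention (latitude and longitude of the one-degree block, quadrangle number, quarter, UTM zone, resolution, acquisition date). It also assumes that quadrangles are numbered 1 to 64 row by row starting from the block's north-west corner, and that the longitude is west of Greenwich. `parse_naip_name` and `quarter_quad_bounds` are hypothetical names, not part of this repository.

```python
from datetime import datetime

def parse_naip_name(name):
    # Split e.g. "m_2608053_sw_17_060_20191203" into its six fields.
    prefix, quad_id, quarter, utm_zone, res, acquired = name.split("_")[:6]
    return {
        "prefix": prefix,
        "lat": int(quad_id[0:2]),    # degrees north, south edge of the 1-degree block
        "lon": -int(quad_id[2:5]),   # assumed western hemisphere, east edge of the block
        "quad": int(quad_id[5:7]),   # 7.5-minute quadrangle number, 1..64
        "quarter": quarter,          # ne, nw, se or sw
        "utm_zone": int(utm_zone),
        # Assumption: a three-digit field is centimetres ("060" -> 0.6 m).
        "resolution_m": int(res) / 100 if len(res) == 3 and res.isdigit() else None,
        "date": datetime.strptime(acquired, "%Y%m%d").date(),
    }

def quarter_quad_bounds(info):
    # Assumption: quads are numbered row by row from the block's north-west corner.
    row, col = divmod(info["quad"] - 1, 8)
    north = info["lat"] + 1 - row * 0.125   # each quad is 7.5' = 0.125 degrees
    west = info["lon"] - 1 + col * 0.125
    south, east = north - 0.125, west + 0.125
    mid_lat, mid_lon = (north + south) / 2, (west + east) / 2
    if info["quarter"][0] == "n":
        south = mid_lat
    else:
        north = mid_lat
    if info["quarter"][1] == "w":
        east = mid_lon
    else:
        west = mid_lon
    return (west, south, east, north)

info = parse_naip_name("m_2608053_sw_17_060_20191203")
print(info)
print(quarter_quad_bounds(info))
```

Under these assumptions the nominal footprint of m_2608053_sw is 80.5°W to 80.4375°W and 26.125°N to 26.1875°N, which agrees with the location obtained from the 36226_55652 tile coordinates. Actual NAIP images extend slightly beyond the nominal quarter-quadrangle, so a tile can straddle the edge.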

VaasuDevanS commented 4 months ago

Thanks very much @piperwolters for the detailed clarification.