kylebarron / usgs-topo-tiler

Python package to read Web Mercator map tiles from USGS Historical Topographic Maps
https://kylebarron.dev/usgs-topo-mosaic
MIT License
77 stars 3 forks source link

Support arbitrary scales/grids, not just quads #8

Closed kylebarron closed 4 years ago

kylebarron commented 4 years ago

Crosstab of grid size vs scale:

pd.crosstab(gdf['Scale'], gdf['Grid Size'], margins=True).to_markdown()
Scale 1 X 1 Degree 1 X 2 Degree 1 X 3 Degree 1 X 4 Degree 15 X 15 Minute 2 X 1 Degree 3.75 X 3.75 Minute 30 X 30 Minute 30 X 60 Minute 7.5 X 15 Minute 7.5 X 7.5 Minute None All
10000 0 0 0 0 0 0 76 0 0 0 1 0 77
12000 0 0 0 0 0 0 1 0 0 0 0 0 1
20000 0 0 0 0 3 0 0 0 0 0 295 0 298
21120 0 0 0 0 0 0 0 0 0 0 19 0 19
24000 0 0 0 0 4 0 0 0 0 0 131183 0 131187
25000 0 0 0 0 0 0 0 0 0 389 1157 2 1548
30000 0 0 0 0 0 0 0 0 0 0 276 0 276
31680 0 0 0 0 5 0 0 0 0 0 4273 0 4278
48000 0 0 0 0 1034 0 0 0 0 7 90 0 1131
50000 0 0 0 0 94 0 0 0 0 0 0 0 94
62500 0 0 0 0 27997 0 0 0 0 0 2 0 27999
63360 0 0 0 0 7082 0 0 0 0 0 0 0 7082
96000 0 0 0 0 1 0 0 42 0 0 0 0 43
100000 0 0 0 0 0 0 0 8 2963 0 0 0 2971
125000 0 0 0 0 0 0 0 4657 1 0 0 0 4658
192000 0 0 0 0 0 0 0 2 0 0 0 0 2
250000 495 3021 945 20 0 34 0 10 0 0 0 0 4525
All 495 3021 945 20 36220 34 77 4719 2964 396 137296 2 186189
kylebarron commented 4 years ago

Grid size - Scale combinations:

[['7.5 X 7.5 Minute', 10000],
 ['7.5 X 7.5 Minute', 20000],
 ['7.5 X 7.5 Minute', 21120],
 ['7.5 X 7.5 Minute', 24000],
 ['7.5 X 7.5 Minute', 25000],
 ['7.5 X 7.5 Minute', 30000],
 ['7.5 X 7.5 Minute', 31680],
 ['7.5 X 7.5 Minute', 48000],
 ['7.5 X 7.5 Minute', 62500],
 ['3.75 X 3.75 Minute', 10000],
 ['3.75 X 3.75 Minute', 12000],
 ['15 X 15 Minute', 20000],
 ['15 X 15 Minute', 24000],
 ['15 X 15 Minute', 31680],
 ['15 X 15 Minute', 48000],
 ['15 X 15 Minute', 50000],
 ['15 X 15 Minute', 62500],
 ['15 X 15 Minute', 63360],
 ['15 X 15 Minute', 96000],
 ['7.5 X 15 Minute', 25000],
 ['7.5 X 15 Minute', 48000],
 ['None', 25000],
 ['30 X 30 Minute', 96000],
 ['30 X 30 Minute', 100000],
 ['30 X 30 Minute', 125000],
 ['30 X 30 Minute', 192000],
 ['30 X 30 Minute', 250000],
 ['30 X 60 Minute', 100000],
 ['30 X 60 Minute', 125000],
 ['1 X 2 Degree', 250000],
 ['1 X 3 Degree', 250000],
 ['1 X 1 Degree', 250000],
 ['1 X 4 Degree', 250000],
 ['2 X 1 Degree', 250000]]

Reproduction code:

    df = pd.read_csv(path)

    # Keep only historical maps
    # Newer maps are only in GeoPDF, and not in GeoTIFF, let alone COG
    df = df[df['Series'] == 'HTMC']

    df['geometry'] = df.apply(construct_geometry, axis=1)
    gdf = gpd.GeoDataFrame(df)

# or just use df here
grid_sizes = gdf['Grid Size'].unique()
grid_size_scales_pairs = []
for grid_size in grid_sizes:
    grid_size_df = gdf[gdf['Grid Size'] == grid_size]
    scales = grid_size_df['Scale'].unique()
    for scale in scales:
        grid_size_scales_pairs.append([grid_size, scale])
kylebarron commented 4 years ago

Code to visualize each combination:

cols = ['Map Name',
 'Scale',
 'geometry',
 'W Long',
 'S Lat',
 'E Long',
 'N Lat',
 'Download Product S3']
for grid_size, scale in grid_size_scales_pairs:
    if grid_size == 'None':
        continue

    gdf_part = gdf.loc[(gdf['Grid Size'] == grid_size) & (gdf['Scale'] == scale)]
    Visualize(gdf_part[cols])
kylebarron commented 4 years ago

To generate automated test cases:

test_cases = []
n = 3
for grid_size, scale in grid_size_scales_pairs:
    if grid_size == 'None':
        continue

    gdf_part = gdf.loc[(gdf['Grid Size'] == grid_size) & (gdf['Scale'] == scale)]

    # Choose random 3
    choose_n = min(len(gdf_part), n)

    sample = gdf_part.sample(choose_n)

    urls = sample['Download Product S3']
    urls = urls.str.replace('PDF', 'GeoTIFF')
    urls = urls.str.replace('.pdf', '.tif')

    # Remove scale folder; not used with geotiff
    urls = urls.str.replace(f'{scale}/', '', n=1)

    # Df of bounds
    bounds_df = sample.geometry.bounds.values

    for url, bounds in zip(urls, bounds_df):
        test_cases.append([url, list(bounds), scale, grid_size])
kylebarron commented 4 years ago

I don't think there will ever be a non-hacky way to discover the image's true map bounds on the fly. For example here in northern Alaska, the right border is so wide that it takes up the space of 1 degree of longitude (which in northern Alaska is relatively small).

image

So I think the best way forward is to "hack" the asset string to be encoded JSON, which then has "url" and "mapBounds" keys internally, so no parsing will be needed.