microsoft / GlobalMLBuildingFootprints

Worldwide building footprints derived from satellite imagery

Issue with Preserving 'height' Attribute When Downloading Data Using Example Notebook Script #92

Closed wanhanlong1130 closed 7 months ago

wanhanlong1130 commented 8 months ago

Hi,

1. Q1: When I use the provided example notebook script to download data, the 'height' attribute is not preserved in the output. The script replaces the 'properties' of each GeoJSON feature, which originally hold 'height' and 'confidence', with a single 'id' value, so both attributes are missing from the output file.

The problematic section is found in Step 3, lines 129 to 134:

for fn in tmp_fns:
    with fiona.open(fn, "r") as f:
        for row in tqdm(f):
            row = dict(row)
            shape = shapely.geometry.shape(row["geometry"])

            if aoi_shape.contains(shape):
                if "id" in row:
                    del row["id"]
                row["properties"] = {"id": idx}
                idx += 1
                combined_rows.append(row)

This code block overwrites the original 'properties' values with a new 'id' value, omitting the 'height' and 'confidence' information.
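A smaller change to the same loop would be to merge the new 'id' into the existing properties instead of replacing them. The sketch below illustrates this on a plain dictionary shaped like a GeoJSON feature; `tag_feature` is a hypothetical helper, not something from the notebook, and it assumes each row's 'properties' behaves like a mapping.

```python
# Hypothetical helper for illustration; not part of the example notebook.
def tag_feature(row, idx):
    """Return a copy of a GeoJSON-like feature with 'id' added to its properties."""
    row = dict(row)
    row.pop("id", None)  # drop the top-level feature id, as the notebook does
    # Merge instead of overwrite, so 'height' and 'confidence' are kept.
    row["properties"] = {**dict(row.get("properties") or {}), "id": idx}
    return row


# Made-up feature with the same property names as the dataset.
feature = {
    "type": "Feature",
    "id": "0",
    "properties": {"height": 6.27, "confidence": -1.0},
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]],
    },
}

tagged = tag_feature(feature, 0)
print(tagged["properties"])
```

In the loop itself, this would amount to changing only the line that assigns `row["properties"]`.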

As an alternative approach, I have utilized a GeoDataFrame to preserve the 'height' attribute, as shown below:

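import geopandas as gpd
import pandas as pd

gdfs = []
for fn in tmp_fns:
    gdf = gpd.read_file(fn)
    # Keep footprints inside the AOI; the other attribute columns are untouched
    gdf = gdf[gdf.geometry.within(aoi_shape)]
    gdf['id'] = range(idx, idx + len(gdf))
    idx += len(gdf)
    gdfs.append(gdf)
combined_gdf = pd.concat(gdfs, ignore_index=True)  # concatenate once, after the loop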

This modification allows for the inclusion of the 'height' attribute in the output GeoJSON file.

2. Q2: After applying the fix above and extracting the 'height' attribute from 'properties', I noticed that every footprint from the example coordinates (presumably located in Seattle, WA) has a 'confidence' value of -1.0. Could you clarify why this is the case and how it might be resolved? The first rows of the output are:

id type properties geometry
0 Feature {'height': 6.270849704742432, 'confidence': -1.0} POLYGON ((-122.06576 47.62193, -122.06579 47.6...
1 Feature {'height': 10.315505981445312, 'confidence': -1.0} POLYGON ((-122.06649 47.66738, -122.06628 47.6...
2 Feature {'height': 4.21063232421875, 'confidence': -1.0} POLYGON ((-122.06902 47.65408, -122.06920 47.6...
3 Feature {'height': 6.376638889312744, 'confidence': -1.0} POLYGON ((-122.06992 47.64544, -122.06993 47.6...
4 Feature {'height': 4.114572525024414, 'confidence': -1.0} POLYGON ((-122.07062 47.62562, -122.07050 47.6...

Thank you for any help or insights you can provide.

andwoi commented 7 months ago

@wanhanlong1130 thanks for noting the issue with the sample notebook. Can you submit a PR with your update? Regarding confidence, -1.0 is a placeholder value: we will not backfill confidence for all existing predictions, but we are including it for any new updates as of the December 2023 update.
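Since -1.0 marks a missing confidence rather than a real score, downstream code may want to treat it as "no value". A minimal sketch, assuming the properties are available as plain dictionaries; `confidence_or_none` and `PLACEHOLDER_CONFIDENCE` are hypothetical names, and the 0.93 entry is a made-up value for illustration only:

```python
import math

# Placeholder value described in the reply above.
PLACEHOLDER_CONFIDENCE = -1.0


def confidence_or_none(properties):
    """Return the confidence score, or None when it is absent or the placeholder."""
    value = properties.get("confidence")
    if value is None or math.isclose(value, PLACEHOLDER_CONFIDENCE):
        return None
    return value


rows = [
    {"height": 6.270849704742432, "confidence": -1.0},
    {"height": 10.315505981445312, "confidence": -1.0},
    {"height": 5.0, "confidence": 0.93},  # made-up row, not from the dataset
]

scores = [confidence_or_none(r) for r in rows]
print(scores)
```

Filtering or averaging on confidence without this step would count the placeholder as a genuine, very low score.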

wanhanlong1130 commented 7 months ago

@andwoi Thanks for your reply! I just created a pull request. This database is very useful!