allenai / satlas

Apache License 2.0

latest geospatial data product for renewables doesn't contain any solar farms #29

Closed rbavery closed 7 months ago

rbavery commented 7 months ago

I checked the file, and its category property contains only nan and wind_turbine. I'd expect it to include every solar farm detected to date and reconfirmed by the model, according to this description from the data readme:

"For example, 2023-01.geojson contains wind turbines and solar farms that we believe are present as of January 2023. Replace YYYY-MM with latest to get the latest data."

https://github.com/allenai/satlas/blob/main/GeospatialDataProducts.md
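
The naming scheme quoted from the readme can be sketched as a small URL helper. This is only an illustration: renewable_url is a hypothetical name, the base path is copied from the URL used in this thread, and it is an assumption that the dated files sit next to latest.geojson under that same path.

```python
from datetime import date

# Base path copied from the URL used in this thread. Treating dated files as
# siblings of latest.geojson is an assumption based on the readme's wording.
BASE_URL = "https://pub-956f3eb0f5974f37b9228e0a62f449bf.r2.dev/outputs/renewable"


def renewable_url(month=None):
    """Build the GeoJSON URL for a month, or for 'latest' when month is None.

    `month` may be a datetime.date or a 'YYYY-MM' string (hypothetical helper).
    """
    if month is None:
        return f"{BASE_URL}/latest.geojson"
    if isinstance(month, date):
        # Only the year and month matter; the day is ignored.
        month = month.strftime("%Y-%m")
    return f"{BASE_URL}/{month}.geojson"


print(renewable_url())
print(renewable_url("2023-01"))
```

Passing a date such as date(2023, 1, 15) yields the same URL as the string "2023-01".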

import requests
import geopandas as gpd

solar_farm_detections_url = 'https://pub-956f3eb0f5974f37b9228e0a62f449bf.r2.dev/outputs/renewable/latest.geojson'
response = requests.get(solar_farm_detections_url)
response.raise_for_status()  # stop here if the download failed
geojson_data = response.json()
gdf = gpd.GeoDataFrame.from_features(geojson_data['features'])
gdf['category'].unique()
# Output: array(['wind_turbine', nan], dtype=object)

favyen2 commented 7 months ago

It looks like the category property was missing from the GeoJSON for solar farms, thus leading to the nan that you saw.

The files have been fixed now (at least in the latest files including renewable/latest.geojson).
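
One way to confirm a fix like this without geopandas is to count the category values directly in the raw GeoJSON, where a feature with no category property shows up under None rather than nan. The sketch below is only an illustration: category_counts is a hypothetical helper, and the sample features are made up to stand in for the downloaded file.

```python
from collections import Counter


def category_counts(geojson_data):
    """Count features per 'category' value; features lacking one count under None."""
    counts = Counter()
    for feature in geojson_data.get("features", []):
        properties = feature.get("properties") or {}
        counts[properties.get("category")] += 1
    return counts


# Made-up sample standing in for the downloaded file (not real detections).
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"category": "wind_turbine"}, "geometry": None},
        {"type": "Feature", "properties": {}, "geometry": None},
        {"type": "Feature", "properties": {"category": "solar_farm"}, "geometry": None},
    ],
}

counts = category_counts(sample)
print(counts)
```

With the real file, geojson_data from response.json() can be passed in place of sample; a nonzero count under None would mean some features still lack a category.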

rbavery commented 7 months ago

The category is now non-nan, but the score is nan:

solar_farms['score'].unique()
array([nan])
# Full script that produces the output above
import json
from pathlib import Path

import geopandas as gpd
import requests

solar_farm_detections_url = 'https://pub-956f3eb0f5974f37b9228e0a62f449bf.r2.dev/outputs/renewable/latest.geojson'
local_filename = 'renewable.geojson'

file_path = Path(local_filename)
if file_path.exists():
    # Reuse the copy cached by an earlier run
    with file_path.open('r') as file:
        geojson_data = json.load(file)
else:
    # Download once and cache the file locally
    response = requests.get(solar_farm_detections_url)
    if response.status_code == 200:
        geojson_data = response.json()
        with file_path.open('w') as file:
            json.dump(geojson_data, file)
    else:
        raise RuntimeError(f"Failed to download the file: Status code {response.status_code}")

gdf = gpd.GeoDataFrame.from_features(geojson_data['features'])

# Keep only the solar farm detections and inspect their scores
solar_farms = gdf[gdf['category'] == "solar_farm"]
solar_farms['score'].unique()

favyen2 commented 7 months ago

Score is currently only available for marine infrastructure.
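
To see which categories carry a score at all, the features can be tallied per category. The sketch below is only an illustration: score_coverage is a hypothetical helper, the sample values are made up, and the label offshore_platform is an invented stand-in, since the marine infrastructure category names do not appear in this thread.

```python
import math
from collections import defaultdict


def score_coverage(features):
    """Return {category: (features_with_score, total_features)}."""
    coverage = defaultdict(lambda: [0, 0])
    for feature in features:
        properties = feature.get("properties") or {}
        score = properties.get("score")
        # Treat both a missing property and a float nan as "no score".
        has_score = score is not None and not (
            isinstance(score, float) and math.isnan(score)
        )
        pair = coverage[properties.get("category")]
        pair[0] += int(has_score)
        pair[1] += 1
    return {category: tuple(pair) for category, pair in coverage.items()}


# Made-up sample; 'offshore_platform' is a hypothetical marine category label.
sample_features = [
    {"properties": {"category": "solar_farm"}},
    {"properties": {"category": "solar_farm", "score": float("nan")}},
    {"properties": {"category": "offshore_platform", "score": 0.9}},
]

coverage = score_coverage(sample_features)
print(coverage)
```

A category whose first number is zero has no usable scores, which matches what the solar farm rows show above.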

rbavery commented 7 months ago

Got it. Why is that, just out of curiosity? Are the scores not informative for solar farms because the results are less robust than for marine infrastructure?

favyen2 commented 7 months ago

The scores for marine infrastructure were only added recently. The property is not documented yet at https://github.com/allenai/satlas/blob/main/GeospatialDataProducts.md. We do plan to add scores to the other outputs soon.