Earth-Information-System / fireatlas


V3: FlatGeobufs have empty CRS(s) for some reason #32

Closed ranchodeluxe closed 2 weeks ago

ranchodeluxe commented 1 month ago

Problem

Some of the snapshot FlatGeobuf outputs have an empty CRS, and I can't trace the cause back to the AllFires GDF. This only affects the .fgb ingest workflow, so I switched back to the .gpkg workflow, which doesn't have the same problem.

ranchodeluxe commented 1 month ago

Will just have to explicitly set the CRS, shouldn't be hard

jsignell commented 1 month ago

Could it be that we are just not setting the CRS on the geodataframe before writing the flatgeobuf?

zebbecker commented 3 weeks ago

I looked into which outputs this affects. When I load the different outputs that are saved as fgbs into a GeoDataFrame and look at the .crs property, here is what I am getting now:

Largefire outputs:

CombinedLargefire outputs:

Snapshot outputs:

Here is what I think might be going on:

When we create these layers here (for Largefires) and here (for Snapshots), they are implicitly created as plain pandas DataFrames without a CRS associated.

A few lines later, they are transformed into GeoDataFrames, again implicitly: data = data.set_geometry("geometry").

set_geometry tries to set the CRS for the new GeoDataFrame based on the CRS of the geometry column. However, only the perimeter columns have a CRS directly associated with them in the allfires gdf we copy the selected column from, because they are the active geometry in that gdf. So, for all the other columns, there is no CRS to be copied over. This is why we end up with an empty CRS in the fgb files for those layers.
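One way to check this is to list the CRS attached to each geometry column separately. The snippet below is only a sketch on a toy frame: the column names `perimeter` and `hull`, the helper `crs_by_geometry_column`, and EPSG:4326 are all made up for illustration and are not taken from the fireatlas code.

```python
import geopandas as gpd
from geopandas.array import GeometryDtype
from shapely.geometry import Point

# Toy stand-in for the allfires gdf (hypothetical columns): "perimeter" is the
# active geometry and carries a CRS, "hull" is a second geometry column that
# was created without one.
allfires = gpd.GeoDataFrame(
    {
        "fireID": [1, 2],
        "perimeter": gpd.GeoSeries([Point(0, 0), Point(1, 1)], crs="EPSG:4326"),
        "hull": gpd.GeoSeries([Point(0, 0), Point(1, 1)]),
    },
    geometry="perimeter",
)


def crs_by_geometry_column(gdf):
    """Map each geometry-typed column name to its CRS (None if unset)."""
    return {
        col: gdf[col].crs
        for col in gdf.columns
        if isinstance(gdf[col].dtype, GeometryDtype)
    }


report = crs_by_geometry_column(allfires)
for col, crs in report.items():
    print(col, "->", crs)
```

In this toy frame only the active geometry column reports a CRS. Running the same helper on the real allfires gdf would show whether the non-perimeter columns are really the ones missing it.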

Like @ranchodeluxe said, this is an easy fix. For example, we could just do this: data = data.set_geometry("geometry", crs=settings.EPSG_CODE).
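A minimal sketch of the difference, assuming a frame whose geometry column has no CRS. `EPSG_CODE = 4326` is a placeholder for `settings.EPSG_CODE` (whose real value is not shown in this thread), and the toy data is invented.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point

EPSG_CODE = 4326  # placeholder; stands in for settings.EPSG_CODE

# Plain pandas frame holding shapely geometries, with no CRS metadata.
data = pd.DataFrame(
    {"fireID": [1, 2], "geometry": [Point(0, 0), Point(1, 1)]}
)

# Wrapped in a GeoDataFrame here so that set_geometry is available.
# Without the crs argument, there is no CRS to pick up from the column.
no_crs = gpd.GeoDataFrame(data.copy()).set_geometry("geometry")

# Passing crs explicitly assigns it to the geometry column.
with_crs = gpd.GeoDataFrame(data.copy()).set_geometry("geometry", crs=EPSG_CODE)

print(no_crs.crs)
print(with_crs.crs)
```

Note that `crs=` here only labels the coordinates with a CRS; it does not reproject them, so the value has to match the CRS the geometries are actually in.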

However, while that fixes the empty CRS issue, it also results in all outputs having the behavior I am investigating in Issue 61. So, keeping this open for now.