Earth-Information-System / fireatlas

3 stars 1 forks source link

Largefire projection shifted in V3 outputs #61

Closed zebbecker closed 2 weeks ago

zebbecker commented 3 weeks ago

@ranchodeluxe @eorland

When loading fire perimeters generated with V3, it looks like they are offset from V2 perimeters. The V3 Largefire perimeters are the ones that are wrong, between the two- they can be seen going over water in places where the V2 perimeters correctly follow the shoreline.

This only occurs with Largefire and CombinedLargefire fgb files generated with V3. allfires and allpixels data are projected correctly.

May be related to Issue 32

Example w/ Creek fire:

import folium
import geopandas as gpd 

v2 = gpd.read_file("~/shared-buckets/gsfc_landslides/FEDSoutput-s3-conus/WesternUS_REDO/2020/Largefire/F12649_20201106PM/perimeter.fgb")
v2 = v2.sort_values(by='t', ascending=False)
v2.t = v2.t.astype(str)

v3 = gpd.read_file("~/shared-buckets/zbecker/FEDSoutput-v3/Creek/2020/Largefire/1/perimeter.fgb")
v3 = v3.sort_values(by='t', ascending=False) 
v3.t = v3.t.astype(str)

m = folium.Map()
v2.explore(m=m, style_kwds={'fillOpacity': 0, 'color': 'black'})
v3.explore(m=m, style_kwds={'fillOpacity': 0, 'color': 'blue'})
m
Screenshot 2024-06-24 at 10 50 54 AM

V3 Largefire perimeters shifted over water

Screenshot 2024-06-24 at 10 53 47 AM
eorland commented 3 weeks ago

Hmm, so @ychenzgithub and I had a similar issue a while back and it was related to a faulty projected coordinate system, which we then fixed. I specifically remember vetting the fix and being satisfied with the resolution, which led to the data being recreated, hence the "REDO" in the path: "~/shared-buckets/gsfc_landslides/FEDSoutput-s3-conus/WesternUS_REDO/2020/Largefire/....

There is always the chance I missed something, but this is worth checking out again, especially because I'm nearly done with a draft of a paper that uses the data. One of the best ways I verified this was to overlay input VIIRS data with the Largefire pixel data. The positions of the pixels from both sources should be nearly identical. The "REDO" data passed this inspection the first time. Just to be thorough, it's worth doing again. Either way, it seems worth overlaying both versions with these input VIIRS data to see what's up...

@zebbecker I'm going to whip up a notebook for this real quick and I'll invite you to add in the v3 data once I have the v2 data displayed.

eorland commented 3 weeks ago

Sorry, I reread the above comments again:

This only occurs with Largefire and CombinedLargefire fgb files generated with V3. allfires and allpixels data are projected correctly.

Seems like the pixel data are OK then? If it's related to Issue 32, I'm wondering if the false easting/northing are being incorrectly inferred when interpreting a crs.

eorland commented 3 weeks ago

@zebbecker so check out the example at this path: shared-buckets/gsfc_landslides/Verify_VIIRS_FEDS_Agreement.ipynb

I overlaid the VIIRS pixels (red, top image) with the FEDS pixel data (black, bottom image):

snpp_pixels feds_pixels

Aside from a few extra pixels in the FEDS data (I didn't load in all the VIIRS data), there appears to be a close overlap because all the red VIIRS pixels are sitting right on top of the FEDS data (shown in the top image). This leads me to think that it's definitely a v3 issue, as I think you've already been able to confirm yourself.

zebbecker commented 3 weeks ago

Thanks for the VIIRS example @eorland! I used your notebook to compare VIIRS pixels against pixels from both allpixels and Largefire/nfplist.fgb files generated by v3, and neither has this projection shift issue. This is interesting because the nfplist.fgb outputs are generated with postprocess.save_large_fires_nplist, while the rest of the Largefire layers are generated with postprocess.save_large_fires_layers.

eorland commented 3 weeks ago

Yep, I just took a look at those functions. It seems that postprocess.save_large_fires_layers is calling save_fire_layers, which then sets a pre-existing geometry and writes it out as a .fgb. This is slightly different from save_large_fires_nplist which calls save_fire_nplist, defines a new geometry via points_from_xy and writes out the .fgb. All this seems super logical to me... I'm wondering if somehow the explicit crs info is either lost or misinterpreted when the geometries are being set in save_fire_layers. I'm guessing a possible solution could be to add crs info to line 283? This is basically verbatim what @ranchodeluxe was suggesting in Issue 32.

zebbecker commented 3 weeks ago

When I load the v3 outputs from fgb, they do have a CRS, but it is different than the CRS associated with v3 allfires. I think this is causing the projection issue, but am still working on tracking down where this difference is introduced.

Example:

Screenshot 2024-06-25 at 10 14 36 AM
eorland commented 3 weeks ago

OK yep, this is pretty similar to what @ychenzgithub and I were working on. In short, the CRS we used was somehow both deprecated and being misinterpreted as the correct one, so the crs info looked identical, but it actually had a different northing/easting, which is what it looks like is going on here. Are you able to pull up where the CRS info in v3 is being set? I'd be curious to see what it is saying.

zebbecker commented 3 weeks ago

@eorland here you go! It is a bit ugly, but I went and pulled out the CRS for a largefire perimeter immediately after it was set in postprocess.py. That is the first JSON object below. The second is the CRS for allfires, pulled out at the same point. They look to be the same at this point, and the built in equals methods agrees.

The only thing that happens after this point is the data gdf being saved to fgb format. I'm worried something might be going on with the encoding/decoding of the CRS during the file write or read.

Example:

Setting CRS for perimeter: 

Set data.crs as:
{
    "$schema": "https://proj.org/schemas/v0.6/projjson.schema.json",
    "type": "ProjectedCRS",
    "name": "NAD27 / US National Atlas Equal Area",
    "base_crs": {
        "name": "NAD27",
        "datum": {
            "type": "GeodeticReferenceFrame",
            "name": "North American Datum 1927",
            "ellipsoid": {
                "name": "Clarke 1866",
                "semi_major_axis": 6378206.4,
                "semi_minor_axis": 6356583.8
            }
        },
        "coordinate_system": {
            "subtype": "ellipsoidal",
            "axis": [
                {
                    "name": "Geodetic latitude",
                    "abbreviation": "Lat",
                    "direction": "north",
                    "unit": "degree"
                },
                {
                    "name": "Geodetic longitude",
                    "abbreviation": "Lon",
                    "direction": "east",
                    "unit": "degree"
                }
            ]
        },
        "id": {
            "authority": "EPSG",
            "code": 4267
        }
    },
    "conversion": {
        "name": "US National Atlas Equal Area",
        "method": {
            "name": "Lambert Azimuthal Equal Area (Spherical)",
            "id": {
                "authority": "EPSG",
                "code": 1027
            }
        },
        "parameters": [
            {
                "name": "Latitude of natural origin",
                "value": 45,
                "unit": "degree",
                "id": {
                    "authority": "EPSG",
                    "code": 8801
                }
            },
            {
                "name": "Longitude of natural origin",
                "value": -100,
                "unit": "degree",
                "id": {
                    "authority": "EPSG",
                    "code": 8802
                }
            },
            {
                "name": "False easting",
                "value": 0,
                "unit": "metre",
                "id": {
                    "authority": "EPSG",
                    "code": 8806
                }
            },
            {
                "name": "False northing",
                "value": 0,
                "unit": "metre",
                "id": {
                    "authority": "EPSG",
                    "code": 8807
                }
            }
        ]
    },
    "coordinate_system": {
        "subtype": "Cartesian",
        "axis": [
            {
                "name": "Easting",
                "abbreviation": "X",
                "direction": "east",
                "unit": "metre"
            },
            {
                "name": "Northing",
                "abbreviation": "Y",
                "direction": "north",
                "unit": "metre"
            }
        ]
    },
    "scope": "Statistical analysis.",
    "area": "United States (USA) - onshore and offshore.",
    "bbox": {
        "south_latitude": 15.56,
        "west_longitude": 167.65,
        "north_latitude": 74.71,
        "east_longitude": -65.69
    },
    "id": {
        "authority": "EPSG",
        "code": 9311
    }
}

Compare to allfires.crs:
{
    "$schema": "https://proj.org/schemas/v0.6/projjson.schema.json",
    "type": "ProjectedCRS",
    "name": "NAD27 / US National Atlas Equal Area",
    "base_crs": {
        "name": "NAD27",
        "datum": {
            "type": "GeodeticReferenceFrame",
            "name": "North American Datum 1927",
            "ellipsoid": {
                "name": "Clarke 1866",
                "semi_major_axis": 6378206.4,
                "semi_minor_axis": 6356583.8
            }
        },
        "coordinate_system": {
            "subtype": "ellipsoidal",
            "axis": [
                {
                    "name": "Geodetic latitude",
                    "abbreviation": "Lat",
                    "direction": "north",
                    "unit": "degree"
                },
                {
                    "name": "Geodetic longitude",
                    "abbreviation": "Lon",
                    "direction": "east",
                    "unit": "degree"
                }
            ]
        },
        "id": {
            "authority": "EPSG",
            "code": 4267
        }
    },
    "conversion": {
        "name": "US National Atlas Equal Area",
        "method": {
            "name": "Lambert Azimuthal Equal Area (Spherical)",
            "id": {
                "authority": "EPSG",
                "code": 1027
            }
        },
        "parameters": [
            {
                "name": "Latitude of natural origin",
                "value": 45,
                "unit": "degree",
                "id": {
                    "authority": "EPSG",
                    "code": 8801
                }
            },
            {
                "name": "Longitude of natural origin",
                "value": -100,
                "unit": "degree",
                "id": {
                    "authority": "EPSG",
                    "code": 8802
                }
            },
            {
                "name": "False easting",
                "value": 0,
                "unit": "metre",
                "id": {
                    "authority": "EPSG",
                    "code": 8806
                }
            },
            {
                "name": "False northing",
                "value": 0,
                "unit": "metre",
                "id": {
                    "authority": "EPSG",
                    "code": 8807
                }
            }
        ]
    },
    "coordinate_system": {
        "subtype": "Cartesian",
        "axis": [
            {
                "name": "Easting",
                "abbreviation": "X",
                "direction": "east",
                "unit": "metre"
            },
            {
                "name": "Northing",
                "abbreviation": "Y",
                "direction": "north",
                "unit": "metre"
            }
        ]
    },
    "scope": "Statistical analysis.",
    "area": "United States (USA) - onshore and offshore.",
    "bbox": {
        "south_latitude": 15.56,
        "west_longitude": 167.65,
        "north_latitude": 74.71,
        "east_longitude": -65.69
    },
    "id": {
        "authority": "EPSG",
        "code": 9311
    }
}

Pyproj.crs.CRS tests: Is exact same = True, equals = True
zebbecker commented 2 weeks ago

The issue may actually be related to how we read the fgb files in with geopandas. [Edit to clarify: how we read the fgb files when writing new code to do things with FEDS output.]

For context, geopandas uses either Fiona or Pyogrio as its read/write engine. Until geopandas v1.0.0, Fiona was used by default any time we call GeoDataFrame.to_file() or geopandas.read_file(). As of v1.0.0, geopandas now defaults to a different read/write engine, Pyogrio. In earlier versions, you can specify Pyogrio by passing engine='pyogrio' as an argument to either of those methods. Going forward, versions >= 1.0 will default to using Pyogrio without needing to specify.

Fiona does not read the CRS in correctly from our fgb files; Pyogrio does. I tested using geopandas v0.14.2 and v1.0.0, and it does not matter which geopandas version we use or which engine we specify for writing the fgbs in v3. It just matters which engine is used to load the files. As long as we read the fgb using Pyogrio, the resulting GeoDataFrame will have the correct CRS, and therefore will align with v2 perimeters.

So, going forward, anyone using geopandas >= 1.0.0 will not encounter this projection issue. Anyone still using an earlier version can read in fgb files like so: gpd.read_file(path, engine='pyogrio') as a workaround.

Example below:

Screenshot 2024-06-25 at 2 42 57 PM
ranchodeluxe commented 2 weeks ago

The issue may actually be related to how we read the fgb files in with geopandas.

Amazing work @zebbecker ✨ Feel free to put in a PR off primarykeyv2 and upgrade the correct spots in the code. If you want to pair on that let me know and I can set something up quick for tomorrow

zebbecker commented 2 weeks ago

@ranchodeluxe a quick chat tomorrow would be great! Because the issue occurs when reading our fgb outputs, I don't think we can address it in our codebase directly. As far as I can tell, the files are being written correctly. The workaround above is something that users would need in any code they write that tries to load our products, until they upgrade to geopandas>=1.0.0. So, would like your insight on how to move forward, since its definitely an issue that will pop up for people but might not be something we can fix with a PR in this project.

ranchodeluxe commented 2 weeks ago

@ranchodeluxe a quick chat tomorrow would be great!

What I was after is that maybe we should upgrade to benefit all places we read from (our notebooks mostly). I see we don't use a pinned version of geopandas so that means we'll be receiving the benefits of 1.0.0.

Though however now I'm wondering if we're gonna see bugs b/c of any breaking changes 🤔 Maybe we should read about the "Backward Incompatible Changes" https://github.com/geopandas/geopandas/releases

ranchodeluxe commented 2 weeks ago

@ranchodeluxe a quick chat tomorrow would be great!

Though however now I'm wondering if we're gonna see bugs b/c of any breaking changes 🤔 Maybe we should read about the "Backward Incompatible Changes" https://github.com/geopandas/geopandas/releases

@jsignell and I chatted. Safe thing to do here is pin geopandas<1.0.0 and manually remember to set the engine on read in notebooks. We can upgrade after we get better test coverage in. That sound good @zebbecker

zebbecker commented 2 weeks ago

Going forward, recent outputs generated with either v2 or v3 can be read correctly using the Pyogrio engine, either by default when using geopandas>=1.0.0, or by specifying gpd.read_file(filename, engine='pyogrio') with geopandas<1.0.0.

However, I have encountered at least one historical v2 run where the fgb outputs are projected incorrectly when doing this. The issue was resolved by using Fiona to read in the files. I suspect that this has something to do with the respective versions of Fiona used to generate the original files vs read them in now, but have no way of knowing which version was used in the older run.

Essentially, this shouldn't be a problem going forward, but anyone who runs into trouble reading older fgbs with geopandas should try using both Pyogrio and Fiona as a troubleshooting step. Even if the CRS is read incorrectly (leading to the projection being shifted) no warnings are generated, so you will need to map your results and check manually which is correct.