Earth-Information-System / fireatlas


Discrepancies In Largefire Perimeter Generation #59

Open mccabete opened 1 month ago

mccabete commented 1 month ago

I was demoing the V3 code and comparing it to perimeters from V2, as well as perimeters from V2 that I had post-hoc corrected for merging issues. I noticed some differences, specifically around little fires at the western edge of the fire. Some of the small fire starts were missing from V3; some of them are shown in the screenshots below.

Version 2 (with merge issue): [two screenshots]

Version 3 (generated from the notebooks): [two screenshots]

@jsignell and @ychenzgithub -- any thoughts on where this bug could be coming from?

zebbecker commented 1 month ago

Quick note: the screenshots from version 3 are colored by mergeid, not fireID.

ranchodeluxe commented 1 month ago

Some of the small fire starts were missing from V3

Can someone unpack this a little more for me please about what you're seeing between v2 and v3? Are we expecting those perimeter polygons to be merged? Or are we expecting that v2 has more perimeter polygons and we're not seeing them?

mccabete commented 1 month ago

We do expect them to be merged. In V2, they were merged in. In V3 they are not. Julia pointed out that we were plotting from the allfire object for the V3 stuff in the post above. Below is a screenshot from the largefire file where the little fires are missing.

[screenshot of the largefire output, with the little fires missing]

mccabete commented 1 month ago

This suggests that V3 is making different decisions about how stuff gets merged. They might be better decisions :shrug: but I want to know where they are coming from.

jsignell commented 1 month ago

There are two places that merging happens:

1) While running Fire_Forward, any fires that intersect each other get assigned matching mergeids, and the larger fire gets its hull updated to include the smaller fire. This happens in Fire_merge_rtree.
2) In the postprocessing steps, fires that have the same mergeid are combined with each other. The bulk of that is in postprocessing.merge_rows.
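For anyone orienting themselves, here is a rough sketch of those two steps (not the actual fireatlas code; the GeoDataFrame layout and the column names "fireID" and "mergeid" are assumptions for illustration, and the spatial-index step assumes Shapely 2.x):

import geopandas as gpd
from shapely.strtree import STRtree

def assign_mergeids(fires: gpd.GeoDataFrame) -> gpd.GeoDataFrame:
    # Step 1 (illustrative): give intersecting fire hulls a shared mergeid via a spatial index.
    fires = fires.copy()
    fires["mergeid"] = fires["fireID"]
    tree = STRtree(list(fires.geometry))
    col = fires.columns.get_loc("mergeid")
    for i, geom in enumerate(fires.geometry):
        for j in tree.query(geom, predicate="intersects"):
            if int(j) == i:
                continue
            # Propagate the smaller id to both fires, mimicking "merge into the existing fire".
            shared = min(fires.iloc[i, col], fires.iloc[int(j), col])
            fires.iloc[i, col] = shared
            fires.iloc[int(j), col] = shared
    return fires

def combine_by_mergeid(fires: gpd.GeoDataFrame) -> gpd.GeoDataFrame:
    # Step 2 (illustrative): dissolve() unions the geometries of all rows sharing a mergeid.
    return fires.dissolve(by="mergeid", as_index=False)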

jsignell commented 1 month ago

So since you are seeing different mergeids, the difference is coming from within 1.

mccabete commented 1 month ago

So since you are seeing different mergeids, the difference is coming from within 1.

I almost understand what this means, but still don't quite. Are you saying that the issue is how ID 1 got assigned?

jsignell commented 1 month ago

Sorry, that was in reference to my prior comment: the "1" meant the first place merging happens, not a fire ID. The issue you are seeing is coming from within Fire_merge_rtree.

jsignell commented 1 month ago

Is this the fire you are looking at:

tst = [2023, 6, 1, 'AM']
ted = [2023, 8, 12, 'PM']
bbox = [-76.510259, 52.624922, -73.801554, 53.743852]

mccabete commented 1 month ago

Not exactly that, but approximately. It's based on the bounding box of the original V2 fire.

mccabete commented 1 month ago

tst = [2023, 6, 1, 'AM']
ted = [2023, 8, 13, 'AM']
region = ('Fire_8495_Quebec_v2_test', [-76.510259, 52.624922, -73.801554, 53.743852])

zebbecker commented 1 month ago

The v3 data those screenshots are from is available in my shared bucket if you want to take a peek without waiting to re-generate them, as that did take an hour or so: shared-buckets/zbecker/Fire_8495_Quebec. It was generated following the steps in notebooks 01 to 03.

ranchodeluxe commented 1 month ago

If possible, I think we should get @zebbecker, @jsignell, and myself (@mccabete if she isn't too busy) together for an hour-long pairing session today or tomorrow. Then we can come up with a way to debug and answer these questions for future us. Thoughts? I think that would be fun.

zebbecker commented 2 weeks ago

Update: the v3 perimeters we were comparing against were generated with two incorrect settings flags: CONT_OPT and FTYPE_OPT were set to "CA", but they need to be "global" for the boreal region. PR 65 fixes a typo so this works as intended.

With these two flags set correctly (e.g. the same way they were set in the v2 run we are comparing against), v3 merges the same way as v2.
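For future readers, the fix amounts to something like this (a hedged snippet: the flag names and values come from this thread, but the module where they actually live is not shown here):

# illustrative only -- not the actual settings module layout
CONT_OPT = "global"   # was "CA" in the mismatched v3 run; must be "global" for the boreal region
FTYPE_OPT = "global"  # likewise "global", matching the v2 run being compared against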

jsignell commented 2 weeks ago

We can leave this open with the intention that the notebooks should be augmented to include more information about settings and how jobs are configured.

zebbecker commented 1 week ago

With the flag issue resolved, v2 and v3 perimeters look very, very close. However, there is still one small discrepancy that I have not been able to track down a source for yet. In this screenshot, you can see that there are a few places where the blue v3 perimeters peek out from below the yellow v2 perimeters. Look near the center of the image.

[Screenshot 2024-07-08: v2 (yellow) and v3 (blue) perimeters overlaid]

@mccabete thought that this might be caused by differences in the underlying data or by sensitivity to the bounding box, so I tried to control for as much as possible, running v2 and v3 with the same flags, parameters, and datasets. (To run on the same data, I processed the tmp files that v2 saves during its preprocessing steps into the input format that v3 expects, then ran v3 on those rather than on the usual input locations.) This did not change anything about how the outputs compare, so either I missed something to control for (possible!) or it's not this.

I'm worried that it might have something to do with shapely versioning, as the v2 perimeters were generated in an environment created from the conus-dps branch env.yml file, while the v3 perimeters were generated using the environment defined in the primarykeyv2 branch. Respective Shapely versions were 1.8.4 and 2.0.1 for v2 and v3, so we did cross a major release. I am investigating this possibility further.

jsignell commented 1 week ago

Other things that could result in mild differences:

1) Which pixels end up in which clusters originally is determined by the sklearn BallTree algorithm, so the scikit-learn version could be a factor.
2) The shape of the hulls with more than 3 pixels depends on settings.valpha, so you would want to ensure that it is held constant.
3) The hulls are calculated using either scipy.spatial.ConvexHull or scipy.spatial.Delaunay depending on how many pixels they include, so the scipy version could be a factor.
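To make points 2 and 3 concrete, here is a rough sketch of that kind of hull logic (not the fireatlas implementation: the 3-pixel cutoff is illustrative, alpha stands in for settings.valpha, and the small-cluster branch uses Shapely's convex_hull for brevity rather than scipy.spatial.ConvexHull):

import numpy as np
from scipy.spatial import Delaunay
from shapely.geometry import MultiPoint, Polygon
from shapely.ops import unary_union

def cluster_hull(points: np.ndarray, alpha: float):
    # Small clusters: fall back to a convex hull.
    if len(points) <= 3:
        return MultiPoint([tuple(p) for p in points]).convex_hull
    # Larger clusters: alpha-shape-style hull from a Delaunay triangulation,
    # keeping only triangles whose circumradius is below 1/alpha.
    tri = Delaunay(points)
    kept = []
    for simplex in tri.simplices:
        pts = points[simplex]
        a = np.linalg.norm(pts[0] - pts[1])
        b = np.linalg.norm(pts[1] - pts[2])
        c = np.linalg.norm(pts[2] - pts[0])
        s = (a + b + c) / 2.0
        area = max(np.sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0)), 1e-12)
        if (a * b * c) / (4.0 * area) < 1.0 / alpha:
            kept.append(Polygon(pts))
    return unary_union(kept)

Small numerical differences at any of these steps can nudge a hull enough to produce slivers like the ones in the screenshot above.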

I guess my point is that it is unlikely you would ever get exactly the same results, given the differences in library versions and potentially in the underlying data. The effort might be better spent deciding on an allowable difference in overlap and writing tests to ensure that the overlap difference does not exceed a certain threshold. If there is an external source of fire perimeters, that might be an even better way to test: instead of testing v2 vs v3, we could test v2 against some canonical perimeters and v3 against some canonical perimeters.
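A sketch of the kind of overlap test being suggested (file names and the 2% tolerance are placeholders, not project conventions):

import geopandas as gpd
from shapely.ops import unary_union

def overlap_fraction(a, b):
    # Intersection-over-union of two (multi)polygons.
    union_area = a.union(b).area
    return a.intersection(b).area / union_area if union_area else 0.0

def test_v3_against_reference():
    # Compare a v3 output against a reference layer (v2 or canonical perimeters).
    v3 = unary_union(list(gpd.read_file("v3_largefire_perimeters.gpkg").geometry))
    ref = unary_union(list(gpd.read_file("reference_perimeters.gpkg").geometry))
    assert overlap_fraction(v3, ref) >= 0.98, "perimeters diverge by more than the allowed 2%"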

zebbecker commented 1 week ago

Thank you @jsignell!

As far as I can tell, settings.valpha is consistent across all runs.

Running v3 with the scipy and scikit-learn versions used in the v2 runs still produces the same discrepancy in the outputs, so those don't appear to be the cause.

That leaves the Shapely version. I'm having a hard time confirming whether this is the issue, because Shapely made a few breaking changes going from 1.x to 2.x. Our v2 relies on 1.x behavior, while v3 relies on 2.x behavior, both in seemingly nontrivial ways. The upshot is that I can't just run v3 with an older Shapely to test this hypothesis, nor run v2 with a newer Shapely, without significant refactoring.

(The specific issue I'm running into is that multipart geometries are no longer sequences.)
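For anyone who hits this later, the breaking change in question looks like this (a minimal illustration, not project code):

from shapely.geometry import MultiPolygon, Polygon

mp = MultiPolygon([Polygon([(0, 0), (1, 0), (1, 1)]),
                   Polygon([(2, 2), (3, 2), (3, 3)])])

# Shapely 1.x (the v2 environment): a multipart geometry behaves like a sequence.
#   first = mp[0]
#   parts = list(mp)

# Shapely 2.x (the v3 environment): indexing or iterating the geometry itself
# raises a TypeError; parts must be accessed through the .geoms property.
first = mp.geoms[0]
parts = list(mp.geoms)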