WISE-Developers / Project_issues

This handles incoming tickets like bugs and feature requests
GNU Affero General Public License v3.0

WISE 6 vs 7 Rounding Validation of WISE #126

Open spydmobile opened 1 year ago

spydmobile commented 1 year ago

While version 6 inherited formal validation from Prometheus, version 7 did not, so this ticket will outline how we will achieve formal validation of WISE in the most expedient and efficient manner. The current thinking is that, since the real changes in version 7 are to the mathematical rounding used, a statistical comparison of Firecast output polygons from the 2022 AB/NT/YK Firecast Project (version 6 vs version 7) will be enough for a formal validation. E.g. if there is no statistical difference in outputs, we are validated. This can easily be done via an automation script leveraging the Firecast API to retrieve the polygons. JS or Python would be quick and easy for @spydmobile or @BadgerOnABike to write respectively, but this issue is also here to capture other ideas.
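As a rough sketch of what such a comparison could look like (the burned-area values and the choice of a paired test here are illustrative assumptions, not the agreed method):

from scipy import stats

# Hypothetical matched burned areas (ha) for the same fires, pulled from the
# Firecast API for each version; the values here are placeholders.
v6_areas = [1523.4, 88.1, 402.7, 976.2]
v7_areas = [1519.9, 88.3, 401.8, 977.0]

# A paired test on matched fires; a non-significant result would support the
# "no statistical difference" criterion described above.
t_stat, p_value = stats.ttest_rel(v6_areas, v7_areas)
print(f"paired t = {t_stat:.3f}, p = {p_value:.3f}")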

spydmobile commented 1 year ago

@RobBryce after our discussion last week, do you have what you need to move forward on planning this?

spydmobile commented 1 year ago

@RobBryce has what he needs. @nealmcloughlin and @BadgerOnABike have indicated they are not worried about testing the auxiliary grids so rigorously. Formal validation should be completed for 1.0.

spydmobile commented 1 year ago

There is an open ticket which should be a milestone: #25

spydmobile commented 1 year ago

@RobBryce I don't see this on your radar, but you and I tagged it for 1.0.

spydmobile commented 1 year ago

@RobBryce - the nature of this task has changed. @BadgerOnABike will be guiding you through the next steps.

BadgerOnABike commented 1 year ago

I'll outline how we intend to do our test and the data requirements therein. The intent is that I will carry out the assessment, but I will require data from the previous year's operations.

This leads to my first question for you: is it possible for me to get access to the database you're storing these fires in, so that I can collect the samples I'll need for this test?

Testing Method

Perform the perimeter comparison between V6 and V7 based on bias, hit, miss and critical success index, the same metrics I used in my thesis work.
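For reference, a minimal sketch of how those scores are commonly computed from rasterized burned/unburned grids (the function and variable names are placeholders, not the thesis implementation):

import numpy as np

def perimeter_scores(v6_burned, v7_burned):
    """Score two boolean burned/unburned rasters on a common grid.

    V6 is treated as the reference; hits, misses and false alarms follow the
    usual contingency-table definitions.
    """
    hits = np.sum(v6_burned & v7_burned)           # burned in both versions
    misses = np.sum(v6_burned & ~v7_burned)        # burned in V6 only
    false_alarms = np.sum(~v6_burned & v7_burned)  # burned in V7 only

    bias = (hits + false_alarms) / (hits + misses)   # frequency bias
    csi = hits / (hits + misses + false_alarms)      # critical success index
    return {"hits": hits, "misses": misses, "false_alarms": false_alarms,
            "bias": bias, "csi": csi}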

Testing data requirements

90 total samples from each of AB, NWT and BC, with 30 at starting FWI < 19, 30 at FWI 19-25, and 30 at FWI > 25, for each of V6 and V7.

This will yield 270 samples per version with good geographic and weather spread.

This will allow us to assess the differences in area burned by version and will allow us to quantify the scale of difference between these two products based on rounding.
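A sketch of how that stratified draw could be made once starting FWI values are attached to each candidate fire (the column names, placeholder data, and the exact handling of the class boundaries at 19 and 25 are assumptions):

import pandas as pd

# Hypothetical candidate list; the real table would come from the data pull below.
fires = pd.DataFrame({
    "fire": ["AB-001", "AB-002", "NWT-001", "BC-001"],
    "agency": ["AB", "AB", "NWT", "BC"],
    "start_fwi": [12.3, 21.7, 27.1, 18.9],
})

# Bin starting FWI into the three classes (<19, 19-25, >25).
fires["fwi_class"] = pd.cut(fires["start_fwi"],
                            bins=[-float("inf"), 19, 25, float("inf")],
                            right=False, labels=["<19", "19-25", ">25"])

# Draw n fires per agency per FWI class; with the real candidate list n would be 30.
sample = (fires.groupby(["agency", "fwi_class"], observed=True)
               .sample(n=1, random_state=42))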

@nealmcloughlin Have I captured our conversation?

BadgerOnABike commented 1 year ago

I'm close to a full pull of statistics from the V6 fires. This will allow me to pull the perimeters I care about.

Workplan:

BadgerOnABike commented 1 year ago

Within my functional pull I'm running into some issues:

1) Inconsistent data at the endpoint - sometimes the summary file is within the deterministic details, other times it's at the top level of the fire details under 'det_name'. This has resulted in a lot of try/except to move through all the information (see the sketch after this list for one way the two shapes could be normalised).

2) A lot of 404 and 500 returns. I'm looking through the 404s to see if that's a failure on my part, but these calls are all done in a loop, so if there is a consistency-of-address issue it's a bigger problem. The 500s are more interesting, as the body exists but there is no information.
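For what it's worth, a small sketch of how those two response shapes could be handled in one place (the helper name is mine; the key names are the ones referenced in the collection code below):

def summary_file_id(fire_details):
    """Return the summary file id from either known response layout, or None."""
    det = fire_details.get('deterministic_details')
    if det:
        # First layout: summary file nested inside the deterministic details.
        return det[0].get('summary_file')
    if fire_details.get('det_name'):
        # Second layout: summary file at the top level alongside 'det_name'.
        return fire_details.get('summary_file')
    return None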

It takes around 2 hours to cycle through the 2000 fires. Due to my try/except handling I get 130 going into my "dead_fire" bin with 404 and 500 responses, and 19 returns of fires with data. I am pulling the name, time and FWI.

1958 returns: 19 with data, 1781 with a response of 404 or 500.

@RobBryce @tredpath I checked through my dead list and tested a few of the fires individually and was not able to get any returns.

My collection code is as follows:

Edit - per usual, you post your code for people smarter than you, then find a better way as you re-read your work.

import requests
from requests.auth import HTTPBasicAuth
import json
import pandas as pd

url = "https://api1.firecast.ca/api/v1/fire/mine"

headers = {
    'user-agent': "vscode-restclient",
    'authorization': "Bearer #################################",
    'content-type': "application/json"
}

providers = {"NT": "f740e5b6-2077-4d3a-8b48-3f9fcd169f45",
             "SK": "6030be5e-1bd9-4279-aff0-b0ebb2e12ae7",
             "YT": "937d5469-733b-4e67-843c-3e57bbf5c340",
             "AB": "a4ff359e-7400-4a64-834a-3c5ab3d0f235"}

fires = []

# Collect the fire numbers returned for each provider since the start of 2022.
for k in providers:

    querystring = {"provider": providers[k], "since": "2022-01-01T12:00:00Z"}
    response = requests.get(url, headers=headers, params=querystring).json()

    for i in response['features']:
        fires.append(i["properties"]["firecast_fire_number"])

details = []
out = []
keys = ['date_time', 'hfwi', 'elapsed_time']
dead_fires = []

stats_url = "https://api1.firecast.ca/api/v1/jobs/stats/download"

for i in fires:
    querystring = {"name": i}
    url = "https://api1.firecast.ca/api/v3/fire/details"
    dat = requests.get(url=url, headers=headers, params=querystring).json()

    ## The initial data form: summary file nested in the deterministic details
    if 'deterministic_details' in dat and len(dat['deterministic_details']) > 0:

        qs2 = {"id": dat['deterministic_details'][0]['summary_file']}
        resp = requests.get(url=stats_url, headers=headers, params=qs2)

        # Catch a dead link/non-existent fire
        if resp.status_code != 200:
            dead_fires.append([i, "Response of " + str(resp.status_code)])
        else:
            # Keep the first and last stats records (start and end of the run).
            summary = resp.json()
            out.append([i,
                        [summary[0]['stats'][0].get(key) for key in keys],
                        [summary[0]['stats'][-1].get(key) for key in keys]])

    ## A second data form: summary file at the top level under 'det_name'
    if 'det_name' in dat and len(dat['det_name']) > 0:

        qs2 = {"id": dat['summary_file']}
        resp = requests.get(url=stats_url, headers=headers, params=qs2)

        # Catch a dead link/non-existent fire
        if resp.status_code != 200:
            dead_fires.append([i, "Response of " + str(resp.status_code)])
        else:
            summary = resp.json()
            out.append([i,
                        [summary[0]['stats'][0].get(key) for key in keys],
                        [summary[0]['stats'][-1].get(key) for key in keys]])

BadgerOnABike commented 1 year ago

It turns out that using the summary file isn't helpful for collecting the stats file. However, this hasn't solved everything.