Esri / raster-functions

A curated set of lightweight but powerful tools for on-the-fly image processing and raster analysis in ArcGIS.
Apache License 2.0

Issue with XGBoost loading a pretrained model in a custom raster function #84

Open sarahwegmuellerUSFS opened 1 year ago

sarahwegmuellerUSFS commented 1 year ago

Hello,

I'm looking for help troubleshooting a custom raster function. The heart of the rf is an XGBoost model, but something goes wrong when I try to load the model. I've used pickle to help narrow down the issue, and verified that the xgboost library IS loading and that the file path for the model is correct. For whatever reason, the model is not loading. Using a notebook in ArcGIS Pro, I could not replicate the issue (everything runs as it should). It seems to be an issue specifically with the raster function. I'm really at a loss as to how to fix this and any insight would be much appreciated!

Other notes: Everything up to the loading of the model is working (verified this using pickle). The rf as a whole runs, but provides no output (because the model is not loading).

The code around the issue:

    def updatePixels(self, tlc, shape, props, **pixelBlocks):
        # Preprocessing: Create mask(s)(vector and raster) as needed

        # Get bands
        image_array = np.array(pixelBlocks['NAIP_ras_pixels'], 'float32')
        red = image_array[0,:,:]
        green = image_array[1,:,:]
        blue = image_array[2,:,:]
        nir = image_array[3,:,:]

        def calc_rdvi(nir_values, red_values):
            nom = nir_values - red_values
            denom = np.sqrt(nir_values + red_values)
            return (nom/denom) 

        def calc_osavi(nir_values, red_values):
            nom = nir_values - red_values
            denom = nir_values + red_values + 0.16
            return (nom/denom)

        def calc_evi(nir_values, red_values, blue_values, G=2.5, C1=6, C2=7.5, L=1):
            nom = nir_values + red_values
            denom = nir_values + (C1 * red_values) - (C2 * (blue_values + L))
            evi = G*(nom/denom)
            evi = np.nan_to_num(evi, posinf=0, neginf=0)
            return evi

        def calc_vari(green_values, red_values, blue_values):
            return (green_values - red_values)/(green_values + red_values - blue_values) 

        def calc_msavi(nir_values, red_values):
            term1a = np.square((2*nir_values) + 1)
            term1b = 8*(nir_values-red_values)
            term1 = np.sqrt(term1a-term1b)
            term2a = (2*nir_values)+1
            term2 = term2a - term1
            return term2/2

        def calc_gli(green_values, red_values, blue_values):
            nom = (green_values - red_values) + (green_values - blue_values)
            denom = (2*green_values) + red_values + blue_values
            return (nom/denom)

        rdvi = calc_rdvi(nir, red)
        osavi = calc_osavi(nir, red)
        evi = calc_evi(nir, red, blue)
        vari = calc_vari(green, red, blue)
        msavi = calc_msavi(nir, red)
        gli = calc_gli(green, red, blue)

        array_shape = rdvi.shape

        vi_array = np.array([rdvi.flatten(), osavi.flatten(), evi.flatten(), vari.flatten(), msavi.flatten(), gli.flatten()])

        del (red, green, blue, nir, rdvi, osavi, evi, vari, msavi, gli)

        # Fix any nan or inf that result from VI calculation (problems in nodata areas)
        vi_array = np.nan_to_num(vi_array, posinf=0, neginf=0)

        # Step 2: Create the dataframe
        def generate_vi_df(two_dim_vi_stack):
            # create a DF to match XGBoost model
            df = pd.DataFrame(columns=['ID','RDVI','OSAVI',
                                      'EVI','VARI','MSAVI','GLI',
                                      ], dtype='float32')

            df['ID'] = range(0, two_dim_vi_stack.shape[1])
            df['RDVI'] = two_dim_vi_stack[0,:]
            df['OSAVI'] = two_dim_vi_stack[1,:]
            df['EVI'] = two_dim_vi_stack[2,:]
            df['VARI'] = two_dim_vi_stack[3,:]
            df['MSAVI'] = two_dim_vi_stack[4,:]
            df['GLI'] = two_dim_vi_stack[5,:]
            return df

        df = generate_vi_df(vi_array)

        # Step 3: Load the XGBoost model

        # Able to pickle a dict of configs (xgb.get_config()), so XGBoost *is* being imported. 22 Nov 2023.
        # test = xgb.get_config()

        #(The JSON will load independently, so the path is correct. 22 Nov 2023.)
        model_fn = 'TreeCAP_Full_US_NoOrgSpecBandsOREco_RDVI_OSAVI_EVI_VARI_MSAVI_GLI.json'
        model_filename = os.path.join(xgb_model_directory, model_fn)

        ## Unclear if model is actually loading. 22 Nov 2023
        bst = xgb.Booster(model_file=model_filename)  

        ## This should provide a dict output (this works in an ArcGIS Pro notebook) to test if the model is loading,
        ## but pickle fails to save the output. No file is created at all. 22 Nov 2023
        output = bst.get_fscore()

        ########## save a pickle ##########
        fname = 'pixelBlocks.p'
        pickle_filename = os.path.join(debug_logs_directory, fname)
        # Use a context manager so the file handle is flushed and closed
        with open(pickle_filename, "wb") as f:
            pickle.dump(output, f)
        ########## pickle saved ##########
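
Because no pickle file is written at all, one possibility is that an exception is raised at or before the model load and never surfaces in ArcGIS Pro. A minimal sketch of one way to check this, assuming a hypothetical helper `run_and_log` and a writable log path (neither is part of the original function or of the raster-functions API): wrap each suspect step so that the full traceback lands in a text file.

```python
import os
import tempfile
import traceback


def run_and_log(step_name, func, log_path):
    """Call func(); on failure append the traceback to log_path and return None.

    Hypothetical debugging helper, not part of the raster-functions API.
    """
    try:
        result = func()
    except Exception:
        with open(log_path, "a", encoding="utf-8") as log:
            log.write(f"--- {step_name} failed ---\n")
            log.write(traceback.format_exc())
        return None
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"--- {step_name} ok ---\n")
    return result


# Stand-in for a failing model load, so the sketch runs without xgboost.
def failing_load():
    raise FileNotFoundError("model file not found")


demo_log = os.path.join(tempfile.mkdtemp(), "debug_log.txt")
demo_result = run_and_log("load model", failing_load, demo_log)

# Inside updatePixels the call might look like this (assumption):
# bst = run_and_log("load model",
#                   lambda: xgb.Booster(model_file=model_filename),
#                   os.path.join(debug_logs_directory, "debug_log.txt"))
```

If the log contains a traceback, it names the failing call and the error type. If it records the load as ok and nothing after it, the problem lies in a later step.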

DeniseGIS commented 10 months ago

A federal NORUS customer submitted Secure Support Case 03497602 to report an issue with a custom raster function: it imports a pretrained xgboost machine-learning model, and the model won't load for them.

The user has written a custom raster function that provides an empty output. When the user runs the same script as a Notebook, the script runs as we would expect and provides an output. The problem only occurs when the script is run as a raster function. The user has checked the script using pickle, and it seems that the script fails when importing the xgboost model. The primary source that the user used to create the raster function is from GitHub: https://github.com/Esri/raster-functions/wiki/PythonRasterFunction#anatomy-of-a-python-raster-function
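
Since the same script works in a Notebook but not as a raster function, one thing worth comparing is which Python interpreter and package versions each context actually uses. A minimal sketch, assuming a hypothetical helper `describe_environment` (the package names listed are illustrative, not taken from the case): run it in both places and compare the two reports.

```python
import json
import sys
from importlib import metadata


def describe_environment(packages=("xgboost", "numpy", "pandas")):
    """Return interpreter details and installed versions of the given packages.

    Hypothetical helper for comparing a Notebook with a raster function.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "executable": sys.executable,
        "python_version": sys.version,
        "packages": versions,
    }


env_report = describe_environment()
report_text = json.dumps(env_report, indent=2)
print(report_text)
```

Inside the raster function, `report_text` would need to be written to a file in a debug directory instead of printed, since console output may not be visible there. A difference in the xgboost version between the two reports would be a plausible lead, though not proof of the cause.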

Support troubleshooting discovered the function was obtained from https://github.com/Esri/raster-functions/wiki/PythonRasterFunction#anatomy-of-a-python-raster-function and directed the user to submit the issue at https://github.com/Esri/raster-functions, then discovered the customer had already logged an issue: https://github.com/Esri/raster-functions/issues/84

Support has reproduced this issue, yet the GitHub licensing section clearly states: "Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License." Support cannot assist the customer further. Support analyst: Taylor Jones

gbrunner commented 7 months ago

@sarahwegmuellerUSFS Did you ever get this working?

sarahwegmuellerUSFS commented 7 months ago

Hello Greg,

No, I did not. Sorry for the delay. I was in the field when this email came in and I'm still digging myself out of the mound of emails that came in during the week I was away.

Thank you, Sarah
