Tune hyperparameters to improve model predictions

natesawant / tree-biomass

Tune hyperparameters to improve model predictions #6

Open sophiahe671 opened 1 month ago

sjkkolasa commented 1 month ago

Looking at DeepForest parameters such as n_estimators and max_layers: https://deep-forest.readthedocs.io/en/stable/parameters_tunning.html

sjkkolasa commented 1 month ago

Now playing with patch size: https://deepforest.readthedocs.io/en/latest/better.html

sjkkolasa commented 1 month ago

I can't find a good way to run a pretrained deepforest model with specified parameters. The predict_tile() function is giving me errors. I also can't figure out where to specify n_estimators.

sjkkolasa commented 1 month ago

When I run predict_tile, I get this error:

RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

        To fix this issue, refer to the "Safe importing of main module"
        section in https://docs.python.org/3/library/multiprocessing.html

But it works for Alex, and it seems to work a lot better than predict_image().
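For reference, the idiom the error message asks for looks roughly like this. It is only a sketch: it assumes the deepforest 1.x API (`main.deepforest()`, `use_release()`, `predict_tile(raster_path=...)`) and assumes the config has a `workers` entry, so check both against the installed version. The script name in the usage line is made up.

```python
import sys


def run_prediction(raster_path):
    """Load the pretrained model and predict on one raster."""
    # Imported inside the function so that importing this file stays cheap.
    from deepforest import main as deepforest_main

    model = deepforest_main.deepforest()
    model.use_release()  # assumed 1.x call that loads the prebuilt tree model
    # Assumption: setting "workers" to 0 keeps data loading in the main
    # process, so no child processes need to be started.
    model.config["workers"] = 0
    return model.predict_tile(raster_path=raster_path)


# Nothing above runs at import time. Child processes that re-import this
# file therefore skip the prediction, which is what the error asks for.
if __name__ == "__main__":
    if len(sys.argv) > 1:
        print(run_prediction(sys.argv[1]))
    else:
        print("usage: python predict.py path/to/raster.tif")
```

This could explain why it works on some machines and not others: Windows and recent macOS start child processes with spawn rather than fork, and spawn re-imports the main script.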

sjkkolasa commented 1 month ago

I can edit nms_thresh in the deepforest.config yml file. A higher value lets the predicted tree areas overlap more before one of them is dropped.

On the left, nms_thresh = 0.5. On the right, nms_thresh = 0.05 (default).

I don't think editing nms_thresh will help for this image.
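To make the effect of nms_thresh concrete, here is a small pure-Python sketch of greedy non-maximum suppression. It is a simplified illustration, not deepforest's own implementation, and the boxes and scores are made up.

```python
def iou(a, b):
    """Intersection over union of two boxes (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, thresh):
    """Greedy NMS: visit boxes by descending score and keep a box only if
    its IoU with every already-kept box is at most thresh."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= thresh for k in keep):
            keep.append(i)
    return keep


# Two made-up boxes that overlap with IoU = 1/3.
boxes = [(0, 0, 10, 10), (5, 0, 15, 10)]
scores = [0.9, 0.8]
print(nms(boxes, scores, thresh=0.05))  # the lower-scoring box is suppressed
print(nms(boxes, scores, thresh=0.5))   # both boxes are kept
```

So a low threshold like 0.05 removes almost any box that touches a higher-scoring one, while 0.5 lets neighbouring crowns overlap quite a bit.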

sjkkolasa commented 1 month ago

predict_tile() works on some of our computers and it is much better than predict_image()! We think this is because the .tif file is very detailed and very large.

With predict_image(), we couldn't even run the program on the .tif file, just the .png file. So this is much better.

The parameters we can edit now are:

patch size
patch overlap
iou threshold
sigma
and probably more
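One way to keep these settings in one place while testing is a small settings object that is passed straight to predict_tile(). This is a sketch: the keyword names (`patch_size`, `patch_overlap`, `iou_threshold`, `sigma`) are assumed to match the predict_tile() signature in the installed deepforest version, the default values are placeholders, and `TileSettings` and `predict_with` are hypothetical helpers, not part of deepforest.

```python
from dataclasses import asdict, dataclass


@dataclass
class TileSettings:
    """Hypothetical container for the predict_tile() options listed above."""
    patch_size: int = 400        # placeholder value, in pixels
    patch_overlap: float = 0.05  # fraction of a patch shared with its neighbour
    iou_threshold: float = 0.15  # overlap above which duplicate boxes are merged
    sigma: float = 0.5           # placeholder value

    def __post_init__(self):
        if self.patch_size <= 0:
            raise ValueError("patch_size must be positive")
        if not 0 <= self.patch_overlap < 1:
            raise ValueError("patch_overlap must be in [0, 1)")
        if not 0 <= self.iou_threshold <= 1:
            raise ValueError("iou_threshold must be in [0, 1]")


def predict_with(model, raster_path, settings):
    """Call model.predict_tile() with every setting as a keyword argument."""
    return model.predict_tile(raster_path=raster_path, **asdict(settings))


# Usage, assuming `model` is an already loaded deepforest model:
# boxes = predict_with(model, "path/to/raster.tif", TileSettings(patch_size=100))
```

Catching a bad value before the run starts matters here, since a single run can take hours.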

sjkkolasa commented 1 month ago

Here is the result with patch size 30. It took about 7 hours to run. All the 'trees' it found were a lot smaller than the actual trees. I think the patch size might be so small that it doesn't even cover one whole tree. Maybe we should start testing with patch size 100?

sjkkolasa commented 1 month ago

Patch size 100 also seems too small. It took 30 minutes to run.
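The run times above are about what you would expect from the number of patches alone. Here is a rough estimate of how many windows a tiled prediction has to process, assuming a plain sliding window. deepforest's own windowing may treat the image edges differently, and the raster size below is hypothetical because the real dimensions are not given in this thread.

```python
import math


def windows_per_axis(length, patch_size, patch_overlap):
    """Number of sliding-window positions along one image axis."""
    stride = max(1, round(patch_size * (1 - patch_overlap)))
    return math.ceil(max(0, length - patch_size) / stride) + 1


def count_patches(width, height, patch_size, patch_overlap=0.0):
    """Approximate number of patches needed to cover the whole raster."""
    return (windows_per_axis(width, patch_size, patch_overlap)
            * windows_per_axis(height, patch_size, patch_overlap))


# Hypothetical raster size; substitute the real dimensions of the .tif.
width, height = 10_000, 10_000
for size in (30, 100, 400):
    print(size, count_patches(width, height, size))
```

The patch count scales with 1 / patch_size², so going from 100 to 30 means roughly (100 / 30)² ≈ 11 times as many patches. That is in the same range as the jump from 30 minutes to about 7 hours. By the same reasoning, larger patch sizes should run much faster.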