Cropland: Namibia North 2020

ivanzvonkov commented 1 year ago

Start year: 2020 Start month: September

[x] Labeling project created (#138)
[x] Labeling completed
[x] Data added to repository (#178)
[x] Model trained (#290)
[x] Generate stratified samples
[x] Set 1 Labeling: @MsPixels @hannah-rae @bhyeh Aninda @cnakalembe Isha Taryn Snehal
[x] Set 2 Labeling: @sbaber1 @Di-anaBF @Mirzyaaliii @mnthnx64
[x] Map made

ivanzvonkov commented 1 year ago

Low cropland amount, low metrics: https://github.com/nasaharvest/crop-mask/blob/0b9c305ae71994b8d938ee34a77a295565ff7435/data/models.json#L19

Could be improved by NDVI stratification

ivanzvonkov commented 1 year ago

Next step adding WFP data

ivanzvonkov commented 1 year ago

@MsPixels now that you have some corrective points, the next step is to add them to the code base! Here's an example of me adding the Namibia WFP data: https://github.com/nasaharvest/crop-mask/pull/227 (the main change is to datasets.py, which you will have to update too). More info here: https://github.com/nasaharvest/crop-mask#adding-new-labeled-data

ivanzvonkov commented 1 year ago

Namibia North map v2

Link

Major improvement over the previous map. It overpredicts crop in some regions but is probably ready for dissemination.

What changed:

Added WFP data
Moved 20% of CEO data into the training dataset (validation and test each reduced from 50% to 40%)

Test metrics:

"accuracy": 0.9685,
"f1_score": 0.125,
"precision_score": 0.0909,
"recall_score": 0.2,
"roc_auc_score": 0.8155

Compute Cost: Cloud Run: $109.84 Cloud Functions: $5.69 Total: $115.13

ivanzvonkov commented 1 year ago

The compute cost above also covers Zambia v2, so the cost for this map alone is lower.

MsPixels commented 1 year ago

Yeah, @ivanzvonkov, major improvement indeed. Just wondering why the f1 score is this low despite this result.
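One likely reason, given the "low cropland amount" noted earlier: when crop is a tiny fraction of the test set, accuracy is dominated by true negatives, while F1 only looks at the crop class. Here is a minimal sketch of that effect. The confusion-matrix counts are illustrative values chosen to be consistent with the reported v2 test metrics; the actual counts are not given in this thread.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall and F1 for the positive (crop) class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision_score": precision,
        "recall_score": recall,
        "f1_score": f1,
    }


# Illustrative counts (an assumption, not taken from the repository):
# only 5 crop points out of 445, and 1 of them is predicted correctly.
metrics = binary_metrics(tp=1, fp=10, fn=4, tn=430)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With only a handful of crop points in the test set, a single extra true positive or false positive moves precision, recall and F1 a lot, so these scores are also very noisy.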

ivanzvonkov commented 1 year ago

Additional data has been added in #243, so a new model can now be trained

ivanzvonkov commented 1 year ago

Model being trained https://github.com/nasaharvest/crop-mask/pull/250

hannah-rae commented 1 year ago

From Christina in meeting 1/19:

1500 crop type points with crop type and field boundary for season starting early 2023, use start month September still (collected Dec/Jan 2022-2023). Blake and Abena are cleaning that now.
Goal: make in-season cropland map starting September 2022 through January 2023 which includes the 1500 field labels from above ( @MsPixels can create this?)
Will be doing 2 more rounds of data collection in February (early season) and April/May (harvest stage) which can be used to evaluate the in-season map
Re: making more representative samples using information that we have from FAO etc, Christina suggested also looking at sampling to match proportions of production historically in admin1 zones (e.g. so we sample more from zones producing more); there is also historical information in Crop Monitors that we could consider using
@MsPixels can also look into making a map for WFP for cropland in 2021-2022 after the in-season map if time
Project will be ending officially in August but will hopefully be extended, can have our "final map" by that time.
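The suggestion above to sample in proportion to historical admin1 production could be sketched roughly as follows. The zone names and production figures are placeholders, not real Namibia statistics, and `allocate_samples` is a hypothetical helper rather than anything in the crop-mask repo. It uses largest-remainder rounding so the per-zone counts always add up to the requested total.

```python
def allocate_samples(production_by_zone: dict, total_samples: int) -> dict:
    """Split total_samples across zones in proportion to production."""
    total_production = sum(production_by_zone.values())
    if total_production <= 0:
        raise ValueError("total production must be positive")

    # Exact (fractional) share of samples for each zone.
    exact = {
        zone: total_samples * production / total_production
        for zone, production in production_by_zone.items()
    }
    # Round down first, then hand out the leftover samples to the zones
    # with the largest fractional remainders.
    allocation = {zone: int(share) for zone, share in exact.items()}
    leftover = total_samples - sum(allocation.values())
    by_remainder = sorted(
        exact, key=lambda zone: exact[zone] - allocation[zone], reverse=True
    )
    for zone in by_remainder[:leftover]:
        allocation[zone] += 1
    return allocation


# Placeholder zones and production values (assumptions for illustration only).
example_production = {"zone_a": 3.0, "zone_b": 2.0, "zone_c": 1.0}
example_allocation = allocate_samples(example_production, total_samples=10)
print(example_allocation)
```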

MsPixels commented 1 year ago

Yes, @hannah-rae, I can create the in-season cropland map and hopefully the WFP map

MsPixels commented 1 year ago

Model being trained #250

Namibia_North_2020_v2

    "test_metrics": {
        "accuracy": 0.9324,
        "f1_score": 0.0625,
        "precision_score": 0.037,
        "recall_score": 0.2,
        "roc_auc_score": 0.8661
    }

MsPixels commented 1 year ago

Comparing V1 and V2 - Namibia_North

Although this model predicts more crop fields than normal, it is able to correctly predict crop fields in the big yellow squares

hannah-rae commented 1 year ago

@MsPixels @ivanzvonkov I was looking at the last comments on this issue but we don't have the "next steps" recorded for when it got picked back up. Are these the potential next steps?

[x] Re-create map with new model trained w/ corrected NDVI ( @ivanzvonkov I think you are re-training all models?)
[x] Have the ground truth data from Dec/Jan 2022-2023 already been added to the test set?
[x] I have in an earlier comment that more ground data was to be collected in February 2023. @MsPixels have you heard from Christina or Blake if these new data are available?

Other things to try:

[x] Create stratified reference sample for validation/test instead of current random uniform
[ ] Apply post-processing NDVI that @bhyeh is working on (may reduce false positives)
[ ] Re-train model with high-quality data
[ ] Data quality audit

MsPixels commented 1 year ago

Thanks Hannah for your feedback. Yes, these are potential next steps.

The ground truth data from Dec/Jan 2022 - 2023 has been added to the test set
Yes, February 2023 is being cleaned
I'll go ahead and generate a stratified sample for validation.
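For reference, a minimal sketch of what a stratified random sample could look like, assuming the strata are the predicted classes of the map (0 = non-crop, 1 = crop) and that the map is available as a NumPy array. The function name, sample sizes and toy map are hypothetical; this is not the code actually used for this issue.

```python
import numpy as np


def stratified_sample(pred_map: np.ndarray, n_per_class: int, seed: int = 0) -> dict:
    """Draw up to n_per_class random pixel locations from each predicted class."""
    rng = np.random.default_rng(seed)
    samples = {}
    for cls in np.unique(pred_map):
        rows, cols = np.nonzero(pred_map == cls)
        # A rare class may have fewer pixels than requested.
        n = min(n_per_class, rows.size)
        chosen = rng.choice(rows.size, size=n, replace=False)
        samples[int(cls)] = np.stack([rows[chosen], cols[chosen]], axis=1)
    return samples


# Toy predicted map: mostly non-crop with one small crop patch.
toy_map = np.zeros((100, 100), dtype=np.uint8)
toy_map[:5, :5] = 1
toy_samples = stratified_sample(toy_map, n_per_class=10, seed=42)
for cls, points in toy_samples.items():
    print(cls, points.shape)
```

Unlike uniform random sampling, this guarantees crop points in the reference sample even when crop covers well under 1% of the map. Area-weighted estimates of accuracy would then need to account for the unequal sampling rates per stratum.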

MsPixels commented 1 year ago

Stratified random points

ivanzvonkov commented 1 year ago

Next steps:

CEO labeling project for stratified points (specifically additional crop points)
Consider sampling crop points from field boundaries for validation set

MsPixels commented 10 months ago

Metrics for the model:

  "Namibia_North_V3": {
    "params": "https://wandb.ai/nasa-harvest/crop-mask/runs/qzfflhy8",
    "test_metrics": {
        "accuracy": 0.9635,
        "f1_score": 0.1429,
        "precision_score": 0.0833,
        "recall_score": 0.5,
        "roc_auc_score": 0.8404
    },
    "val_metrics": {
        "accuracy": 0.9744,
        "f1_score": 0.1053,
        "precision_score": 0.0556,
        "recall_score": 1.0,
        "roc_auc_score": 0.9834
    }
  }

MsPixels commented 10 months ago

Namibia_North_2020_V3 Map

MsPixels commented 10 months ago

I also adjusted the threshold (greater than 0.8) here. I think it removes some noise
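A minimal sketch of the thresholding step, assuming the model output is a per-pixel crop probability array. The 0.5 comparison value and the toy probabilities are assumptions for illustration; only the 0.8 cutoff comes from this thread.

```python
import numpy as np


def binarize(crop_probability: np.ndarray, threshold: float) -> np.ndarray:
    """Label a pixel as crop (1) only if its probability is strictly above threshold."""
    return (crop_probability > threshold).astype(np.uint8)


# Toy probability map (made-up values).
probs = np.array([[0.10, 0.55, 0.79],
                  [0.80, 0.81, 0.95]])

mask_default = binarize(probs, threshold=0.5)  # assumed baseline cutoff
mask_strict = binarize(probs, threshold=0.8)   # cutoff mentioned above

print(mask_default)
print(mask_strict)
```

Raising the threshold trades recall for precision: low-confidence pixels drop out, which removes speckle but can also remove real fields the model was unsure about.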

hannah-rae commented 10 months ago

@MsPixels initial evaluation: "I think the model does a good job at the west but over-predicts at the east, especially in the flood plains. Some built-up areas were captured as cropland. I'm not super confident about the South. Waiting for confirmation from Christina."

@MsPixels will try the post-classification filtering to see if that reduces some false positive areas (e.g. water/lake shorelines) and we will work with Christina to make final decision.

MsPixels commented 10 months ago

Post-classification filtering @hannah-rae.

MsPixels commented 10 months ago

Hello @hannah-rae, after further discussion with Christina and Blake, I generated an NDVI, water and built-up mask to remove most of the noise in the cropland map. The code is here
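For readers without access to the linked code, here is a rough sketch of how such post-classification filtering can work, assuming the cropland map, NDVI, water mask and built-up mask are co-registered arrays of the same shape. The function name and the `ndvi_min` value are hypothetical and not taken from the actual implementation.

```python
import numpy as np


def apply_masks(cropland, ndvi, water, built_up, ndvi_min=0.2):
    """Keep crop pixels only where NDVI is high enough and the pixel is
    neither water nor built-up. ndvi_min is an illustrative value."""
    keep = (ndvi >= ndvi_min) & ~water.astype(bool) & ~built_up.astype(bool)
    return np.where(keep, cropland, 0).astype(cropland.dtype)


# Toy 1-D example (made-up values): five pixels.
cropland = np.array([1, 1, 1, 1, 0], dtype=np.uint8)
ndvi = np.array([0.6, 0.1, 0.5, 0.7, 0.6])
water = np.array([False, False, True, False, False])
built_up = np.array([False, False, False, True, False])

filtered = apply_masks(cropland, ndvi, water, built_up)
print(filtered)
```

In the toy example only the first pixel survives: the second fails the NDVI test, the third is water, the fourth is built-up, and the fifth was never predicted as crop.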

nasaharvest / crop-mask

Cropland: Namibia North 2020 #218