adebowaledaniel opened 1 year ago
@cnakalembe, can you confirm the start month (February or September)?
@ivanzvonkov, there are missing predictions in the Zambia cropland mask (see here), despite running the inference multiple times, which produced the same number of predictions each time (see the screenshot below).
I think this is most likely due to NaN values in some tifs; you can go ahead and merge. The logs can be investigated to find out why this is happening.
I already merged; see the link I provided above (https://code.earthengine.google.com/a9522bd391a18cd98268994b6bffe317?hideCode=true)
The error messages vary; one is a request timeout, and the other is not specific.
Some of the errors I am seeing look like this
This means there's a NaN value in the tif. This was fixed in a newer version of OpenMapFlow (https://github.com/nasaharvest/openmapflow/pull/109). The fix will be included if you install the new version manually before deployment:
```shell
pip install openmapflow==0.2.1rc2
export OPENMAPFLOW_MODELS="..."
openmapflow deploy
```
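As a quick way to confirm the NaN diagnosis on a suspect tif, one can scan each band (read into a NumPy array, e.g. via rasterio's `dataset.read()`) for NaNs. This is a hypothetical standalone helper, not part of OpenMapFlow:

```python
import numpy as np

def nan_report(stack):
    """Return {band_index: nan_count} for bands containing any NaN.

    stack: array of shape (bands, height, width), e.g. the output of
    rasterio's `dataset.read()` for a multi-band tif.
    """
    return {
        i: int(np.isnan(band).sum())
        for i, band in enumerate(stack)
        if np.isnan(band).any()
    }
```

Any non-empty report means inference on that tif would hit the same NaN error seen in the logs.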
The new version can now be deployed (once it's on master) by running the GitHub Action manually: https://github.com/nasaharvest/crop-mask/actions/workflows/deploy.yml
Blocked by #230
The above is merged; we can now retrain with missing values in the training data.
Here is the error I got while trying to retrain the model @ivanzvonkov.
Nit: I think this warning might be of concern later.
@adebowaledaniel re: arrow error it happens because of this line https://github.com/nasaharvest/crop-mask/blob/7f5f809149d49aea9458e13d5ee88d3ad3b3484b/src/models/data.py#L95
Do you see the bug? 🐛
Consider creating small map and checking visually
Still the same problem @ivanzvonkov
Map quality is poor due to large- and small-scale blockiness and blatantly wrong predictions.
@adebowaledaniel to investigate a few things to debug:
The three error codes in the logs:
Potential solution: increasing the memory limit and reducing the requests per container.
Ivan to send Zambia data if he has it. Adebowale to create a PR for the error analysis notebook update.
From the error analysis, much of the error is due to shrubs, points with dense vegetation, and fallow fields being predicted as cropland. Also, a point on a rooftop is predicted as a cropland pixel; checking the prediction map, I noticed this pattern predominantly affects the city of Lusaka (an urban settlement), almost all of which is predicted as cropland.
NDVI vs other indices exploration, potentially training on higher quality data
Training on higher-quality data: there is no improvement in the new model, @hannah-rae. New model: https://github.com/nasaharvest/crop-mask/blob/853344e17a9e16054a9531840e9752b9b4a1ca00/data/models.json#L310-L312 Previous model: https://github.com/nasaharvest/crop-mask/blob/9171bd68b8d86c7acfb2c245b864539b419639fd/data/models.json#L242-L244
Next step for week of April 17 @adebowaledaniel : evaluate quality of CEO Zambia data ([using rubric](https://docs.google.com/spreadsheets/d/1BYOrMkmryjngGApKIYZ0YXzJkFzyc2DM3hydYVT_vh0/edit#gid=0))
Next step for week of May 22 @adebowaledaniel : apply post-classification NDVI filtering (using method designed by @bhyeh ) and evaluate error rates and types
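@bhyeh's exact method isn't spelled out here, but a common form of post-classification NDVI filtering drops cropland predictions whose NDVI never reaches a peak value over the season. A minimal sketch (the threshold and function names are illustrative, not the actual implementation):

```python
import numpy as np

def ndvi(nir, red):
    # NDVI = (NIR - Red) / (NIR + Red); defined as 0 where both bands are 0
    denom = nir + red
    safe = np.where(denom == 0, 1.0, denom)
    return np.where(denom == 0, 0.0, (nir - red) / safe)

def ndvi_filter(cropland, nir_series, red_series, peak_threshold=0.4):
    # cropland: boolean prediction mask, shape (H, W)
    # nir_series / red_series: reflectance time series, shape (T, H, W)
    # Keep only cropland pixels whose NDVI reaches peak_threshold at
    # some point in the season (peak_threshold=0.4 is illustrative).
    peak = ndvi(np.asarray(nir_series), np.asarray(red_series)).max(axis=0)
    return cropland & (peak >= peak_threshold)
```

This targets exactly the error modes noted above: shrubs, fallow fields, and rooftops with low seasonal NDVI peaks get removed from the cropland class.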
During the operational meeting @bhyeh mentioned that @adebowaledaniel noted that since the Zambia_CEO_2019 dataset had a 0/0.5/0.5 train/val/test split the model may not have been trained with any points from Zambia. We usually set the CEO datasets with this ratio when we assume/know there are local samples in the other datasets (e.g. a ground-based dataset independent from the CEO dataset), but in the case of Zambia it's very possible there were little to no points in the other datasets. (One could check in GeoWiki how many are in Zambia, but this is probably the only dataset that has Zambia points). @adebowaledaniel can you post your updated results/plan based on your 0.6/0.2/0.2 split here?
Thank you, @hannah-rae. As you mentioned, GeoWiki is the only dataset with Zambia data; its training subset has 336 sample points (positive class: 5.6%). Here is the result for the 0.6/0.2/0.2 split: https://github.com/nasaharvest/crop-mask/blob/1785602602d1260edb53a13a608e9ee84c5d6f8d/data/models.json#L325-L341
I applied the post-classification NDVI filtering method by Ben on a subset produced by the model; the output is here.
June 5 - Check for cloud presence in the tif files
@hannah-rae, contrary to our expectation of cloud presence, the Sentinel-1 bands were absent in those oddly-shaped regions on the map. I shared my observations in this slide and also included a notebook (link in the slide) in case you want to reproduce what I did.
Very interesting... was that not captured in the logs at all? Maybe we should add a test when the data are exported to check that none of the bands are missing data.
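Such an export-time test could be as simple as flagging any band that contains no data at all in the exported stack. A hypothetical sketch (the band names are illustrative, not the project's actual schema):

```python
import numpy as np

def empty_bands(stack, band_names):
    """Flag bands with no usable data in an exported stack.

    stack: array of shape (bands, height, width);
    band_names: matching names, e.g. ["S1_VV", "S1_VH", "B2"]
    (names here are illustrative). A band counts as empty if
    every pixel in it is NaN.
    """
    return [
        name
        for name, band in zip(band_names, stack)
        if np.isnan(band).all()
    ]
```

Running a check like this right after export would have surfaced the missing Sentinel-1 bands instead of them showing up as odd shapes on the map.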
For now, perhaps it makes sense to train a new model without S1?
Here are the outputs of the new model trained without S1: map(as expected, it's without the weird features) and metrics. Let me know your observation of the subset map generated. Should we continue with this model for the entire country?
I will create an issue regarding the missing S1 bands; also, check the eo export log for any clues.
@hannah-rae Crop Mask + Postclassification processing: here
@adebowaledaniel can you make the assets public?
Done @hannah-rae
Loading is crashing for me. @cnakalembe to try loading and will do expert sign-off
I reviewed the map; I think the next step is manual cleanup, removing obvious features like roads (I've seen some mines too). We could develop some clear guidance for this, and I think Diana can do it in QGIS/ArcGIS.
@hannah-rae will make GEE script in repo to export ensemble map for Zambia (and other future countries)
update: should be addressed by notebook/GEE app created by @ivanzvonkov in #315
@ivanzvonkov will make this map and update intercomparison re #346
After running the intercomparison on Zambia with the full evaluation set (validation and test), the ensemble map ties with the GLAD map.
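For reference, the "tie" here is in terms of agreement with the labeled evaluation points; the core of such an intercomparison reduces to per-map point accuracy. A minimal sketch (function and variable names are hypothetical, not the repo's actual intercomparison code):

```python
import numpy as np

def point_accuracy(map_values, labels):
    # map_values: map's predicted class (0/1) sampled at evaluation points
    # labels: CEO-labeled class (0/1) at the same points
    return float(np.mean(np.asarray(map_values) == np.asarray(labels)))
```

Computing `point_accuracy` for the ensemble map and for GLAD at the same evaluation points, and getting (near-)equal values, is what "ties" means above.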
There are also not that many points to begin with, because many of them were sampled outside Zambia's boundaries (old CEO project). Should we proceed with just exporting the GLAD map? @hannah-rae
Next step for @cnakalembe to check if the GLAD map looks ok and is ok for use case, or if there is some reason to export the ensemble map instead.
GLAD map is okay for the use case!
Next step: @ivanzvonkov run the export code for GLAD map
Shared exported map on slack
Start year: 2019 Start month: November