AI4EO / tum-planet-radearth-ai4food-challenge

Starter pack for the AI4Food challenge organised by TUM/DLR, Planet and Radiant Earth

[Question] Is South Africa planet 5 day fusion data not available for labels in 34S_19E_259N? #7

Open ivanzvonkov opened 2 years ago

ivanzvonkov commented 2 years ago

I noticed that the tifs in ref_fusion_competition_south_africa_train_source_planet_5day have a bounding box that does not cover the labels in 34S_19E_259N. Is that data not provided as part of the competition?


ridvansalihkuzu commented 2 years ago

Dear @ivanzvonkov,

Thank you for reporting the issue.

In our investigation, the notebook ran without error on the tile grid 34S_19E_259N. To investigate further, may I ask a few questions:

  1. In your run, the exception is raised at line 145 of PlanetReader, but the original repository has no such code at line 145. Did you get this error before or after modifying PlanetReader?
     a. If you got it before modifying PlanetReader, can you provide the field ID of the label that causes the error in your run?
     b. If it happens after your modifications, can you please check that no bug was introduced in the earlier lines?

  2. Another possibility is that PlanetReader was interrupted while it was being initialized. You can clear the cache folder ref_fusion_competition_south_africa_train_source_planet_5daytime_series and rerun to see whether the error still happens.

  3. In our investigation, we downloaded the Planet data from RadiantEarth. Can you please let us know where you downloaded the data from, so that we can re-check all data sources and make sure the data is placed correctly everywhere?

ivanzvonkov commented 2 years ago

I think I found the bug here: https://github.com/AI4EO/tum-planet-radearth-ai4food-challenge/blob/83897c85a0b5e5a92fccfbbe230d64aafe16ceca/notebook/utils/planet_reader.py#L104

inputs = glob.glob(input_dir + '/*/*.tif', recursive=True)
tifs = sorted(inputs)
labels = gpd.read_file(label_dir)

# read coordinate system of tifs and project labels to the same coordinate reference system (crs)
with rio.open(tifs[0]) as image:
    crs = image.crs
    print('INFO: Coordinate system of the data is: {}'.format(crs))
    transform = image.transform

All the planet data is in a single folder (as downloaded from RadiantEarth). So tifs[0] is data/ref_fusion_competition_south_africa_train_source_planet_5day/ref_fusion_competition_south_africa_train_source_planet_5day_34S_19E_258N_2017_04_03/sr.tif regardless of whether the labels being passed are for 258 or 259.

As a result, transform is always derived from that first tif, which results in an incorrect window being set.

My solution was to move all the 259 tifs into a separate folder and use the new folder path as the input_dir.
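
An alternative that avoids moving files could be to filter the globbed paths by tile ID before tifs[0] is opened. The following is only a sketch under assumptions: the helper name select_tile_tifs is hypothetical and not part of planet_reader.py, and the 259N folder names are made up to follow the pattern of the 258N path quoted above. It relies on the tile ID appearing in each scene's folder name, as it does in that path.

```python
from pathlib import Path


def select_tile_tifs(tif_paths, tile_id):
    """Return the sorted paths whose parent folder name contains tile_id.

    Hypothetical helper for illustration; not part of planet_reader.py.
    """
    return sorted(p for p in tif_paths if tile_id in Path(p).parent.name)


# Folder names follow the pattern of the 258N path quoted in the thread;
# the 259N entries are invented for this example.
prefix = "data/ref_fusion_competition_south_africa_train_source_planet_5day"
name = "ref_fusion_competition_south_africa_train_source_planet_5day"
example_paths = [
    f"{prefix}/{name}_34S_19E_259N_2017_04_08/sr.tif",
    f"{prefix}/{name}_34S_19E_258N_2017_04_03/sr.tif",
    f"{prefix}/{name}_34S_19E_259N_2017_04_03/sr.tif",
]

# Only the 259N scenes remain, so the first entry belongs to the right tile.
tifs_259 = select_tile_tifs(example_paths, "34S_19E_259N")
```

In the reader, the same call could be applied to the result of glob.glob, so that crs and transform are read from a tif of the requested tile.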

ridvansalihkuzu commented 2 years ago

Dear @ivanzvonkov,

Thank you for the very helpful feedback. We will insert a flag to check the origin of the tiff image before the transformation, and fix the bug soon.
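
One way such a check could look, as a minimal sketch rather than the planned fix: compare the bounding box of the labels with that of the tif before any window is computed. The function name bounds_overlap is hypothetical and the coordinates below are made-up numbers. Both boxes are assumed to be in the same CRS and ordered (min x, min y, max x, max y), which is the order returned by rasterio's dataset.bounds and geopandas' total_bounds.

```python
def bounds_overlap(a, b):
    """Return True if two (minx, miny, maxx, maxy) boxes intersect."""
    a_minx, a_miny, a_maxx, a_maxy = a
    b_minx, b_miny, b_maxx, b_maxy = b
    return (a_minx <= b_maxx and b_minx <= a_maxx
            and a_miny <= b_maxy and b_miny <= a_maxy)


# Made-up coordinates: a tif footprint, labels inside it, labels north of it.
tif_bounds = (0.0, 0.0, 100.0, 100.0)
labels_inside = (10.0, 20.0, 30.0, 40.0)
labels_outside = (10.0, 120.0, 30.0, 140.0)

for name, label_bounds in [("inside", labels_inside), ("outside", labels_outside)]:
    if not bounds_overlap(tif_bounds, label_bounds):
        # The labels lie outside this tif, so its transform must not be used.
        print(f"WARNING: {name} labels do not overlap the tif footprint")
```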

rym-oualha commented 2 years ago

I came across another error while extracting the 259 tiles for planet fusion, and using your solution worked for me. Thanks @ivanzvonkov