jonathanventura / canopy

Automatic tree species classification from remote sensing data
MIT License

Extraction of images #2

Closed TOUAOUSSA-Oussama closed 3 years ago

TOUAOUSSA-Oussama commented 3 years ago

Hello, thank you for sharing your code and the dataset. However, I think there is a problem with how the 15x15 images are extracted from the full image. An image is extracted around every labeled pixel, so neighboring images are shifted by only one pixel relative to each other. When the data is then divided into training and test sets, the model is asked to predict a test image that also appears in the training set, just shifted by one pixel, so the classification results are not reliable.

jonathanventura commented 3 years ago

Hi, are you referring to this line?

    image_patch = image.read(window=((row-patch_radius,row+patch_radius+1),(col-patch_radius,col+patch_radius+1)))

Or, if not, could you point out in the code which part you are referring to?

TOUAOUSSA-Oussama commented 3 years ago

Hi, thank you for replying. No, I mean here:

    # get all labeled locations in the labels raster
    rows, cols = np.where(labels_raster!=label_ndv)

TOUAOUSSA-Oussama commented 3 years ago

In fact, the article reports only 713 labeled trees, but your extraction produces many more samples than that.

jonathanventura commented 3 years ago

I don't see the issue with the np.where() call. It will return the rows and columns of locations where the labels raster has a valid value.

Re: 713 trees -- here we are extracting a patch for each pixel inside a labeled tree. So there will be more than 713 patches extracted.
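To illustrate why the patch count exceeds the tree count, here is a minimal sketch on a toy raster. It is not the repository's code: the raster contents and the choice of `0` as `label_ndv` are assumptions made for the example. Only the `np.where()` call mirrors the line quoted above.

```python
import numpy as np

# Hypothetical no-data value; the real value comes from the labels raster.
label_ndv = 0

# Toy labels raster with two labeled tree crowns (values are class labels).
labels_raster = np.zeros((6, 6), dtype=np.int32)
labels_raster[1:3, 1:3] = 1   # first crown covers 4 pixels
labels_raster[4, 3:6] = 2     # second crown covers 3 pixels

# Same call as in the thread: every labeled pixel becomes a patch center.
rows, cols = np.where(labels_raster != label_ndv)

num_patches = len(rows)
print(f"2 labeled trees, {num_patches} patch centers")
```

Each labeled pixel yields one patch center, so two crowns already produce seven patches here.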

TOUAOUSSA-Oussama commented 3 years ago

Yes, that's true, you extract a patch for each pixel. The problem is that when you split your data into training and test sets, the test set will contain images that also exist in the training set, shifted by only one pixel. So you are predicting an image that the model has effectively already seen during training.
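The size of the overlap in question can be checked with a short sketch. It uses a synthetic array instead of the real raster, and `extract_patch` is a hypothetical helper that applies the same window arithmetic as the `image.read(...)` line quoted earlier, with `patch_radius = 7` assumed so that patches are 15x15.

```python
import numpy as np

patch_radius = 7  # assumed: gives 15x15 patches


def extract_patch(image, row, col, radius):
    """Hypothetical helper: square window centered on (row, col)."""
    return image[row - radius:row + radius + 1, col - radius:col + radius + 1]


# Synthetic single-band image with unique values so pixels are identifiable.
image = np.arange(40 * 40).reshape(40, 40)

patch_a = extract_patch(image, 20, 20, patch_radius)
patch_b = extract_patch(image, 20, 21, patch_radius)  # center moved one column

# All but one column of the two patches contain identical pixels.
columns_match = np.array_equal(patch_a[:, 1:], patch_b[:, :-1])
overlap_fraction = (patch_a.shape[1] - 1) / patch_a.shape[1]

print(columns_match, round(overlap_fraction, 3))
```

Two patches whose centers are one pixel apart share 14 of their 15 columns, so the concern only matters if such neighbors can land on opposite sides of the train/test split.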

TOUAOUSSA-Oussama commented 3 years ago

And this will affect your metrics.

jonathanventura commented 3 years ago

Where in the code do you see this one-pixel shift happening?

TOUAOUSSA-Oussama commented 3 years ago

In "extract.py". You will notice it if you visualize the extracted images.

jonathanventura commented 3 years ago

When we split the data, we split by trees, not by patches. So we will not have patches from the same tree in both the train split and the test split. This is done in preprocess.py.
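A tree-level split of this kind can be sketched as follows. This is an illustration only and not the contents of preprocess.py: `split_by_tree`, the `tree_ids` array, and the 25% test fraction are all assumptions introduced for the example.

```python
import numpy as np


def split_by_tree(tree_ids, test_fraction=0.25, seed=0):
    """Hypothetical helper: assign whole trees, not single patches, to a split.

    tree_ids holds, for each patch, the id of the tree it was cut from.
    Returns (train_indices, test_indices) into the patch array.
    """
    rng = np.random.default_rng(seed)
    unique_ids = np.unique(tree_ids)
    rng.shuffle(unique_ids)
    num_test = max(1, int(round(test_fraction * len(unique_ids))))
    test_trees = unique_ids[:num_test]
    test_mask = np.isin(tree_ids, test_trees)
    return np.where(~test_mask)[0], np.where(test_mask)[0]


# Ten patches cut from four trees (toy data).
tree_ids = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, 3])
train_idx, test_idx = split_by_tree(tree_ids)

# No tree contributes patches to both splits.
shared_trees = set(tree_ids[train_idx]) & set(tree_ids[test_idx])
print(sorted(shared_trees))
```

Because the split is drawn over tree ids, one-pixel-shifted patches from the same crown always end up on the same side.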