Rob174 / detection_nappe_hydrocarbures_IMT_cefrem


Oil detection with neural networks

This work was carried out as part of my internship at the IMT / CEFREM. The state of the project at the end of the internship is available on the endInternship branch. To improve it further, I have decided to continue developing new code on the dev branch.

Summary

1. Data

1.1 Network input data

1.2 Labels

1.3 Notes

2. Objectives

2.1 General objectives

2.2 Image patches classification

2.3 Image segmentation

3. Development environment

3.1 Python packages used

4. Meetings

1. Data

1.1 Network input data

The rasterio package lets the user open the raster files. Each image consists of 3 files: .hdr, .img and .bsd:

.img: the image written in a binary format.

.hdr: the header containing the metadata of the image (documentation here)
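As a rough illustration of what such a header holds, the sketch below parses one in plain Python. It assumes the .hdr files follow the ENVI text layout (`key = value` lines, with brace-delimited values that may span several lines). The function name `parse_envi_header` and the sample header text are made up for the example, and this is not the project's loading code, which relies on rasterio.

```python
def parse_envi_header(text):
    """Parse ENVI-style header text into a dict of strings (assumed layout)."""
    fields = {}
    key = None   # key of the brace-delimited value being accumulated
    parts = []   # pieces of that multi-line value
    for line in text.splitlines():
        line = line.strip()
        if key is not None:
            # Inside a { ... } value: keep collecting until the closing brace.
            parts.append(line.rstrip("}").strip())
            if line.endswith("}"):
                fields[key] = " ".join(p for p in parts if p)
                key = None
            continue
        if "=" not in line:
            continue  # skips the leading "ENVI" marker and blank lines
        name, value = (s.strip() for s in line.split("=", 1))
        name = name.lower()
        if value.startswith("{") and not value.endswith("}"):
            key, parts = name, [value.lstrip("{").strip()]
        else:
            fields[name] = value.strip("{}").strip()
    return fields


# Hypothetical header content, for illustration only.
sample = """ENVI
description = {
  Example raster }
samples = 512
lines = 256
bands = 1
data type = 4
byte order = 1
"""
header = parse_envi_header(sample)
print(header["samples"], header["lines"], header["description"])
```

All values are kept as strings here; converting `samples`, `lines` and `bands` to integers would be the natural next step before using them to interpret the binary .img file.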

The raw .tiff images were opened with SNAP (VV (vertical polarisation) channel) and preprocessed by the SNAP software with several modules:

Pipeline used at the beginning of the internship (see the code in the folder for the preprocessing steps actually applied): [image]

We plan to reproduce it as a Python pipeline (not done for the moment) so that we can change the preprocessing, for example by removing one step to measure its influence on the final precision of the AI.

1.2 Labels

There are 4,000 manually segmented images.

To mark which regions of an image belong to which category, CEFREM members drew polygons on the image with QGIS and indicated the category of the pixels inside each polygon. The polygon vertices are available in pixel coordinates or GPS coordinates in the .shp file, and the category of each polygon is stored in the .dbf file.

These files can be opened in Python with the geopandas package.

It is therefore necessary to determine in Python, from the polygons, which category each pixel belongs to.
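One way to do this is to test, for every pixel, whether its center falls inside a polygon. The sketch below uses a plain-Python even-odd (ray casting) test so that it runs without dependencies. The project itself lists pillow for drawing polygons, so this is a stand-in and not the repository's implementation. The function names, the class ids and the toy polygon are assumptions made for the example.

```python
def point_in_polygon(x, y, vertices):
    """Even-odd rule: count crossings of a horizontal ray starting at (x, y)."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Does the edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(polygons, width, height, background=0):
    """Build a height x width grid of class ids from (class_id, vertices) pairs.

    Vertices are in pixel coordinates; a pixel takes the class of the last
    polygon containing its center.
    """
    mask = [[background] * width for _ in range(height)]
    for class_id, vertices in polygons:
        for row in range(height):
            for col in range(width):
                if point_in_polygon(col + 0.5, row + 0.5, vertices):
                    mask[row][col] = class_id
    return mask


# Toy example: one square polygon of hypothetical class 1 on a 6 x 4 image.
square = [(1, 1), (4, 1), (4, 3), (1, 3)]
mask = rasterize([(1, square)], width=6, height=4)
for row in mask:
    print(row)
```

Testing every pixel against every polygon is slow on full-size rasters; restricting the loops to each polygon's bounding box, or drawing the polygons with an image library, would be the practical choice.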

In the future we may use another shapefile to distinguish land from sea. It might also be possible to annotate boats using data from the marinetraffic website (although there are API quotas) or AIS data.

In the future, it will be possible to compute the wind field of the raster image with SNAP (cf Radar > SAR Applications > Ocean Applications > Wind Field Estimation). A speckle filter or a moving average can be applied beforehand to smooth the output. Small objects can also be filtered out later.
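For illustration, a moving average over a 2D grid can be sketched as below. This is a plain-Python box filter written for readability, not SNAP's speckle filter and not code from the repository; the function name `moving_average` and the window radius are assumptions.

```python
def moving_average(grid, radius=1):
    """Replace each value by the mean of its (2*radius+1)^2 neighbourhood.

    Windows are clipped at the image borders, so edge pixels average over
    fewer neighbours.
    """
    height, width = len(grid), len(grid[0])
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            r0, r1 = max(0, r - radius), min(height, r + radius + 1)
            c0, c1 = max(0, c - radius), min(width, c + radius + 1)
            window = [grid[i][j] for i in range(r0, r1) for j in range(c0, c1)]
            out[r][c] = sum(window) / len(window)
    return out


# A single bright pixel (speckle-like outlier) is spread over its neighbours.
noisy = [
    [0, 0, 0],
    [0, 9, 0],
    [0, 0, 0],
]
smoothed = moving_average(noisy, radius=1)
print(smoothed[1][1])
```

A larger radius smooths more strongly but also blurs the edges of real structures, so the window size is a trade-off to tune on actual images.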

Land labels can be added from premade maps. The maps only need to be adjusted by adding a margin of 2000 to 5000 meters. This has been done directly in QGIS.

1.3 Notes

2. Objectives

2.1 General objectives

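In the initial state of the project, there are 3 possible categories (classes). According to the meeting notes in section 4, they are natural oil slicks (seep), man-made oil slicks (spill), and neither.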

2.2 Image patches classification

We take the image and split it into smaller regions (called patches). The network then has to tell which classes are present in each patch. As several classes may appear in the same patch, we predict, for each class, the probability that it is present in the patch. The output is thus a vector with one probability per class.
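A minimal sketch of this idea: cut a label mask into a regular grid of square patches and build, for each patch, a vector saying which classes are present. These presence vectors are the kind of target the per-class probabilities would be compared against. The regular non-overlapping grid, the patch size and the class order (0, 1, 2) are assumptions for the example; the actual patch-creation method is still open.

```python
def split_into_patches(grid, patch_size):
    """Yield (row, col, patch) for non-overlapping square patches.

    Incomplete patches at the right and bottom borders are dropped.
    """
    height, width = len(grid), len(grid[0])
    for r in range(0, height - patch_size + 1, patch_size):
        for c in range(0, width - patch_size + 1, patch_size):
            patch = [row[c:c + patch_size] for row in grid[r:r + patch_size]]
            yield r, c, patch


def presence_vector(patch, num_classes):
    """Return a 0/1 list: entry k is 1 if class k appears in the patch."""
    present = {value for row in patch for value in row}
    return [1 if k in present else 0 for k in range(num_classes)]


# Toy 4 x 4 label mask with three hypothetical class ids (0, 1, 2).
labels = [
    [0, 0, 1, 1],
    [0, 0, 1, 2],
    [0, 0, 0, 0],
    [2, 0, 0, 0],
]
targets = {
    (r, c): presence_vector(patch, num_classes=3)
    for r, c, patch in split_into_patches(labels, patch_size=2)
}
for position, vector in targets.items():
    print(position, vector)
```

Because a patch can contain several classes at once, the target is a multi-hot vector and not a single class index, which matches predicting one probability per class.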

The method used to create the patches remains to be determined. **#TODO**

### 2.3 Image segmentation

This time, we take the image and want the network to output the category of each pixel. To reach this goal we will start with a premade network, available at [this link](https://github.com/bonlime/keras-deeplab-v3-plus)

## 3. Development environment

### 3.1 Python packages used

Python 3.7: mandatory for Windows users, as it allows rasterio to be used.

|Package|Usage|
|:---:|:---:|
|Pytorch (torch)|Neural network|
|rasterio and GDAL|To read raster files (cf [#3](https://github.com/Rob174/detection_nappe_hydrocarbures_inria_cefrem/issues/3) to install)|
|geopandas|To read shapefiles (cf [#5](https://github.com/Rob174/detection_nappe_hydrocarbures_inria_cefrem/issues/5))|
|dbf|To open the annotation files *.dbf|
|pillow|To draw the polygon on the image|
|rich|For better logs|

### 3.2 Other software

- QGIS: to visualize polygons and rasters
- SNAP: to preprocess the images

## 4. Meetings

- Thursday: Zoom meeting
- Tuesday 01/06 at CEFREM

List of questions:

1. Sample script to open the rasters -> cf stackoverflow
2. Sample .hdr image file to know what each label corresponds to -> ok
3. Confirmation: annotation = polygons that will eventually make it possible to determine, for each pixel of the image, whether it belongs to a given category -> ok
4. What are the possible categories (in particular the 2 types of oil discharge)? -> oil slicks (seep (natural), spill (man-made)), or none
5. Which files contain the annotations? -> shp and possibly shx, but to be checked against what Python needs
6. Any leads on how to open them?
7. Can one region of an image have several different annotations? -> no
8. Corrupted annotation files were mentioned: is this fixed now, and which files should be used? -> all good; on the order of a terabyte of data
9. Confirmation of the schedule: 1. Patch classification; 2. Segmentation of full images -> ok

Cross-cutting question to answer: what is the purpose of ReLU and of activation functions?

### Thursday 03/06

DONE:

- raster opening script
- retrieval of the resolution in m/px
- retrieval of the annotations and creation of an image with the annotations overlaid on the original image

Questions:

- would it be possible to start the transfer today? -> ok
- this will let me see how the files are organized -> ok

### Thursday 03/06 (2)

Questions

- GitHub username of Mr Risser, to add him to the repo

Discussed

- the question of splitting into patches without cutting the oil slicks
  - statistics needed
  - or tests with the AI

### Tuesday 08/06

Questions

- Grid size ok? cf issue [#11](https://github.com/Rob174/detection_nappe_hydrocarbures_inria_cefrem/issues/11)
- Opinion on the solution for visualizing the results? notebook + simple: do not overcomplicate

### Tuesday 15/06 14:00

- calibration error? [#22](https://github.com/Rob174/detection_nappe_hydrocarbures_inria_cefrem/issues/22)
- confirmation: should the validity-checking algorithm be applied to the global shapefile?
- categorical crossentropy formula (cf LossFactory)

### Wednesday 23/06 10:40

- need to reply to Mr Gondet?
- results so far

### Tuesday 20/07 14:00

- need to maintain the step-by-step version of the algorithm?