allenai / satlas-super-resolution


Using Satlas Super Resolution for NDVI Purposes #39

Closed lucalevi closed 1 month ago

lucalevi commented 1 month ago

Good day Piper! It's great to see such good work published and released to the public. We are a company in Italy focusing on GIS and satellite imagery, and we came across your work. We would like to super-resolve Sentinel-2 data, most likely L2A, in order to upscale the 10 m-resolution B04 and B08 bands to 5 m or 2.5 m.

1. Usage of L2A, B04 and B08 bands

I see here that you use Sentinel-2 L1C imagery rather than L2A imagery. L2A is better suited to calculating indices like NDVI, since it provides reflectance values closer to the Earth's surface.

We are wondering whether you also trained the model on L2A imagery, and whether the model weights you provide can be used to super-resolve the B04 and B08 bands, or whether you focus only on the TCI.jp2 files with their RGB channels. It would also be useful to know whether you have case studies in agriculture and/or have experimented with a super-resolved NDVI.
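For context, the quantity we ultimately want to compute from the super-resolved bands is NDVI = (NIR - Red) / (NIR + Red); a minimal sketch, assuming the bands come as float reflectance arrays:

import numpy as np

# NDVI from red (B04) and NIR (B08) reflectance arrays; the epsilon
# guards against division by zero over water or no-data pixels.
def ndvi(b04: np.ndarray, b08: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    return (b08 - b04) / (b08 + b04 + eps)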

2. Resolution of the super-resolved images

Further, one thing is not clear to me: if I run inference on an image at 10 m resolution, which resolution am I going to get in the super-resolved image? [EDIT: "All models are trained to upsample by a factor of 4" from here, thus 10 m original -> 2.5 m super-resolved]

3. Adaptation of the .yml files to bands other than RGB

One more thing: I don't fully understand the infer_example.yml file. To run inference on a single image using the provided weights, I have to adapt the sections of infer_example.yml with the information from the .yml files listed next to the weights here. Doing so, I end up with the following configuration file, called "infer_ESRGAN_1S2.yml", which takes its configuration, namely the network structure, from the esrgan_baseline_1S2.yml file.

infer_ESRGAN_1S2.yml follows

# General Settings
name: infer_ESRGAN_1S2
model_type: SSRESRGANModel
scale: 4
num_gpu: auto
manual_seed: 0

# Inference Settings

# Root directory for low-res images you want super-resolved.
# This can be any directory structure, but the code expects PNGs.
data_dir: ssr/low_res_imgs/

n_lr_images: 1  # change to match the ESRGAN model (the weights)

save_path: ssr/resolved_imgs/

# Structure of the generator you want to use for inference
network_g:
  type: SSR_RRDBNet
  num_in_ch: 3  # number of Sentinel-2 images * 3 channels (RGB)
  num_out_ch: 3
  num_feat: 64
  num_block: 23
  num_grow_ch: 32

# Load pretrained weights into the generator that is defined above
path:
  pretrain_network_g: ssr/weights/esrgan_1S2.pth
  param_key_g: params_ema
  strict_load_g: true

Apparently, this is what is needed to run inference on a single image. But what if I want to run it on a single channel, e.g. the greyscale bands B04 and B08, rather than the 3 RGB channels typical of TCI.jp2 images? The line num_in_ch: 3 # number of Sentinel-2 images * 3 channels (RGB) suggests that I have to use RGB images. So I am wondering whether these .yml files, and more generally inference with the pre-trained weights, can be used on single-channel greyscale bands such as B04 and B08; one naive workaround is sketched below.
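For illustration, here is a minimal, untested sketch of that naive workaround: replicating a single band across three channels so the input matches the pretrained RGB shape (the file names are placeholders):

import numpy as np
from PIL import Image

# Untested assumption: feed a single band (here B04) to the pretrained
# RGB model by replicating it across the three input channels. Whether
# the output would be radiometrically meaningful is exactly my question.
band = np.array(Image.open("B04_tile.png").convert("L"))  # (H, W) uint8
rgb_like = np.stack([band, band, band], axis=-1)          # (H, W, 3)
Image.fromarray(rgb_like).save("B04_tile_rgb.png")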

4. Shapes of the images, conversion from .jp2 to PNG

I also have a question about the "shape" of the images; I think the code here handles formatting them appropriately. I still have to experiment with it, but I don't know whether it will output PNG images. It would be useful to upscale the original Sentinel-2 .jp2 images directly, without reshaping or converting them, but I suspect this is an architectural feature of the model that is not easily modified (I would be grateful to hear your opinion on it, though).

I have issues with the shapes of the images. The expected shape is [32, 32, 3], but a Sentinel-2 10 m TCI image, even as a PNG, has shape [10980, 10980, 3]. In my experiments I converted the .jp2 images to PNG using gdal_translate -of PNG your_sentinel_scene.jp2 your_sentinel_scene.png, but I am wondering whether that is the right way to go. I will try with the code you provide, even though I don't see any PNG image being output by the code here. Further, where does the function reproject_image come from? I don't see it defined anywhere.

# Quoted from the README; img, meta, and reproject_image are
# never defined there.
with rasterio.open(tci_jp2_path) as src:
    img_rep, meta_rep = reproject_image(
        img, meta, rasterio.crs.CRS.from_epsg(3857), [...])

[EDIT: the function reproject_image is indeed not defined, so the provided image conversion cannot be performed as written. We seek guidance about the right image conversion to feed the model.]
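In the meantime, here is a minimal sketch of the conversion we are attempting (assumptions on our side: rasterio reads the TCI .jp2 as uint8, reprojection is skipped, and 32x32-pixel tiles are what the model expects):

import os
import numpy as np
import rasterio
from PIL import Image

# Read the Sentinel-2 TCI .jp2 (reprojection to EPSG:3857, as in the
# README snippet, is omitted here) and cut it into 32x32-pixel PNG
# tiles, dropping partial tiles at the right and bottom edges.
with rasterio.open("your_sentinel_scene_TCI.jp2") as src:
    img = np.transpose(src.read(), (1, 2, 0))  # (bands, H, W) -> (H, W, 3)

os.makedirs("ssr/low_res_imgs", exist_ok=True)
tile = 32
for row in range(0, img.shape[0] - tile + 1, tile):
    for col in range(0, img.shape[1] - tile + 1, tile):
        Image.fromarray(img[row:row + tile, col:col + tile]).save(
            f"ssr/low_res_imgs/{row}_{col}.png")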

Thanks for your support; we will be interested to hear your perspective :) Cheers from Italy!

piperwolters commented 1 month ago

Hi Luca, thank you for your interest in our work and the detailed post! Your company looks very interesting. And apologies for my late response, I was on vacation.

1) We train our models exclusively on L1C imagery. The reasoning is largely that the Satlas models were trained on L1C (we didn't want to remove atmospheric effects), and I used pretrained Satlas models as the backbone for some of the super-res models. Our current models are not trained to super-resolve bands other than TCI, but one could slightly alter the code to finetune a model to also predict specific bands. For that reason we have not tried super-resolving NDVI, but that would be fascinating, and again, this code is well set up to train a model for that purpose.

2) Yes, our models generate imagery at 2.5m/px. We tried 8x and 16x upsampling to get imagery similar to full-resolution NAIP imagery, but found even more hallucination. 4x upsampling seemed to be a good balance of quality and accuracy.

3) Because all of our models only output super-resolved TCI, you would need to adjust the code. To run inference on single bands like B04 or B08, you would first need to train your own model (or finetune one of ours for this new use case). Once you have that model, you would adjust these args in the config to return just the bands your model generates (e.g. B04, B08); a rough sketch follows. If you go down this path, I can provide more detail.
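For example, the network_g section of such a config might look something like this; the channel counts below are a sketch for a hypothetical two-band (B04 + B08) model, not something our released weights support:

# Sketch only: generator for a model finetuned on two single-band inputs.
network_g:
  type: SSR_RRDBNet
  num_in_ch: 2   # 1 Sentinel-2 image * 2 single-band inputs (B04, B08)
  num_out_ch: 2  # super-resolved B04 and B08
  num_feat: 64
  num_block: 23
  num_grow_ch: 32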

4) Yes, sorry about that pseudocode. The actual function documentation can be found here. I have made a note to update the README to be more clear.

Regarding the format of the PNG files - we did it this way to keep storage efficient, to do all the preprocessing (like reprojecting) just once, and to have the model train on smaller chunks (32x32 pixels instead of 10k x 10k). You could edit the dataset code here to accept .jp2 files instead of the expected .png files, along the lines of the sketch below. But if you want to use our model weights, you will have to chunk the data into 32x32-pixel tiles. Alternatively you could retrain a model on larger images, though we ran out of memory when trying to train on 512x512-pixel images.
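As a sketch of that edit (the function name and call site are assumptions, not our actual code), the idea is just to return the same (H, W, 3) uint8 array for either file type:

import numpy as np
import rasterio
from PIL import Image

# Hypothetical drop-in loader: read .jp2 tiles with rasterio and .png
# tiles with PIL, returning the same (H, W, 3) uint8 array either way.
def load_tile(path: str) -> np.ndarray:
    if path.endswith(".jp2"):
        with rasterio.open(path) as src:
            return np.transpose(src.read(), (1, 2, 0))
    return np.array(Image.open(path).convert("RGB"))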

Thank you again for your interest. Happy to answer any more questions.

lucalevi commented 1 month ago

Dear Piper,

Many thanks for your kind and well-documented answer! And yes, I figured you might not be in the office ;-)

I am currently discussing the next steps within our company. We are not an AI or machine-learning company, which means training a new model ourselves would require significant effort on our part (though an exciting one, nonetheless!). Regardless, I believe your model, even if it requires retraining, would be an excellent application of super-resolution technology for the Red and NIR bands, which are our current focus.

I will keep you updated if we decide to pursue the retraining path. In that case, should I continue to post messages here, or would you prefer to receive an email?

Thanks again for your response. I'll keep you posted on our decision (though I'm not the one making the final call ^_^).

Cheers from Italy, Luca I

piperwolters commented 1 month ago

Hi Luca,

That is fair! Taking on a new AI project at a non-AI company can be a lot of work, but I will be happy to help the process along if that is what you choose - I can write up more specific technical notes in that case. I am fine with you posting messages here; it might help someone else in the future! But of course feel free to email me directly (piperw@allenai.org) if you prefer.

Hope that it works out, this sounds like a fun project! Best, Piper

lucalevi commented 1 month ago

All right! I'll keep you posted about what is decided at the company. Many thanks in advance for your kindness and willingness to explain to us how your models work. I hope we can do some good work together ;) Cheers!

piperwolters commented 1 month ago

Thank you, Luca! Sounds good, I hope it works out. I will close this issue for now, but feel free to reopen it or open more in the future. :)