CNES / MAJA

Level-2A processor used for atmospheric correction and cloud detection. The active repository is the one below; this one is kept only to preserve access to the older issues.
https://gitlab.orfeo-toolbox.org/maja/maja
Apache License 2.0

Optimal parameters to get the best results (advice needed) #91

Open toshaklg opened 3 years ago

toshaklg commented 3 years ago

Hello! I am trying to familiarize myself with MAJA and I want to ask some questions, if you don't mind. I might confuse some terms, but I hope I explain it clearly.

  1. As far as I understand, MAJA is supposed to process time-series data, so it doesn't really make sense to run it on just one image, whose result will end up being of "inferior quality". Therefore, I wonder what the most appropriate size of a time series (number of images) is to get the most accurate results?

  2. Judging by the documentation, the "nominal" run will process the first image in "init" mode, which means that the L2 product for the very first entry won't be of the best quality. If I process the same time series in "backward" mode, it will increase the quality of the first result, but does it affect the other L2 images in any way?

  3. How can I make startmaja run in backward mode? It looks like --nbackward 1 is the right option, is that correct?

  4. I noticed that in my previous ticket I had the following workplans:

    2021-04-20 16:04:00,429 [INFO ] 5 workplan(s) successfully created:
    DATE | TILE | MODE | L1-PRODUCT | ADDITIONAL INFO
    2017-01-03 10:44:32 | 31TFJ | INIT | S2A_MSIL1C_20170103T104432_N0204_R008_T31TFJ_20170103T104428.SAFE | Init mode - No previous L2
    2017-01-10 10:34:11 | 31TFJ | NOMINAL | S2A_MSIL1C_20170110T103411_N0204_R108_T31TFJ_20170110T103407.SAFE | L2 from previous
    2017-01-13 10:44:01 | 31TFJ | NOMINAL | S2A_MSIL1C_20170113T104401_N0204_R008_T31TFJ_20170113T104402.SAFE | L2 from previous
    2017-01-30 10:32:51 | 31TFJ | NOMINAL | S2A_MSIL1C_20170130T103251_N0204_R108_T31TFJ_20170130T103249.SAFE | L2 from previous
    2017-03-14 10:40:11 | 31TFJ | INIT | S2A_MSIL1C_20170314T104011_N0204_R008_T31TFJ_20170314T104411.SAFE | Init mode - No previous L2
    Press Enter to continue...

What confuses me is that the last image is going to be processed in "INIT" mode. Why is that? Am I missing something? I expected a "nominal" run to process only the first image in "INIT" and all the others in "NOMINAL", no?

I am looking forward to your replies, thanks!

jerome-colin commented 3 years ago

Dear Anton,

One of the main advantages of Maja compared to other atmospheric processors is indeed the multitemporal approach. The idea is to build up a composite reflectance image from successive cloud-free observations of the surface in chronological order. Of course, if the processing started in strict chronological order, the first product wouldn't benefit from this approach. That is what backward mode is for: start the processor in reverse chronological order to build the composite, which is then used to process the whole series in normal chronological order.

The next question is of course how many products you should ingest in the backward pass. That mainly depends on the cloud coverage of your area of interest (the more clouds, the more products you need to build the composite). Usually, a backward on 8 products is a good assumption.

Now just to sort out the vocabulary: INIT means no composite, so you only use multispectral criteria (and get poorer results). BACKWARD means you create the composite from n products and then start the processor from the beginning of the series with the composite (the latter being the NOMINAL mode). Of course, you only use backward mode if you don't have any previous L2A products. If you had to append another month to your time series, you could just start Maja in nominal mode and it would automatically benefit from the previously processed products.

You'll find additional information on our blog, in particular here: https://labo.obs-mip.fr/multitemp/maccs-how-it-works/

Hope this helps, Jerome

jerome-colin commented 3 years ago

In addition to my previous answer: the --nbackward option takes an integer value giving the number of products to use in backward mode (it defaults to 8 if unspecified). As an example, if you have 8 L1C products in your input directory, the default workplan will be to make a backward of 8 products, then process them all in chronological (nominal) mode.

Jerome

toshaklg commented 3 years ago

Thanks a lot for the replies!

toshaklg commented 3 years ago

Nevertheless, it is still not clear to me what the default sequence is. You refer to the "default workplan"; what is that?

"the default workplan will be to make a backward of 8 products, then process them all in chronological (nominal) mode."

Maja refuses to work if I keep the output from the previous run, so it is either nominal or backward mode; I can't combine them, which is a bit weird, in my opinion.

The thing is that if I run Maja on test data (5 images) in nominal mode (no flags, nothing, just start Maja) and then separately run Maja with --nbackward 4, the results from backward processing are better for all images except the last one, while the time difference is only 5 minutes or so. It feels like I am doing something wrong, but I can't figure out what exactly.

UPD: What I just realized: since backward mode is supposed to improve the quality of the first image, it makes sense to run the processing in backward mode first. The modes will then be assigned this way:

  1 image - backward
  2 image - nominal
  3 image - nominal
  ...
  last image - init

In this case we get better-quality results for the first product, which will improve the quality of the following one (2).

Later on, if I want to append another month, I just run it in nominal mode since I already have previous data.

Am I right?

jerome-colin commented 3 years ago

The 'workplan' is what startmaja prepares for you. As an example:

2021-04-26 14:47:11,400 [INFO ] 19 workplan(s) successfully created:
               DATE |       TILE |     MODE |                                                             L1-PRODUCT | ADDITIONAL INFO
2020-05-01 10:20:31 |      32TPR | BACKWARD |      S2A_MSIL1C_20200501T102031_N0209_R065_T32TPR_20200501T123320.SAFE | Backward of 8 products
2020-05-03 10:05:49 |      32TPR |  NOMINAL |      S2B_MSIL1C_20200503T100549_N0209_R022_T32TPR_20200503T130430.SAFE | L2 from previous
2020-05-06 10:15:59 |      32TPR |  NOMINAL |      S2B_MSIL1C_20200506T101559_N0209_R065_T32TPR_20200506T140537.SAFE | L2 from previous
2020-05-08 10:10:31 |      32TPR |  NOMINAL |      S2A_MSIL1C_20200508T101031_N0209_R022_T32TPR_20200508T122315.SAFE | L2 from previous
2020-05-11 10:20:31 |      32TPR |  NOMINAL |      S2A_MSIL1C_20200511T102031_N0209_R065_T32TPR_20200511T122140.SAFE | L2 from previous
2020-05-13 10:05:59 |      32TPR |  NOMINAL |      S2B_MSIL1C_20200513T100559_N0209_R022_T32TPR_20200513T135455.SAFE | L2 from previous
2020-05-13 10:05:59 |      32TPR |  NOMINAL |      S2B_MSIL1C_20200513T100559_N0209_R022_T32TPR_20200513T124815.SAFE | L2 from previous
2020-05-16 10:15:59 |      32TPR |  NOMINAL |      S2B_MSIL1C_20200516T101559_N0209_R065_T32TPR_20200516T123335.SAFE | L2 from previous
2020-05-18 10:10:31 |      32TPR |  NOMINAL |      S2A_MSIL1C_20200518T101031_N0209_R022_T32TPR_20200518T121146.SAFE | L2 from previous
2020-05-21 10:20:31 |      32TPR |  NOMINAL |      S2A_MSIL1C_20200521T102031_N0209_R065_T32TPR_20200521T122533.SAFE | L2 from previous
2020-05-23 10:05:59 |      32TPR |  NOMINAL |      S2B_MSIL1C_20200523T100559_N0209_R022_T32TPR_20200523T122236.SAFE | L2 from previous
2020-05-26 10:15:59 |      32TPR |  NOMINAL |      S2B_MSIL1C_20200526T101559_N0209_R065_T32TPR_20200526T131155.SAFE | L2 from previous
2020-05-28 10:10:31 |      32TPR |  NOMINAL |      S2A_MSIL1C_20200528T101031_N0209_R022_T32TPR_20200528T111757.SAFE | L2 from previous
2020-05-31 10:20:31 |      32TPR |  NOMINAL |      S2A_MSIL1C_20200531T102031_N0209_R065_T32TPR_20200531T123341.SAFE | L2 from previous
2020-06-02 10:05:59 |      32TPR |  NOMINAL |      S2B_MSIL1C_20200602T100559_N0209_R022_T32TPR_20200602T130134.SAFE | L2 from previous
2020-06-05 10:15:59 |      32TPR |  NOMINAL |      S2B_MSIL1C_20200605T101559_N0209_R065_T32TPR_20200605T131223.SAFE | L2 from previous
2020-06-07 10:10:31 |      32TPR |  NOMINAL |      S2A_MSIL1C_20200607T101031_N0209_R022_T32TPR_20200607T122608.SAFE | L2 from previous
2020-06-10 10:20:31 |      32TPR |  NOMINAL |      S2A_MSIL1C_20200610T102031_N0209_R065_T32TPR_20200610T124418.SAFE | L2 from previous
2020-06-12 10:05:59 |      32TPR |  NOMINAL |      S2B_MSIL1C_20200612T100559_N0209_R022_T32TPR_20200612T124515.SAFE | L2 from previous

Here I don't explicitly give --nbackward. Startmaja found enough L1C products in the input directory to automatically decide to start with a backward on the first 8 products, then process them all in nominal mode up to the last one. There's no need for init mode for the last product, as you suggested. The last product benefits from the composite of all the previous ones.
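To make the rule above concrete, here is a minimal Python sketch of how modes could be assigned to a chronological series. It is not the actual Start_maja code: the function name `assign_modes` is hypothetical, and the fallback to INIT when fewer than `nbackward` products are available is an assumption inferred from the workplans shown in this thread.

```python
from typing import List


def assign_modes(n_products: int, nbackward: int = 8) -> List[str]:
    """Return one mode label per product, in chronological order."""
    if n_products <= 0:
        return []
    # Assumption: a backward pass needs at least `nbackward` products;
    # otherwise the first product falls back to INIT (no composite).
    first = "BACKWARD" if n_products >= nbackward else "INIT"
    # Every later product reuses the L2A result of the previous one.
    return [first] + ["NOMINAL"] * (n_products - 1)


print(assign_modes(19)[:3])          # ['BACKWARD', 'NOMINAL', 'NOMINAL']
print(assign_modes(5))               # first product in INIT
print(assign_modes(5, nbackward=4))  # first product in BACKWARD
```

With 19 products this reproduces the shape of the workplan above: one BACKWARD entry followed by NOMINAL entries up to the last product.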

There's obviously something unclear about the way you use Maja. I can't understand why it would refuse to work when you already have files in the output directory. Could you show the latest startmaja command you are using?

olivierhagolle commented 3 years ago

My guess is that since you have already processed a part of the products, start-maja decides not to reprocess them. That way you can use the same command when a new product arrives to process it progressively in real time.

toshaklg commented 3 years ago

Well, for example, the output folder now contains 5 products that were processed with the backward flag, and if I run

./startmaja -f folders.txt -t 31TFJ

then this is what I get

2021-04-27 10:29:28,663 [INFO ] =============This is Start_Maja v4.2.0==============
2021-04-27 10:29:28,669 [INFO ] Detecting input products...
2021-04-27 10:29:28,674 [INFO ] 5 L1C product(s) detected for tile 31TFJ in /home/anton/work/MAJA_INPUT/31TFJ
2021-04-27 10:29:28,675 [INFO ] 5 L2A product(s) detected for tile 31TFJ in /home/anton/work/MAJA_OUTPUT/31TFJ
2021-04-27 10:29:28,675 [INFO ] Skipping CAMS file detection.
2021-04-27 10:29:28,675 [INFO ] Checking GIPP files
2021-04-27 10:29:28,675 [INFO ] Setting up GIPP folder: /home/anton/work/MAJA_JOBS/gipp
2021-04-27 10:29:28,683 [INFO ] Searching for DTM
2021-04-27 10:29:28,684 [INFO ] Found DTM: /home/anton/work/MAJA_JOBS/dtm/S2__TEST_AUX_REFDE2_31TFJ_3001.HDR
2021-04-27 10:29:28,691 [INFO ] GIPP Creation succeeded for SENTINEL2_TM
Traceback (most recent call last):
  File "/home/anton/work/MAJA_R/lib/python/StartMaja/Start_maja.py", line 555, in <module>
    s.run()
  File "/home/anton/work/MAJA_R/lib/python/StartMaja/Start_maja.py", line 491, in run
    workplans = self.create_workplans(self.max_product_difference)
  File "/home/anton/work/MAJA_R/lib/python/StartMaja/Start_maja.py", line 464, in create_workplans
    raise ValueError("No workplans were created!")
ValueError: No workplans were created!

jerome-colin commented 3 years ago

OK: 5 L1C, 5 L2A, nothing more to do. @olivierhagolle guessed it right. Now if you add another L1C to your input directory (the next in chronological order), the workplan should be to process it in nominal mode from the previous L2A.
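The behaviour described here can be pictured with a small Python sketch. It only illustrates the idea and is not how Start_maja is implemented: `pending_dates` is a hypothetical helper, matching L1C and L2A products by acquisition date is an assumption, and the sixth date is made up for the example.

```python
from datetime import date
from typing import Iterable, List


def pending_dates(l1c_dates: Iterable[date], l2a_dates: Iterable[date]) -> List[date]:
    """Return the L1C dates that have no L2A product yet, oldest first."""
    done = set(l2a_dates)
    return sorted(d for d in l1c_dates if d not in done)


# The five acquisition dates from the first workplan in this thread.
l1c = [date(2017, 1, 3), date(2017, 1, 10), date(2017, 1, 13),
       date(2017, 1, 30), date(2017, 3, 14)]
l2a = list(l1c)  # all five already processed

print(pending_dates(l1c, l2a))  # [] : nothing left to plan

# Hypothetical new acquisition added to the input directory.
l1c.append(date(2017, 3, 24))
print(pending_dates(l1c, l2a))  # [datetime.date(2017, 3, 24)]
```

An empty result corresponds to the situation in the log above, where no workplan can be created because every L1C product already has its L2A counterpart.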

olivierhagolle commented 3 years ago

Moreover, we did not answer why, in your initial post, the last product was processed in INIT mode.

2017-01-30 10:32:51 | 31TFJ | NOMINAL | S2A_MSIL1C_20170130T103251_N0204_R108_T31TFJ_20170130T103249.SAFE | L2 from previous
2017-03-14 10:40:11 | 31TFJ | INIT | S2A_MSIL1C_20170314T104011_N0204_R008_T31TFJ_20170314T104411.SAFE | Init mode - No previous L2

The reason is that, to work in nominal mode, the time lag between two successive products must be lower than a parameter, whose default value is 45 days. MAJA is meant to process all the dates in a time series, and obviously not all of them were provided here.
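As a rough illustration of this time-lag rule, the following Python sketch labels each product NOMINAL only when the gap to the previous one is below the threshold. The names `modes_with_gap_rule` and `max_gap_days` are hypothetical, the dates are invented, and the real decision logic in Start_maja may differ in its details.

```python
from datetime import date
from typing import List, Sequence


def modes_with_gap_rule(dates: Sequence[date], max_gap_days: int = 45) -> List[str]:
    """Label each date INIT or NOMINAL based on the gap to the previous date."""
    modes: List[str] = []
    previous = None
    for current in sorted(dates):
        # NOMINAL requires a previous product that is recent enough.
        if previous is not None and (current - previous).days < max_gap_days:
            modes.append("NOMINAL")
        else:
            modes.append("INIT")
        previous = current
    return modes


# Invented series: a 10-day gap, then a 63-day gap.
series = [date(2021, 1, 1), date(2021, 1, 11), date(2021, 3, 15)]
print(modes_with_gap_rule(series))  # ['INIT', 'NOMINAL', 'INIT']
```

Under this sketch, a long hole in the series restarts the processing in INIT, which is why providing all available dates matters.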

toshaklg commented 3 years ago

Yeah, now I understand, thanks! Just to show what I got, here are the differences between the nominal and backward runs on 5 images: [image: Comparison]

I didn't include a true color image, but it can be seen that the backward result is more detailed (and more accurate).