dlr-eoc / prosEO

prosEO – A Processing System for Earth Observation Data
GNU General Public License v3.0

Planner: Remove output products not generated by a job step #130

Closed: tangobravo62 closed this issue 2 years ago

tangobravo62 commented 3 years ago

It was observed that in some cases the selection of input products failed although usable input data was available. An example of this is generating a Sentinel-5P L2__AER_AI product from a previously generated L1B* product for the same orbit. The L2__AER_AI product class requires L1B_BD* input from the same orbit (ValCover policy) and some usable L1B_IR_UVN input (not necessarily from the same orbit, may be [much] older; selection policy LatestValidityClosest).

The root cause of the failure is that the L1B processor, quite correctly, did not generate L1B_IR_UVN and L1B_IR_SIR files for S5P orbit 8138. This is OK, since only one orbit per day actually contains irradiance measurements (for 2019-05-10 it is orbit 8143). We have also observed this behaviour for other processors, especially in the Sentinel-1 and Sentinel-3 missions, so prosEO must be prepared for it.

The problem now is that the expected output products, which the Production Planner generated, are still around after the job has been completed. This is not an issue w.r.t. the PRIP, where only products with product files on some processing facility will be shown. But it is a problem if the product classes of such products are referenced in other selection rules (at least for some selection policies including the LatestValidityClosest policy used to retrieve IR_UVN for AER_AI):

In a first step, the metadata database is searched for products "closest" to the validity period of the product to generate. This selects the empty IR_UVN "shell" product, but not the older products, which have product files (but are not "closest"). Then, to determine whether the input is available for processing, a check is made whether the selected product actually has a product file. Since this is not the case, the job step remains in state "WAITING_INPUT", expecting some other job to generate the missing file. This behaviour is by design, because there could actually be another process that will eventually generate the product file.
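The two-step behaviour described above can be illustrated with a minimal sketch. This is not prosEO code: the class `ClosestSelectionSketch`, the `Product` type, both method names and the `READY` state label are hypothetical, the orbit number stands in for the validity period, and "closest" is simplified to the smallest orbit distance.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class ClosestSelectionSketch {

    /** Hypothetical, simplified metadata entry; not the prosEO data model. */
    static final class Product {
        final String productClass;
        final int orbit;              // assumption: stands in for the validity period
        final boolean hasProductFile; // false for an empty "shell" product

        Product(String productClass, int orbit, boolean hasProductFile) {
            this.productClass = productClass;
            this.orbit = orbit;
            this.hasProductFile = hasProductFile;
        }
    }

    /** Step 1: pick the product of the given class closest to the target orbit.
     *  Product files are deliberately not considered here. */
    static Optional<Product> selectClosest(List<Product> database,
                                           String productClass, int targetOrbit) {
        return database.stream()
                .filter(p -> p.productClass.equals(productClass))
                .min(Comparator.comparingInt(
                        (Product p) -> Math.abs(p.orbit - targetOrbit)));
    }

    /** Step 2: check availability of the selected product only. */
    static String jobStepState(List<Product> database,
                               String productClass, int targetOrbit) {
        Optional<Product> selected = selectClosest(database, productClass, targetOrbit);
        boolean available = selected.isPresent() && selected.get().hasProductFile;
        // A shell product "wins" step 1 but fails step 2, so the step keeps waiting
        return available ? "READY" : "WAITING_INPUT";
    }
}
```

With an older L1B_IR_UVN product that has a file and a shell product for orbit 8138 in the database, `jobStepState(db, "L1B_IR_UVN", 8138)` returns `WAITING_INPUT` in this sketch; once the shell entry is removed, the older product is selected and the result is `READY`.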

But in the given case we know that the product will never arrive – because there are no satellite measurements to generate it. Therefore the solution approach shall be implemented in the Production Planner as follows:

If a job step is completed successfully, but did not generate files for all output products, delete the empty "shell" products.
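A minimal sketch of this cleanup rule follows, assuming a simplified in-memory model. The class `ShellProductCleanupSketch`, the `OutputProduct` type, the method `removeShellProducts` and the file names are hypothetical and do not reflect the actual Production Planner code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ShellProductCleanupSketch {

    /** Hypothetical output product with the files registered for it. */
    static final class OutputProduct {
        final String productClass;
        final List<String> productFiles;

        OutputProduct(String productClass, String... files) {
            this.productClass = productClass;
            this.productFiles = new ArrayList<>(Arrays.asList(files));
        }
    }

    /**
     * Removes output products without any product file from the given list,
     * but only if the job step completed successfully.
     * Returns the removed "shell" products.
     */
    static List<OutputProduct> removeShellProducts(boolean completedSuccessfully,
                                                   List<OutputProduct> outputs) {
        List<OutputProduct> removed = new ArrayList<>();
        if (!completedSuccessfully) {
            // Failed or unfinished steps keep their expected outputs
            return removed;
        }
        Iterator<OutputProduct> it = outputs.iterator();
        while (it.hasNext()) {
            OutputProduct product = it.next();
            if (product.productFiles.isEmpty()) {
                it.remove();          // no file was generated: drop the shell
                removed.add(product);
            }
        }
        return removed;
    }
}
```

Restricting the cleanup to successfully completed job steps matches the rule above: for a step that has not finished, a missing file may still be generated later.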

emelchinger commented 3 years ago

The products of a job step are now deleted after it finishes successfully if they don't have a product file.

tangobravo62 commented 2 years ago

Proven during various tests with Sentinel-5P L1B products.