dlr-eoc / prosEO

prosEO – A Processing System for Earth Observation Data
GNU General Public License v3.0
14 stars 1 forks source link

Data model: Analyze performance issue in ProductQueryService::executeQuery() #165

Closed tangobravo62 closed 1 year ago

tangobravo62 commented 2 years ago

During the Sentinel-5P mission reprocessing, the releasing of job steps slowed down continually, and the memory requirements increased over time. At some point it was necessary to increase the brain RAM from 12 GiB to 24 GiB to ensure continuing production.

From the logs it was obvious that between 1 and 3 minutes per job step were spent in the ProductQueryService in the executeQuery() method. This happened during the selection of one of the auxiliary products, which are needed for every job step, and thus have a very large set of related ProductQueries. However, there are three such products for Sentinel-1B L1B processing, and it was always the first of them, which was referenced during the release process, but not always the same. The log file pattern looks like this:

2022-07-08 10:09:30.393 TRACE 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : >>> executeQuery(1298969, false)
2022-07-08 10:09:30.453 DEBUG 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : Executing SQL query: [...]
2022-07-08 10:09:30.455 TRACE 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : Number of products found: 1
2022-07-08 10:09:30.456 TRACE 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : Found product 1211466 with start time = 2014-10-01T00:00:01Z, stop time = 2050-12-31T23:59:59Z and revision = 
2022-07-08 10:09:30.456 TRACE 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : Number of products after selection: 1
2022-07-08 10:10:52.132 TRACE 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : Number of products satisfying product query: 1
2022-07-08 10:10:52.132 TRACE 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : <<< executeQuery()
2022-07-08 10:10:52.135 TRACE 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : >>> executeQuery(1298970, false)
2022-07-08 10:10:52.234 DEBUG 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : Executing SQL query: [...]
2022-07-08 10:10:52.295 TRACE 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : Number of products found: 1
2022-07-08 10:10:52.295 TRACE 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : Found product 1211464 with start time = 2014-10-01T00:00:01Z, stop time = 2050-12-31T23:59:59Z and revision = 
2022-07-08 10:10:52.296 TRACE 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : Number of products after selection: 1
2022-07-08 10:10:52.514 TRACE 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : Number of products satisfying product query: 1
2022-07-08 10:10:52.514 TRACE 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : <<< executeQuery()
2022-07-08 10:10:52.830 TRACE 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : >>> executeQuery(1298971, false)
2022-07-08 10:10:52.939 DEBUG 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : Executing SQL query: [...]
2022-07-08 10:10:53.019 TRACE 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : Number of products found: 1
2022-07-08 10:10:53.020 TRACE 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : Found product 1211468 with start time = 2014-10-01T00:00:01Z, stop time = 2050-12-31T23:59:59Z and revision = 
2022-07-08 10:10:53.020 TRACE 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : Number of products after selection: 1
2022-07-08 10:10:53.263 TRACE 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : Number of products satisfying product query: 1
2022-07-08 10:10:53.263 TRACE 1 --- [ReleaseOrder_1262391] d.d.p.model.service.ProductQueryService  : <<< executeQuery()

No matter in which order the three products were selected, it was always for the first one that the wait happened (and the memory usage of the Java VM increased), while for the other ones the wait was below 1 s.

The affected code in ProductQueryService::executeQuery() is the following:

        if (logger.isTraceEnabled()) logger.trace("Number of products after selection: " + selectedItems.size());

        // Set the query's list of satisfying products to the list of selected items (products), unless this was only a dry run
        if (!checkOnly) {
            productQuery.getSatisfyingProducts().clear();
            for (Object selectedItem: selectedItems) {
                if (selectedItem instanceof Product) {
                    Product product = (Product) selectedItem;
                    product.getSatisfiedProductQueries().add(productQuery);
                    productQuery.getSatisfyingProducts().add(product);
                }
            }
            productQuery.setIsSatisfied(true);
            if (logger.isTraceEnabled()) logger.trace("Number of products satisfying product query: " + productQuery.getSatisfyingProducts().size());
        }       
        if (logger.isTraceEnabled()) logger.trace("<<< executeQuery()");

Although a full profiling has not yet taken place, the supposed culprit is the statement product.getSatisfiedProductQueries().add(productQuery);.

katharinabassler commented 1 year ago

Cf. GitHub Issue https://github.com/dlr-eoc/prosEO/issues/162.

tangobravo62 commented 1 year ago

Alas improving/correcting the equals and hashCode methods did nothing to improve the performance substantially. In the end a performance improvement could be reached by not updating the product itself in the query, but just adding the product to the query's list of satisfying products (although this depends on the fact that the relationship from product to product query is not needed any more in the same transaction).

This was changed with commit b808580 and deployed with prosEO 0.9.1 in September 2022.