Transactions & Lazy Loading #20

Open jjbuchan opened 3 years ago

Problem

When Hibernate retrieves an object from the database, it only populates the collection fields that are eagerly loaded, i.e. those declared with the FetchType.EAGER option.

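As only one collection on the entity can be fetched using EAGER, every other collection is left uninitialized. Hibernate substitutes a lazy proxy rather than the data, so the values are unavailable once the session has closed, and accessing them at that point typically fails with a LazyInitializationException.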

Solution

To populate the lazily loaded fields, pass each one to Hibernate.initialize() while the session is still open.

For example, if you need to retrieve a monitor and access its metadata field values:

Monitor monitor = getMonitor(originalTenant, monitorId).orElseThrow(() ->
        new NotFoundException(String.format("No monitor found for %s on tenant %s",
            monitorId, originalTenant)));

Hibernate.initialize(monitor.getMonitorMetadataFields());
Hibernate.initialize(monitor.getPluginMetadataFields());

This method performs three SQL queries: one select (with a join on labelSelector) to get the initial monitor, one select to populate monitorMetadataFields, and one select to populate pluginMetadataFields.
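In Hibernate, a lazy collection can only be initialized while its session is open. The following is a toy model of that behaviour in plain Java, offered as a sketch rather than a description of Hibernate internals. ToySession, LazyList, and the sample values are all hypothetical names invented for illustration; none of them are Hibernate types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Toy model only: these classes are NOT Hibernate types. They mimic how a lazy
// collection stays unloaded until it is initialized inside an open session.
final class LazyLoadingSketch {

    // Hypothetical stand-in for a persistence session / transaction boundary.
    static final class ToySession {
        private boolean open = true;
        int queriesRun = 0; // counts simulated selects

        boolean isOpen() { return open; }
        void close() { open = false; }
    }

    // Hypothetical stand-in for a lazily loaded collection field.
    static final class LazyList<T> {
        private final ToySession session;
        private final Supplier<List<T>> loader;
        private List<T> loaded; // stays null until initialize() succeeds

        LazyList(ToySession session, Supplier<List<T>> loader) {
            this.session = session;
            this.loader = loader;
        }

        // Plays the role that Hibernate.initialize(collection) plays in the issue.
        void initialize() {
            if (loaded != null) {
                return; // already loaded, no further query needed
            }
            if (!session.isOpen()) {
                throw new IllegalStateException("session closed before initialization");
            }
            session.queriesRun++;
            loaded = new ArrayList<>(loader.get());
        }

        List<T> get() {
            initialize();
            return loaded;
        }
    }

    public static void main(String[] args) {
        ToySession session = new ToySession();
        LazyList<String> monitorMetadata = new LazyList<>(session, () -> List.of("region"));
        LazyList<String> pluginMetadata = new LazyList<>(session, () -> List.of("timeout"));

        monitorMetadata.initialize(); // loaded while the session is open
        session.close();

        System.out.println(monitorMetadata.get()); // still readable after close
        try {
            pluginMetadata.get(); // never initialized and the session is now closed
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

The collection that was initialized before the session closed remains usable afterwards, while the one that was not fails on first access. This is the reason the initialize calls have to run inside a transaction.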

Transactions

For every path into the above code to work, the @Transactional annotation must be used on the top-level method, so that the Hibernate session stays open while the lazy collections are initialized. The above example relates to cloning monitors, which can be triggered in three ways: via an API call to clone a monitor from one customer tenant to another, via a Kafka event to clone a policy monitor to a customer tenant, or via a test case.

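This means we have to annotate the Kafka consumer and any tests that lead to the clone method being triggered. API controllers need no additional annotation, because a session is already available for the duration of a web request. In Spring Boot this is typically the result of spring.jpa.open-in-view being enabled by default, rather than of the controller itself being transactional. The annotation is also not needed on the method that contains the initialize logic.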

No annotation on main clone method.

  public Monitor cloneMonitor(String originalTenant, String newTenant, UUID monitorId, UUID policyId) {

No annotation on the API controller.

  @PostMapping("/admin/clone-monitor")
  @ResponseStatus(HttpStatus.CREATED)
  public DetailedMonitorOutput cloneMonitor(@RequestBody final CloneMonitorRequest input)

Annotation on the Kafka consumer (the entry point that leads to clone being called)

  @KafkaHandler
  @Transactional
  public void consumeMonitorPolicyEvents(MonitorPolicyEvent policyEvent) {

Annotation on any tests calling the clone method

  @Test
  @Transactional
  public void testCloneMonitor_usingMetadata() throws JsonProcessingException {

Why only one EAGER collection?

Using FetchType.EAGER leads to a join between the main entity's table and the collection's table. Each additional join multiplies the number of rows in the result set by the size of the joined collection.

In the above code example, if we were instead to eagerly load all fields for a monitor that has 10 labels, 10 monitorMetadataFields, and 10 pluginMetadataFields, we would end up with a data set containing 1,000 rows (10 labels × 10 monitor metadata × 10 plugin metadata).

If we were performing a getAllMonitors request for an account that has 50 monitors like that one, we would end up with 50,000 rows to represent those 50 monitors.

This is highly inefficient, because most of each row repeats data that has already been returned in other rows.
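The row counts above can be reproduced with a few lines of arithmetic. The sketch below is a hypothetical helper, not part of the monitor service. It assumes outer joins, so an empty collection still yields one row, and it compares the fully joined result with the three-query approach used in the solution.

```java
// Hypothetical helper for illustration only: estimates result-set sizes when
// collections are eagerly joined versus loaded with separate selects.
final class EagerJoinRows {

    // Rows returned for ONE parent entity when every collection is joined in a
    // single statement: the product of the collection sizes. An empty
    // collection still produces one row through an outer join (an assumption
    // of this sketch), so each size is floored at 1.
    static long rowsPerEntity(int... collectionSizes) {
        long rows = 1;
        for (int size : collectionSizes) {
            rows *= Math.max(size, 1);
        }
        return rows;
    }

    // Rows returned for ONE parent when only the first collection is joined
    // and each remaining collection is fetched by its own select.
    static long rowsWithSeparateSelects(int eagerSize, int... lazySizes) {
        long rows = Math.max(eagerSize, 1);
        for (int size : lazySizes) {
            rows += size;
        }
        return rows;
    }

    public static void main(String[] args) {
        long joined = rowsPerEntity(10, 10, 10);
        System.out.println("joined, per monitor: " + joined);              // 1000
        System.out.println("joined, 50 monitors: " + 50 * joined);         // 50000
        System.out.println("separate selects, per monitor: "
                + rowsWithSeparateSelects(10, 10, 10));                    // 30
    }
}
```

Under these assumptions the joined approach grows with the product of the collection sizes, while the separate selects grow with their sum.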

Multiple EAGER collections

Using Set instead of List lets you bypass the Hibernate errors related to lazy loading (described in https://github.com/jjbuchan/docs/issues/16), but it does not remove the inefficiencies described above and is not recommended.