land-of-apps / waltz

Enterprise Information Service
https://waltz.finos.org
Apache License 2.0
0 stars 0 forks source link

[BUG] When creating a flow, the front end sends empty string instead of 'null' when the user does not provide a value for external ID #5

Closed Daniel-Warner-X closed 3 months ago

Daniel-Warner-X commented 3 months ago

Description

Attempting to insert a physical flow without an external id fails with a database error:

09:26:22.188 [qtp526903990-33] ERROR org.finos.waltz.web.Main - Data integrity violation detected: jooq; SQL [insert into "physical_flow" ("specification_id", "basis_offset", "frequency", "transport", "description", "provenance", "last_updated_at", "last_updated_by", "logical_flow_id", "is_removed", "last_attested_at", "last_attested_by", "criticality", "external_id", "created_at", "created_by", "name") values (?, ?, ?, ?, ?, ?, cast(? as timestamp), ?, ?, ?, cast(? as timestamp), ?, ?, ?, cast(? as timestamp), ?, ?) returning "physical_flow"."id"]; ERROR: null value in column "external_id" of relation "physical_flow" violates not-null constraint
  Detail: Failing row contains (613, 125, 0, DAILY, EMAIL, , waltz, 2024-08-04 09:26:21.853151, admin, 419, null, f, null, null, LOW, null, ACTIVE, 2024-08-04 09:26:21.853151, admin, f, null).; nested exception is org.postgresql.util.PSQLException: ERROR: null value in column "external_id" of relation "physical_flow" violates not-null constraint

The expected default behavior for a Physical Flow created without an explicitly set external ID is that one will be automatically generated default value specified for the external_id column of the physical_flow table.

Logical flows should behave in the same way.

Steps to Reproduce

  1. Attempt to create a new physical flow.
  2. Do not set an external ID during the creation process.
  3. Observe the creation behavior.

Expected Result

The creation of a physical flow should succeed when an external id is not specified.

Actual Result

The physical flow creation fails when an external ID is not provided.

Daniel-Warner-X commented 3 months ago

@apotterri I asked Navie to flesh out this minimally described issue and the above issue is was the result. I think it might be the reverse of the actual issue, due to how the original issue was worded . I tried creating a physical flow without an external ID using a version of the codebase that supposedly has the problem. Can you tell from the merged fix weather this description is accurate? Merged fix: https://github.com/finos/waltz/pull/7109/files

Daniel-Warner-X commented 3 months ago

@apotterri my goal is to get the issue properly characterized here and then post Navie's plan.

Daniel-Warner-X commented 3 months ago

@apotterri I have to go ahead and modify this issue for screenshots, so I am going to update the description to something I think is more accurate.

apotterri commented 3 months ago

Based on the changes in 7109, it looks to me like the description above is backwards. After 7109 was merged, an external id would be required when creating a physical flow. (That's the way I'd interpret "Physical Flows Cannot Be Created Without Setting an External ID" as well.)

apotterri commented 3 months ago

Sorry it took me a bit to answer this. I was trying to make sure I really believed what I was going to say.... ;)

Daniel-Warner-X commented 3 months ago

Title

Enforce External ID Requirement During Physical Flow Creation

Problem

The current implementation for creating physical flows does not require an external ID to be explicitly set. The expected default behavior, which should not allow the creation of flows without a manually provided external ID, is not functioning correctly. As a result, users can create flows without an external ID, which leads to potential issues with data consistency and traceability.

Analysis

The problem arises from the fact that no validation exists to ensure that an external ID is provided when creating a physical flow. If this ID is not set, the flow creation process should either fail or generate a default value to maintain data integrity.

To solve this issue, we need to update the flow creation mechanism to enforce that an external ID is either provided by the user or generated automatically. This will involve changes to the service and DAO layers to add this validation step and generate an external ID if it is not provided.

Root Cause

  1. No validation exists to ensure that an external ID is provided during physical flow creation.
  2. The PhysicalFlowCreateCommand does not enforce external ID presence.
  3. The PhysicalFlowService does not check for the external ID before creating a flow.
  4. The ExternalIdentifierDao does not validate the presence of an external ID before inserting it into the database.

Proposed Solution

  1. Add validation in the PhysicalFlowCreateCommand to ensure an external ID is included.
  2. Update the PhysicalFlowService to check for an external ID and generate one if it is missing.
  3. Ensure the PhysicalFlowDao and ExternalIdentifierDao enforce the requirement of an external ID.
  4. Add appropriate handling in the UI to notify the user if an external ID is missing during flow creation.

Proposed Changes

  1. Modification to PhysicalFlowCreateCommand class:

    Ensure that the external ID is included as a part of the creation command and validate it.

    public abstract class PhysicalFlowCreateCommand implements Command {
    
       public abstract PhysicalSpecification specification();
    
       public abstract long logicalFlowId();
    
       public abstract FlowAttributes flowAttributes();
    
       public abstract String externalId();  // Ensure external ID is included
    
       @Value.Check
       protected void check() {
           if (externalId() == null || externalId().isEmpty()) {
               throw new IllegalArgumentException("External ID must be provided");
           }
       }
    
       @Value.Default
       public Set<Long> dataTypeIds() {
           return emptySet();
       }
    }
  2. Modification to PhysicalFlowService class:

    Update the createPhysicalFlow method to validate or generate the external ID if it is missing.

    public PhysicalFlowCreateResponse createPhysicalFlow(PhysicalFlowCreateCommand command) {
       if (command.externalId() == null || command.externalId().isEmpty()) {
           command = ImmutablePhysicalFlowCreateCommand.copyOf(command)
               .withExternalId(generateExternalId());
       }
       // Proceed with the rest of the creation logic
    }
    
    private String generateExternalId() {
       // Generate a unique external ID, e.g., using UUID
       return UUID.randomUUID().toString();
    }
  3. Modification to ExternalIdentifierDao class:

    Ensure the presence of an external ID before attempting to insert it into the database.

    public void create(ExternalIdentifier externalIdentifier) {
       if (externalIdentifier.externalId() == null || externalIdentifier.externalId().isEmpty()) {
           throw new IllegalArgumentException("External ID must be provided");
       }
       // Proceed with the insertion logic
    }
  4. Frontend changes to PhysicalFlowDetailTable (svelte component):

    Handle the absence of an external ID by showing a validation message to the user.

    <script>
       import { createPhysicalFlow } from 'PhysicalFlowService';
    
       let externalId = '';
    
       function handleSave() {
           if (!externalId) {
               alert('External ID is required');
               return;
           }
           createPhysicalFlow({ externalId, /* other necessary data */ })
               .then(response => {
                   // handle success
               })
               .catch(error => {
                   // handle error
               });
       }
    </script>
    
    <button on:click={handleSave}>Save Physical Flow</button>

These changes will enforce the requirement of an external ID during physical flow creation, enhancing data consistency and integrity.

Daniel-Warner-X commented 3 months ago

Based on the changes in 7109, it looks to me like the description above is backwards. After 7109 was merged, an external id would be required when creating a physical flow. (That's the way I'd interpret "Physical Flows Cannot Be Created Without Setting an External ID" as well.)

@apotterri Cool. Thanks for checking it out. Please take a look at the modifications I made to the issue, I reversed the meaning so it should be correct. Now. Also what do you think about the plan? Does it do the same job as the actual changes that were made?

apotterri commented 3 months ago

It seems like the sense of the description is right now. The one thing that stills seems a little off is this:

The system should handle the absence of an external ID gracefully, possibly by generating one automatically or by assigning a default value. It's not at all clear to me that it should attempt to generate a default id (the original changes definitely didn't do this).

The changes in the plan don't do the same job as the actual changes. To my reading (given that I'm not an expert in the waltz codebase), the actual changes ensure that an attempt to create either a logical or physical flow without an external id will fail with a database error.

Prior to the changes, creating a flow without an external id would set the external id to an empty string, which could be inserted into the database. This is because the constraint on the external id column is only NOT NULL, but there's no check that it's not an empty string.

After the changes, creating a flow without an external id will try to insert a row with a NULL external id into the database, which will fail (because of the constraint).

The actual changes don't have any kind of error handling in them, either for the database errors, or to show errors to the user. I'm guessing this is because that error handling already exists elsewhere.

My somewhat equivocal response here definitely reflects the need for the human in the loop when using Navie, and for that human to be expert in the code base. As always, the changes she suggests are plausible, but may or may not be desirable. (This is more or less the same problem I had with DB's changes for the python agent. They worked, but didn't fit with the design of the code they changed.)

Daniel-Warner-X commented 3 months ago

Thanks for the breakdown.

What I glean from your comment is this 1. The error handling piece as stated in the issue needs to be corrected. 2. The validity of the proposed solution is still TBD.

In regards to the first issue, error handling, Navie had this to say:

Sure! We can look for an example of database error handling in the code when a new physical flow is created. Here’s a concise example based on the service and DAO layers that handle physical flows.

From the context, the PhysicalFlowService and ExternalIdentifierService seem to be related to managing physical flows. Let's locate parts of these classes that involve creating a new physical flow and observe their error handling.

Example in PhysicalFlowService

First, let's see if there's any existing error handling in the service class for creating physical flows.

@Service
public class PhysicalFlowService {
    private final PhysicalFlowDao physicalFlowDao;

    @Autowired
    public PhysicalFlowService(PhysicalFlowDao physicalFlowDao) {
        this.physicalFlowDao = physicalFlowDao;
    }

    @Transactional
    public void createPhysicalFlow(PhysicalFlowCreateCommand command) {
        try {
            // Assuming there's a create method in PhysicalFlowDao
            physicalFlowDao.create(command);
        } catch (DataAccessException ex) {
            // Handle specific database errors here
            logger.error("Error creating physical flow", ex);
            throw new CustomServiceException("Error creating physical flow", ex);
        }
    }
}

Example in PhysicalFlowDao

Now let's look at the corresponding DAO method that might involve the actual database operation.

@Repository
public class PhysicalFlowDao {
    private final DSLContext dsl;

    @Autowired
    public PhysicalFlowDao(DSLContext dsl) {
        this.dsl = dsl;
    }

    public void create(PhysicalFlowCreateCommand command) {
        try {
            dsl.transaction(configuration -> {
                DSLContext tx = DSL.using(configuration);
                // SQL insert logic here
                tx.insertInto(PHYSICAL_FLOW)
                  .set(PHYSICAL_FLOW.NAME, command.getName())
                  // other fields
                  .execute();
            });
        } catch (DataAccessException ex) {
            // Catch and handle database specific exceptions
            logger.error("Failed to execute SQL insert for physical flow", ex);
            throw ex;  // Re-throw or wrap in a custom exception
        }
    }
}

In this setup:

Custom Exception Handling

To make this handling effective, you can use custom exceptions to ensure the errors are meaningful and standardized.

public class CustomServiceException extends RuntimeException {
    public CustomServiceException(String message, Throwable cause) {
        super(message, cause);
    }
}

Summary

This example demonstrates handling database errors when creating new physical flows by catching exceptions in the DAO layer and handling them in the service layer. Each layer logs meaningful messages, and the service layer optionally wraps the exception in a custom service-level exception. This helps manage different types of errors consistently and ensures the application can respond appropriately.


If the existing error handling is sufficient we can leave it as is, and I will update the issue. Can you validate that the error handling is correct?

In regards to the second issue, the validity of the response, we are the humans in the loop 😄 ! It sounds like we have achieved plausibility, but we need to understand if it truly fix the issue (ie gets a test to pass). In terms of desirability aka jibing enough with the existing design, we need to understand if it does or doesn't. If it doesn't, why?

Daniel-Warner-X commented 3 months ago

Nest steps: DB suggestion - delete extraneous AppMap data and re-query using AppMap data constrained to the physical flow creation case exclusively. EL suggestion - get Navie to write a test that evaluates the expected outcomes and use that to eval the quality of the solution.

Daniel-Warner-X commented 3 months ago

I deleted the old AppMap data and created new data specific to the creation of a physical flow. Then I prompted Navie with this: @explain this issue from the Waltz codebase...

Navie's response:

To address the issue of ensuring that an external ID is required when creating a physical flow, you need to modify the service layer where the physical flow is created. Specifically, you must add a validation step to check for the presence of an external ID, and if it is not provided, the process should fail.

Here's a concise outline of the changes you can make:

  1. Validate Command Input: Ensure that the PhysicalFlowCreateCommand has a non-null and non-empty externalId field.

  2. Update Service Method: Modify the createPhysicalFlow method in PhysicalFlowService to include the validation logic.

Step 1: Validate Command Input

In the PhysicalFlowUploadService class, you can add logic to validate that the externalId is provided.

import java.util.Optional;
import org.springframework.util.StringUtils;

// Existing imports
import org.finos.waltz.model.physical_flow.PhysicalFlowUploadCommandResponse;

// ... other imports and annotations

@Service
public class PhysicalFlowUploadService {

   // ... other members and methods

   public CommandOutcome createPhysicalFlow(PhysicalFlowCreateCommand command) {
       // Validate external ID
       if (!StringUtils.hasText(command.getExternalId())) {
           throw new IllegalArgumentException("External ID must be provided");
       }

       // Existing logic to create a physical flow
       // ...
   }
}

Step 2: Update Method in PhysicalFlowService

Ensure that the PhysicalFlowService correctly handles the creation logic and calls the PhysicalFlowUploadService.

@Service
public class PhysicalFlowService {

   private final PhysicalFlowUploadService physicalFlowUploadService;

   @Autowired
   public PhysicalFlowService(PhysicalFlowUploadService physicalFlowUploadService) {
       this.physicalFlowUploadService = physicalFlowUploadService;
   }

   public PhysicalFlow createPhysicalFlow(PhysicalFlowCreateCommand command) {
       // Delegate to the upload service for creation logic
       CommandOutcome outcome = physicalFlowUploadService.createPhysicalFlow(command);
       if (outcome == CommandOutcome.SUCCESS) {
           // Handle successful creation
       } else {
           // Handle failure, possibly by throwing an exception
       }
   }
}

Step 3: Testing the Changes

Make sure to implement and run tests to verify that the new validation logic is working as expected.

@Test
public void createPhysicalFlow_throwsException_ifExternalIdNotProvided() {
    PhysicalFlowCreateCommand command = new PhysicalFlowCreateCommand();
    command.setExternalId("");  // No external ID provided

    IllegalArgumentException exception = assertThrows(
        IllegalArgumentException.class,
        () -> physicalFlowService.createPhysicalFlow(command)
    );
    assertEquals("External ID must be provided", exception.getMessage());
}

This way, the physical flow creation process is now enforced to require an external ID, meeting the expected default behavior.

If you have any more questions or need further changes, feel free to ask!

Navie's response when using @plan instead of @explain:

Title

Enforce External ID Requirement in Physical Flow Creation

Problem

The current implementation for creating physical flows does not enforce the requirement for an external ID to be explicitly set. As a result, users can create physical flows without providing an external ID, which is against the expected behavior. The default behavior should prevent the creation of any physical flow without a manually provided external ID.

Analysis

The issue stems from the fact that there is no validation step to check for the presence of an external ID during the creation of a physical flow. Consequently, the system allows physical flows to be created without an external ID. To resolve this, a validation check needs to be introduced in the PhysicalFlowService to ensure that the external ID is provided and is not null or empty. If the external ID is missing, the flow creation process should be aborted with an appropriate error message.

Proposed Changes

  1. Update PhysicalFlowCreateCommand Validation:

    • Modify the PhysicalFlowCreateCommand to ensure that the external ID is included.
    • Add a validation step in the service layer to check for the presence of an external ID.
  2. Modify PhysicalFlowService:

    • Introduce a validation check in the createPhysicalFlow method to ensure that the external ID is not null or empty.
    • If the external ID is missing, throw an IllegalArgumentException with an appropriate error message.
  3. Update the Flow Creation Logic:

    • Ensure that any method or component invoking the creation of a physical flow includes the external ID.

Changes in PhysicalFlowCreateCommand

public class PhysicalFlowCreateCommand {
    // Existing fields

    @NotBlank(message = "External ID must be provided")
    private String externalId;

    // Getter and Setter for externalId
}

Changes in PhysicalFlowService

@Service
public class PhysicalFlowService {
    // Other fields and methods

    public CommandOutcome createPhysicalFlow(PhysicalFlowCreateCommand command) {
        if (!StringUtils.hasText(command.getExternalId())) {
            throw new IllegalArgumentException("External ID must be provided");
        }

        // Existing logic to create physical flow
    }
}

Changes in PhysicalFlowUploadService

@Service
public class PhysicalFlowUploadService {
    // Other fields and methods

    public CommandOutcome createPhysicalFlow(PhysicalFlowCreateCommand command) {
        // Delegate call to PhysicalFlowService with appropriate validation
        return physicalFlowService.createPhysicalFlow(command);
    }
}

Validation and Error Handling

To ensure the changes are correctly implemented, the following test case should be added to confirm that the validation logic is working as expected.

Test Case in PhysicalFlowServiceTest

@Test
public void createPhysicalFlow_throwsException_ifExternalIdNotProvided() {
    PhysicalFlowCreateCommand command = new PhysicalFlowCreateCommand();
    command.setExternalId("");  // No external ID provided

    IllegalArgumentException exception = assertThrows(
        IllegalArgumentException.class,
        () -> physicalFlowService.createPhysicalFlow(command)
    );
    assertEquals("External ID must be provided", exception.getMessage());
}

This test ensures that any attempts to create a physical flow without an external ID will result in an IllegalArgumentException.

By implementing these changes, you ensure that the application correctly enforces the requirement for an external ID during the physical flow creation process, thus aligning with the expected behavior.