getappmap / appmap-js

Client libraries for AppMap

Large sequence diagram is omitted from the Navie context #1935

Open apotterri opened 1 month ago

apotterri commented 1 month ago

I have this AppMap data file in my project: org_finos_waltz_integration_test_inmem_service_PhysicalFlowServiceTest_create_PhysicalFlowUsesDefaultExternalIdIfNotSpecified.appmap.json

When I ask Navie a question, it sees the file, but doesn't add it to the context.

Logging in AppMap: Services Output view when DEBUG=true:

56633 [Stdout] Remaining characters before context: 15741
56633 [Stdout] Skipping context item sequence-diagram /Users/ajp/src/finos/waltz/tmp/appmap/junit/org_finos_waltz_integration_test_inmem_service_PhysicalFlowServiceTest_create_PhysicalFlowUsesDefaultExternalIdIfNotSpecified.appmap.json due to size. 7521 > 3148.2
56633 [Stdout] Context item sequence-diagram /Users/ajp/src/finos/waltz/tmp/appmap/junit/org_finos_waltz_integration_test_inmem_service_PhysicalFlowServiceTest_create_PhysicalFlowUsesDefaultExternalIdIfNotSpecified.appmap.json status: item_too_big
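
The numbers in this log are consistent with a per-item cap of one fifth of the remaining character budget (15741 / 5 = 3148.2). That ratio is inferred from the log output alone and has not been checked against the Navie source. The sketch below only restates the arithmetic; `checkContextItem` and `ASSUMED_ITEM_DIVISOR` are hypothetical names.

```typescript
// Hypothetical reconstruction of the size check implied by the log above.
// ASSUMPTION: the divisor of 5 is inferred from 15741 / 5 = 3148.2 in the log;
// it is not taken from the Navie source.
const ASSUMED_ITEM_DIVISOR = 5;

type ItemStatus = 'included' | 'item_too_big';

function checkContextItem(
  itemLength: number,
  charsRemaining: number
): { status: ItemStatus; limit: number } {
  const limit = charsRemaining / ASSUMED_ITEM_DIVISOR;
  return { status: itemLength > limit ? 'item_too_big' : 'included', limit };
}

// Values copied from the log: a 7521-character diagram, 15741 characters remaining.
const logCheck = checkContextItem(7521, 15741);
console.log(logCheck.status, logCheck.limit.toFixed(1)); // item_too_big 3148.2
```

If that inference holds, the diagram is rejected even though it would fit within the total remaining budget (7521 < 15741).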

kgilpin commented 1 month ago

So we should be using a bigger token limit at least in some cases.

github-actions[bot] commented 1 month ago

Title

Adjust sequence diagram context inclusion logic to enforce priority for inclusion

Problem

Sequence diagrams are being omitted from the Navie context when they exceed the per-item size limit (7521 > 3148.2 in the reported case). Without the diagram, Navie has less information for debugging and for explaining the application's behavior during end-to-end tests.

Analysis

AppMap sequence diagrams play a crucial role in visualizing the flow of data and interactions within an application. Current Navie configuration logic excludes these diagrams based on a conservative token limit, even when they are highly relevant.

To address this issue, the context selection logic needs to prioritize sequence diagrams. The new logic must guarantee that the context always contains at least one sequence diagram alongside at least five code snippets.

This can be achieved by overriding the token limit for sequence diagrams. When the available token budget is constrained, the system should still include at least one sequence diagram and the five code snippets, even if that means slightly exceeding the token limit.
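
A minimal sketch of the selection rule described above, assuming items arrive in relevance order and that the budget is counted in characters (as in the log) rather than tokens. All types and names here (`ContextItem`, `MINIMUMS`, `selectContext`) are hypothetical and do not come from the appmap-js source.

```typescript
// Hypothetical types and names; none of these come from the appmap-js source.
type ContextItemType = 'sequence-diagram' | 'code-snippet';

interface ContextItem {
  type: ContextItemType;
  location: string;
  content: string;
}

// ASSUMPTION: minimums taken from the proposal text (one diagram, five snippets).
const MINIMUMS: Record<ContextItemType, number> = {
  'sequence-diagram': 1,
  'code-snippet': 5,
};

// Items are assumed to be sorted by relevance, most relevant first.
function selectContext(items: ContextItem[], charLimit: number): ContextItem[] {
  const selected: ContextItem[] = [];
  const counts: Record<ContextItemType, number> = {
    'sequence-diagram': 0,
    'code-snippet': 0,
  };
  let charsUsed = 0;

  for (const item of items) {
    const belowMinimum = counts[item.type] < MINIMUMS[item.type];
    const fits = charsUsed + item.content.length <= charLimit;
    // Admit an item if it fits, or if its type has not yet met its guaranteed
    // minimum. The second case is what allows the budget to be exceeded.
    if (!fits && !belowMinimum) continue;

    selected.push(item);
    counts[item.type] += 1;
    charsUsed += item.content.length;
  }
  return selected;
}

// Example: one oversized diagram plus six small snippets against a 3000-character budget.
const exampleItems: ContextItem[] = [
  { type: 'sequence-diagram', location: 'first.appmap.json', content: 'x'.repeat(7521) },
  { type: 'sequence-diagram', location: 'second.appmap.json', content: 'x'.repeat(7521) },
  ...Array.from({ length: 6 }, (_, i): ContextItem => ({
    type: 'code-snippet',
    location: `snippet-${i}.ts`,
    content: 'y'.repeat(100),
  })),
];
const exampleSelection = selectContext(exampleItems, 3000);
console.log(exampleSelection.map((item) => item.location));
```

In this example the first diagram and the first five snippets are admitted despite the budget being exhausted, while the second diagram and the sixth snippet are skipped. A real implementation would also need to decide how far over the limit it is willing to go.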

Proposed Changes

  1. Configuration Update:

    • Update the logic responsible for managing the AppMap context to ensure that sequence diagrams are prioritized.
    • Create a mechanism to allow at least one sequence diagram while balancing the inclusion of five code snippets within the token limit.
  2. File Change:

    • Modify the configuration logic in the relevant analyzeAppMaps.ts file to enforce the new inclusion criteria.
  3. Packages and Files Involved:

    • Modify settings in the packages/cli package. The listing below defines analyzeAppMaps:

    import FindCodeObjects from '../search/findCodeObjects';
    import { CodeObject } from '../search/types';
    import { ProcessFileOptions, processFiles } from '../utils';
    import Specification from '@appland/sequence-diagram/dist/specification';
    import Priority from '@appland/sequence-diagram/dist/priority';
    
    type CodeObjectId = string;
    
    export default async function analyzeAppMaps(
      appmapDir: string,
      codeObjectPatterns: string[]
    ): Promise<{ appmaps: string[]; specification: Specification }> {
      const requiredPatterns = new Set<string>();
      const uniquePatterns = new Set<string>();
    
      const priority = new Priority();
      const requiredCodeObjectIds = new Set<CodeObjectId>();
      const includedCodeObjectIds = new Set<CodeObjectId>();
    
      const interpretCodeObjectPattern = (pattern: string) => {
        if (pattern.startsWith('+')) {
          pattern = pattern.slice(1);
          requiredPatterns.add(pattern);
        }
    
        uniquePatterns.add(pattern);
        priority.enrollPattern(pattern);
      };
    
      codeObjectPatterns.forEach(interpretCodeObjectPattern);
      const appmaps = new Set<string>();
    
      await Promise.all(
        [...uniquePatterns]
          .filter((pattern) => !requiredPatterns.has(pattern))
          .map(async (codeObjectPattern) => {
            const matches = await new FindCodeObjects(appmapDir, codeObjectPattern).find();
            const codeObjectIds = new Set<CodeObjectId>();
            matches.forEach((match) => {
              collectCodeObject(includedCodeObjectIds, match.codeObject);
              collectCodeObject(codeObjectIds, match.codeObject);
            });
            priority.expandPattern(codeObjectPattern, [...codeObjectIds]);
          })
      );
    
      if (requiredPatterns.size > 0) {
        const appmapsForRequiredPattern = await Promise.all(
          [...requiredPatterns].map(async (codeObjectPattern) => {
            const matches = await new FindCodeObjects(appmapDir, codeObjectPattern).find();
            const codeObjectIds = new Set<CodeObjectId>();
            const appmaps = new Set<string>();
            matches.forEach((match) => {
              appmaps.add(match.appmap);
              collectCodeObject(includedCodeObjectIds, match.codeObject);
              collectCodeObject(requiredCodeObjectIds, match.codeObject);
              collectCodeObject(codeObjectIds, match.codeObject);
            });
            priority.expandPattern(codeObjectPattern, Array.from(codeObjectIds));
            return appmaps;
          })
        );
    
        const firstSet = appmapsForRequiredPattern.pop();
        if (firstSet) {
          for (const appmap of firstSet) {
            if (appmapsForRequiredPattern.every((appmaps) => appmaps.has(appmap))) {
              appmaps.add(appmap);
            }
          }
        }
      } else {
        await processFiles(
          appmapDir,
          '.appmap.json',
          (file: string, cb: () => void) => {
            appmaps.add(file.slice(0, file.length - '.appmap.json'.length));
            cb();
          },
          new ProcessFileOptions()
        );
      }
    
      return {
        appmaps: [...appmaps].sort(),
        specification: new Specification(priority, includedCodeObjectIds, requiredCodeObjectIds),
      };
    }
    
    function collectCodeObject(codeObjectIds: Set<CodeObjectId>, codeObject: CodeObject): void {
      codeObjectIds.add(codeObject.fqid);
    }
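
In the listing above, patterns prefixed with `+` are treated as required, and only AppMaps matched by every required pattern are kept. That intersection step can be sketched in isolation as follows; `intersectAll` is a hypothetical helper name, not part of the package.

```typescript
// Standalone sketch of the "matched by every required pattern" step.
// `intersectAll` is a hypothetical helper, not part of @appland packages.
function intersectAll<T>(sets: Set<T>[]): Set<T> {
  const [first, ...rest] = sets;
  if (!first) return new Set<T>(); // No required patterns, nothing to intersect.
  // Keep only the values of the first set that every other set also contains.
  return new Set(Array.from(first).filter((value) => rest.every((other) => other.has(value))));
}

// Each set stands for the AppMaps matched by one required pattern.
const matchedByAll = intersectAll([
  new Set(['appmap/a', 'appmap/b']),
  new Set(['appmap/b', 'appmap/c']),
]);
console.log(Array.from(matchedByAll)); // only 'appmap/b' appears in both sets
```

The listing reaches the same result by popping one set and checking each of its members against the remaining sets.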

Summary:

This enhancement ensures that sequence diagrams, vital for understanding system behavior, are always included in the Navie context, while still allowing the inclusion of necessary code snippets by strategically managing token limits.