add snapshots - Githubissues

lucgagan commented 1 year ago

Add a way to save instructions into a snapshot after the first run.

This would mean that the OpenAI API needs to be called only when:

- running the test for the first time
- re-generating the test when the website has changed

I've used Vitest snapshots (https://vitest.dev/guide/snapshot) before, and they were a pleasure to work with. We could replicate equivalent behavior.

markwhitfeld commented 1 year ago

This would be amazing!!! This is a super interesting project, but it would become very costly to hit OpenAI every time, so it would be critical to be able to snapshot the instructions that are auto-generated for each step.

I'm thinking that this could be done via convention: if the test file where the outer test is defined is called checkout.spec.ts, then each auto-generated step could be snapshotted into a checkout.auto-spec.ts (or checkout.spec.auto.ts) file containing a map of these code blocks. If checkout.spec.ts contains:

import { expect, test } from "@playwright/test";
import { auto } from "../src/auto";

test.describe("(grocery purchase)", () => {
  test("shopping cart should have total items", async ({ page }) => {
    await page.goto("/");

    await auto("open the groceries section", { page, test });

    await auto("add bacon to shopping cart", { page, test });
    await auto("add cheese to shopping cart", { page, test });
    await auto("add bacon to shopping cart", { page, test });
    await auto("add bacon to shopping cart", { page, test });

    const itemCount = await auto("get the shopping cart item count", { page, test });

    expect(itemCount).toBe("4");
  });

  test("some other test", async ({ page }) => {
    // ...
    await auto("add bacon to shopping cart", { page, test });
    // ...
  });

  test("some other test not using auto", async ({ page }) => {
    // ... this test would not appear in the snapshots because it doesn't use `auto`
  });

});

checkout.auto-spec.ts would contain something like:

import type { Page } from "@playwright/test";

// Each entry is either a replayable step or a number that redirects to another index.
type Snapshot = ((context: { page: Page; test: unknown }) => unknown) | number;

const tests: Record<string, Record<string, Snapshot[]>> = {};
// the text below would come from the full test name (including the `describe`, if present)
tests['(grocery purchase) shopping cart should have total items'] = {
  'open the groceries section': [
    // Instruction snapshots are stored as an array to allow for the instruction to be called multiple times in the same test.
    ({ page, test }) => {
      // auto-generated code goes here
    },
  ],
  'add bacon to shopping cart': [
    ({ page, test }) => {
      // auto-generated code goes here
    },
    ({ page, test }) => {
      // auto-generated code goes here for the second time when the same instruction is hit in the test
      // even though it is the same instruction, it could be different depending on what the test has done
      // A later optimisation of the snapshots could check if the instructions are identical, and
      //  then only store one variant in that case. An idea here is that we could store the index
      //  of the instruction to use instead, so that when reading, when we see a number we can redirect to that index.
    },
    1, // An example of the idea above to remove duplication.
       // This number means: use the instructions stored at index 1 of this array.
  ],
  'add cheese to shopping cart': [
    ({ page, test }) => {
      // auto-generated code goes here
    },
  ],
  'get the shopping cart item count': [
    ({ page, test }) => {
      // auto-generated code goes here
    },
  ],
};

tests['(grocery purchase) some other test'] = {
  'add bacon to shopping cart': [
    ({ page, test }) => {
      // auto-generated code goes here
    },
  ],
};

export {
  tests
};
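
The de-duplication idea in the comments above could be sketched roughly as follows. This is only an illustration: it assumes each snapshot is compared by its generated source text, and `dedupeSnapshots` and `resolveSnapshot` are hypothetical names, not part of the library.

```typescript
// Hypothetical helper: collapse repeated snapshots of one instruction.
// Each snapshot is represented by its generated source text; a repeat of an
// earlier snapshot is replaced by the index of its first occurrence.
function dedupeSnapshots(sources: string[]): Array<string | number> {
  const firstIndexBySource = new Map<string, number>();
  return sources.map((source, index) => {
    const firstIndex = firstIndexBySource.get(source);
    if (firstIndex !== undefined) {
      return firstIndex; // redirect to the earlier identical snapshot
    }
    firstIndexBySource.set(source, index);
    return source;
  });
}

// Reading side: follow at most one redirect back to the stored source text.
function resolveSnapshot(
  entries: Array<string | number>,
  index: number,
): string | undefined {
  const entry = entries[index];
  if (typeof entry === "number") {
    const target = entries[entry];
    return typeof target === "string" ? target : undefined;
  }
  return entry;
}
```

Because a redirect always points at the first occurrence, which is stored in full, a single hop is enough when reading.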

So a call to auto in a test within the checkout.spec.ts file would first check:

Is there a checkout.auto-spec.ts file (and are we not in "record" mode)?
- YES: import that file and, from the tests export, look up the test by its full name.
  - IF FOUND, look up the instruction.
    - IF FOUND, look up the snapshot by the index for this call.
      - (The instruction index would be tracked internally by the lib for each unique instruction in the test.)
      - If the entry found is a number, use that number as the index to find the instruction (a 'redirect' of sorts).
- ELSE (if any of the checks above fails):
  - run the OpenAI code to generate the instruction set
  - save the instruction set to the corresponding place in the snapshot file
  - execute the instruction set by invoking the process above to locate and execute the instructions
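
The lookup above could be sketched along these lines. Everything here is an assumption for illustration: the snapshots are held in memory rather than in a checkout.auto-spec.ts file, `generate` stands in for the OpenAI call, and `findStep`, `nextCallIndex` and `runAuto` are hypothetical names.

```typescript
// Hypothetical types: a snapshot entry is either a replayable step or a
// numeric redirect to another entry in the same array.
type Step = () => Promise<string | void>;
type Entry = Step | number;
type Snapshots = Record<string, Record<string, Entry[]>>;

// Tracks how many times each instruction has been called within each test.
const callCounts = new Map<string, number>();

function nextCallIndex(testName: string, instruction: string): number {
  const key = JSON.stringify([testName, instruction]);
  const index = callCounts.get(key) ?? 0;
  callCounts.set(key, index + 1);
  return index;
}

// Returns the recorded step for this call, or undefined on a snapshot miss.
function findStep(
  snapshots: Snapshots | undefined,
  testName: string,
  instruction: string,
  callIndex: number,
): Step | undefined {
  const entries = snapshots?.[testName]?.[instruction];
  const entry = entries?.[callIndex];
  if (typeof entry === "number") {
    // Follow a single redirect; a redirect to another number counts as a miss.
    const target = entries?.[entry];
    return typeof target === "function" ? target : undefined;
  }
  return entry;
}

// On a miss (or in record mode), generate the step, save it, then run it.
async function runAuto(
  snapshots: Snapshots,
  testName: string,
  instruction: string,
  generate: () => Promise<Step>,
  recordMode = false,
): Promise<string | void> {
  const callIndex = nextCallIndex(testName, instruction);
  let step = recordMode
    ? undefined
    : findStep(snapshots, testName, instruction, callIndex);
  if (!step) {
    step = await generate();
    if (!snapshots[testName]) snapshots[testName] = {};
    if (!snapshots[testName][instruction]) snapshots[testName][instruction] = [];
    snapshots[testName][instruction][callIndex] = step;
  }
  return step();
}
```

In this sketch the freshly generated step is executed directly instead of being located again through the lookup. Writing the saved step back to a file is left out.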

Just some of my musings on how this could work. You are welcome to poke at these ideas, and we can discuss them.

lucgagan commented 1 year ago

I've faced a few issues with trying to implement snapshots.

Consider a scenario where the instructions say "extract the first letter of the heading".

At the moment, the way this would work is:

> message {
  role: 'assistant',
  content: null,
  function_call: { name: 'locateElement', arguments: '{"cssSelector":"h1"}' }
}
> message {
  role: 'function',
  name: 'locateElement',
  content: '{"elementId":"3c00af66-536e-445b-8dfe-f4a48d980659"}'
}
> message {
  role: 'assistant',
  content: null,
  function_call: {
    name: 'locator_innerText',
    arguments: '{"elementId":"3c00af66-536e-445b-8dfe-f4a48d980659"}'
  }
}
> message {
  role: 'function',
  name: 'locator_innerText',
  content: '{"innerText":"Hello, Rayrun!"}'
}
+ > message {
+   role: 'assistant',
+   content: null,
+   function_call: { name: 'resultQuery', arguments: '{"query":"H"}' }
+ }
+ > message { role: 'function', name: 'resultQuery', content: '{"query":"H"}' }
> message {
  role: 'assistant',
  content: 'The first letter of the heading is "H".'
}

Notice that OpenAI understood that we need to use locator_innerText to extract the heading text ("Hello, Rayrun!"), but it then used the model itself to extract just the first letter of the heading ("H"). There is no good way to store this in a snapshot, because that last step is not a browser action that can be replayed. I am going to experiment with asking the AI to generate JavaScript that performs the text manipulation when needed.
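
One way the generated-JavaScript idea might look when stored in a snapshot is sketched below. The `TextQuerySnapshot` shape and the `applyTransform` helper are assumptions made for illustration, not the project's actual format; in a real replay the input text would come from Playwright (for example `page.locator(cssSelector).innerText()`).

```typescript
// Hypothetical snapshot shape: the selector found by the model plus the body
// of a JavaScript function that post-processes the extracted text.
interface TextQuerySnapshot {
  cssSelector: string;
  transformBody: string; // body of (text) => result
}

// Rebuilds the transform from its stored source and applies it.
// This executes stored code, so it assumes snapshot files are trusted and
// reviewed like any other test code.
function applyTransform(snapshot: TextQuerySnapshot, innerText: string): string {
  const transform = new Function("text", snapshot.transformBody) as (
    text: string,
  ) => unknown;
  return String(transform(innerText));
}

const headingSnapshot: TextQuerySnapshot = {
  cssSelector: "h1",
  transformBody: "return text.charAt(0);",
};

// With the heading text from the log above, replay needs no model call.
const firstLetter = applyTransform(headingSnapshot, "Hello, Rayrun!");
console.log(firstLetter); // "H"
```

Storing the transform as source text keeps the whole step replayable, at the cost of evaluating code read from the snapshot.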

lucgagan commented 1 year ago

Indeed, this seems to solve that issue. This brings us closer to snapshot support.

lucgagan / auto-playwright

add snapshots #1