shafkevi / lambda-bedrock-s3-streaming-rag

Fully serverless streaming RAG application
MIT No Attribution
25 stars · 6 forks

Suggestions to track down inconsistent responses from claude-instant-v1 #2

Open Analect opened 4 months ago

Analect commented 4 months ago

@shafkevi @giusedroid ... thanks for these two sister projects exploring a serverless RAG approach that combines Bedrock with LanceDB. I watched your session at the AWS Innovate event and decided to explore them further. There is a good mix of approaches across both repos, presenting different ways to combine these technologies using both local and remote (Lambda-based) components. I'll cover this in some questions below, but bringing the best parts of both projects into one master project would be great.

Data Ingest

Rather than the 'bedrock-docs' example, I decided to use the EU AI Act, a roughly 250-page PDF, to test how this approach works.

I think the screenshot below relates to the ingestion Lambda kicked off by a file being added to S3 in the sister project. It seems it was taking in chunk sizes bigger than the 1000 specified, and perhaps it wasn't generating embeddings for the full chunk. Did you experience anything similar?

Screenshot: aws-rag-1-sam-lambda-invocation2
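The chunk-size concern above can be sanity-checked with a tiny helper (a sketch, not code from this repo; `chunks` stands for whatever list of text chunks the ingest splitter produces):

```python
# Sketch: flag any chunk longer than the configured size before embedding.
# `chunks` stands for whatever list of text chunks the ingest step produced.
def oversized_chunks(chunks, max_len=1000):
    """Return (index, length) for every chunk exceeding max_len characters."""
    return [(i, len(c)) for i, c in enumerate(chunks) if len(c) > max_len]
```

Logging `len(chunk)` right before the embedding call in the ingest Lambda would confirm whether the splitter is actually honouring the 1000-character setting.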

Getting a Session Token

I'm more used to just supplying a key and secret key as credentials. This is how I went about getting a session token: `aws sts get-session-token --duration-seconds 43200 --profile default`. Is there a better way? Even though this should last me 12 hours, I was sometimes getting a token-expired message seemingly well before that window closed. Not sure why, but it required me to generate a new token each time.
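For reference, that session-token flow could be scripted roughly like this (a sketch assuming the `jq` CLI is installed; the canned `resp` here stands in for the real STS response):

```shell
# Sketch: export the temporary credentials from a GetSessionToken-style
# response so the AWS CLI / SDKs pick them up from the environment.
# In practice resp would come from:
#   resp=$(aws sts get-session-token --duration-seconds 43200 --profile default)
resp='{"Credentials":{"AccessKeyId":"ASIAEXAMPLE","SecretAccessKey":"examplesecret","SessionToken":"exampletoken","Expiration":"2024-01-01T12:00:00Z"}}'
export AWS_ACCESS_KEY_ID=$(echo "$resp" | jq -r '.Credentials.AccessKeyId')
export AWS_SECRET_ACCESS_KEY=$(echo "$resp" | jq -r '.Credentials.SecretAccessKey')
export AWS_SESSION_TOKEN=$(echo "$resp" | jq -r '.Credentials.SessionToken')
echo "exported credentials for $AWS_ACCESS_KEY_ID"
```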

Token expired response: aws-rag-2-react-app-response-4-token-expired

Inconsistent Responses

The responses I was getting from Claude Instant were, I felt, not that impressive, so I was trying to understand the best way to peer under the hood a bit more in terms of what was happening. It seems to me that not much context was being returned from LanceDB to feed the prompt to Claude.

However, there were also instances where I re-asked the same questions (as per below) and the responses were inconsistent: some short, some longer ... not necessarily in any given order. Did either of you experience any of these problems when experimenting?

Worst response: aws-rag-2-react-app-response-4c

Poor response: aws-rag-2-react-app-response-4b

Best response: aws-rag-2-react-app-response-4

Questions

  1. How do you suggest I might check that the whole document was properly embedded?
  2. Are there good ways to capture what LanceDB is returning as context to a query and passing to Claude?
  3. Is there a way of seeing Claude's responses directly on AWS infrastructure, so I can isolate where this cutting-off of the streaming response is happening, and whether it relates to the Lambda-LanceDB coupling and cold-start issues?
  4. I haven't yet tried hooking up the more powerful (more expensive) Claude model ... I will do that when I have the above resolved.
  5. To combine the convenience of the sister project's Lambda linked to an S3 bucket, is it just a case of modifying the SAM template.yaml to add that part to this project, so that two Lambdas sit in the solution: one for data ingestion and one for data querying?
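(For question 1, one crude, repo-agnostic check would be to compare the characters covered by the stored chunks against the source text length; a sketch, with `coverage_ratio` being an illustrative name, not something from this project:)

```python
# Sketch (not tied to this repo's table layout): estimate how much of the
# source text the stored chunks cover. Chunk overlap inflates the ratio, so
# a value near or above 1.0 suggests full coverage, while a value well below
# 1.0 suggests the tail of the document was never embedded.
def coverage_ratio(chunks, full_text):
    covered = sum(len(c) for c in chunks)
    return covered / max(len(full_text), 1)
```

A row count is another quick check: with a splitter set to ~1000 characters, a 250-page PDF should yield on the order of (total characters / 1000) rows in the table.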

Thanks for your guidance. Colum

giusedroid commented 4 months ago

Thank you so much for taking the time to test this out.

I'll start replying to the most immediate question. Please stay tuned for the others!

Getting a session token

I usually use AWS Amplify and Cognito Identity Pools. I am pasting some code below to illustrate what I mean with a sample react app, then I'll add some links to get started.

import { useState, useEffect } from 'react'

// (1) import components from Amplify UI
import {
  Authenticator,
  Flex,
  Button,
  Input,
  Divider,
  useTheme
}
from '@aws-amplify/ui-react';

// (2) import the StorageManager component 
import { StorageManager } from '@aws-amplify/ui-react-storage';

// (3) get basic styles from Amplify UI
import '@aws-amplify/ui-react/styles.css';

import { Amplify, Auth } from 'aws-amplify';
import awsExports from './aws-exports';
Amplify.configure(awsExports);

import { fetchEventSource } from "@microsoft/fetch-event-source";
import { SignatureV4 } from "@smithy/signature-v4";
import { Sha256 } from "@aws-crypto/sha256-js";

function Header({ signOut, user }) {

  const { tokens } = useTheme();

  return (
    <Flex
        backgroundColor={tokens.colors.purple[60]}
        direction="row"
        justifyContent="space-between"
    >
        <h3>Hello {user.attributes.email}!</h3>
        <Button 
        colorTheme="overlay"
        onClick={signOut}>Sign out</Button>
    </Flex>
  );
}

function GenAI() {

  const { tokens } = useTheme();

  const [creds, setCreds] = useState({})

  const [searchQuery, setSearchQuery] = useState();
  const [chat, setChat] = useState([]);

  useEffect(() => {
    Auth.currentCredentials().then(setCreds)
  }, []) // empty deps: fetch credentials once on mount, not on every render

  const streamData = async () => {

    setChat([]);

    const sigv4 = new SignatureV4({
      service: "lambda",
      region: creds.identityId.split(":")[0],
      credentials: creds,
      sha256: Sha256
    });

    // INSERT YOUR LAMBDA URL HERE
    const apiUrl = new URL("<LAMBDA STREAMING URL HERE>");

    const query = document.getElementById("searchQuery").value;

    setSearchQuery(query);

    let body = JSON.stringify({
      query: query,
      // Can use any Bedrock available models
      model: "anthropic.claude-instant-v1",
      // This is required for the @microsoft/fetch-event-source library to understand the streaming response
      streamingFormat: "fetch-event-source"
    });

    let signed = await sigv4.sign({
      body,
      method: "POST",
      hostname: apiUrl.hostname,
      path: apiUrl.pathname.toString(),
      protocol: apiUrl.protocol,
      headers: {
        "Content-Type": "application/json",
        host: apiUrl.hostname
      }
    });

    await fetchEventSource(apiUrl.origin, {
      method: signed.method,
      headers: signed.headers,
      body,
      onopen(res) {
        if (res.ok && res.status === 200) {
          console.log("Connection made ", res);
        } else if (
          res.status >= 400 &&
          res.status < 500 &&
          res.status !== 429
        ) {
          console.log("Client-side error ", res);
        }
      },
      onmessage(event) {
        setChat((data) => [...data, event.data]);
        // Important to set the data this way, otherwise old data may be overwritten if the stream is too fast
      },
      onclose() {
        console.log("Connection closed by the server");
      },
      onerror(err) {
        console.log("There was an error from server", err);
      }
    });
  }

  return (
    <>
      <div>
        <Input
          id="searchQuery"
          placeholder="ask a question"
          fontWeight={tokens.fontWeights.bold}
          fontSize={tokens.fontSizes.xl}
          padding="xl"
          textColor={tokens.colors.purple[60]}
          border={`1px solid ${tokens.colors.purple[60]}`}
        />
      </div>
      <div>
        <Button  variant="primary" onClick={() => streamData()}>Submit Question</Button>
      </div>
      <div className="qa_container">
        <div>
          <b>Question:</b> {searchQuery}
        </div>
        <div>
          <b>Response:</b> {chat.join("")}
        </div>
      </div>
    </>
  )
}

function MainComponent({ signOut, user }) {
  return (
    <Flex direction="column">
      <Header user={user} signOut={signOut} backgroundcolor="purple.80"/>
      <Divider
        label="RAG Source - LanceDB backed by S3"
        size="large"
        orientation="horizontal" />
      <Flex 
        direction="column" 
        justifyContent="space-around"
        alignItems="center"
        >
          <StorageManager
            acceptedFileTypes={[
              // you can list file extensions:
              '.pdf',
              // or MIME types:
            ]}
            accessLevel="private"
            maxFileCount={5}
            // Size is in bytes
            maxFileSize={10000000}
          />
      </Flex>
      <Divider
        label="LLM Prompt - Amazon Bedrock"
        size="large"
        orientation="horizontal" />
      <Flex
        direction="column" 
        justifyContent="space-around"
        alignItems="center"
      >
        <GenAI />
      </Flex>
    </Flex>
  )
}

export default function App() {
  return (

    <Authenticator>
      {({signOut, user}) => <MainComponent user={user} signOut={signOut} />}
    </Authenticator>

  );
}

Make sure you have configured authentication and pulled your Amplify configs.
For this sample to work, you must have an aws-exports file in the src/ folder.
You'll get this by creating an Amplify application and pulling it into this repo.
Here's how to get started with Amplify and its authentication system:

https://docs.amplify.aws/react/start/getting-started/
https://docs.amplify.aws/react/build-a-backend/auth/set-up-auth/

My package.json:

{
  "name": "rag-app",
  "private": true,
  "version": "0.0.0",
  "type": "module",
  "scripts": {
    "dev": "vite",
    "build": "vite build",
    "lint": "eslint . --ext js,jsx --report-unused-disable-directives --max-warnings 0",
    "preview": "vite preview"
  },
  "dependencies": {
    "@aws-amplify/ui-react": "^5.3.1",
    "@aws-amplify/ui-react-storage": "^2.3.1",
    "@aws-crypto/sha256-js": "5.2.0",
    "@microsoft/fetch-event-source": "2.0.1",
    "@smithy/signature-v4": "2.0.12",
    "aws-amplify": "^5.3.11",
    "loader-utils": "^3.2.1",
    "react": "18.2.0",
    "react-dom": "18.2.0"
  },
  "devDependencies": {
    "@types/react": "^18.2.15",
    "@types/react-dom": "^18.2.7",
    "@vitejs/plugin-react": "^4.0.3",
    "eslint": "^8.45.0",
    "eslint-plugin-react": "^7.32.2",
    "eslint-plugin-react-hooks": "^4.6.0",
    "eslint-plugin-react-refresh": "^0.4.3",
    "vite": "^4.4.5"
  }
}

Vite config --> ./vite.config.js

import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'

// https://vitejs.dev/config/
export default defineConfig({
  plugins: [react()],
  server:{port: 8080},
  define:{global:{}}
})
giusedroid commented 4 months ago
Screenshot 2024-03-14 at 14 27 04

It should render something like this. If you want I can post the CSS stuff too :D

Analect commented 4 months ago

Hmm. Thanks for the screenshot above ... and the other detail ref Amplify. I was just getting a bare-bones React app rendered like below, without any drag-and-drop capability as per the gif on the project. I followed the README in the testing folder.

image

In data-pipeline/ingest.py, could other metadata be captured in LanceDB, such as the page number or some way of linking back to the original document for context?

schema = pa.schema(
  [
      pa.field("vector", pa.list_(pa.float32(), 1536)), # document vector with 1.5k dimensions (TitanEmbedding)
      pa.field("text", pa.string()), # langchain requires it
      pa.field("id", pa.string()) # langchain requires it
  ])

I noticed in the aws sync you are just syncing the embeddings back to the S3 bucket, rather than the docs too. It would be interesting to show links in the response back to the items of context that were fed to the LLM.

Have you thought about whether LanceDB could be used (a different table) to give the Q&A on the GUI some memory? ... and to capture LLM responses, for troubleshooting / refinement?

giusedroid commented 4 months ago

Yup! You can do this:

    schema = pa.schema(
        [
            pa.field("vector", pa.list_(pa.float32(), 1536)),
            pa.field("text", pa.string()),
            pa.field("id", pa.string()),
            pa.field("source", pa.string()), # document source
            pa.field("page", pa.string()) # page
        ]
    )
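On the ingest side, assuming the loader attaches `source` and `page` metadata to each document (as LangChain's PyPDFLoader does), the extra fields might be populated roughly like this. `to_rows` and `embed` are illustrative names, not part of the repo:

```python
# Sketch: map loader documents (objects with .page_content and .metadata)
# into rows matching the extended schema above. `embed` is assumed to be a
# callable that returns the embedding vector for a piece of text.
def to_rows(docs, embed):
    rows = []
    for i, doc in enumerate(docs):
        rows.append({
            "vector": embed(doc.page_content),
            "text": doc.page_content,
            "id": str(i),
            "source": str(doc.metadata.get("source", "")),
            "page": str(doc.metadata.get("page", "")),  # schema declares page as a string
        })
    return rows
```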
giusedroid commented 4 months ago

For memory, we have seen approaches using DynamoDB and defining "Sessions". https://github.com/aws-samples/aws-genai-llm-chatbot/blob/main/lib/chatbot-api/chatbot-dynamodb-tables/index.ts
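A minimal sketch of what such a sessions table could look like in SAM template syntax (resource and attribute names here are illustrative, not taken from that repo):

```yaml
# Hypothetical DynamoDB table for chat memory: one session per partition key,
# messages ordered by sort key.
SessionsTable:
  Type: AWS::DynamoDB::Table
  Properties:
    BillingMode: PAY_PER_REQUEST
    AttributeDefinitions:
      - AttributeName: SessionId
        AttributeType: S
      - AttributeName: Timestamp
        AttributeType: S
    KeySchema:
      - AttributeName: SessionId
        KeyType: HASH
      - AttributeName: Timestamp
        KeyType: RANGE
```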

Analect commented 4 months ago

@giusedroid ... I haven't used Amplify before. I was going to look at implementing your code, but before I do: there is a suggestion on the Amplify docs site that a new code-first DX version (Gen 2) exists. Will your code work with this, or should I use the Gen 1 legacy version?

In terms of code structure, I've added a new folder react_updated below. In terms of file positioning, does that look right? Should I also have an index.js and styles.css under /src? Do I also need a public/index.html file?

Thanks for your guidance. Colum

lambda-bedrock-s3-streaming-rag/testing$ tree . -L 3 -I node_modules
.
├── event.json
├── react
│   ├── package.json
│   ├── package-lock.json
│   ├── public
│   │   └── index.html
│   └── src
│       ├── App.js
│       ├── index.js
│       └── styles.css
├── react_updated  <-- Amplify-driven version
│   ├── package.json <-- added from code shared above
│   └── vite.config.js <-- should that go here?
│   └── src
│       ├── App.js <-- copied code from above
│       └── aws-exports.json <-- will add from amplify
├── README.md
├── test-no-auth.sh
└── test-with-auth.sh
giusedroid commented 4 months ago

Not sure! I built that before Gen 2 was a thing :( If you use the "Legacy" version of Amplify, it should work. Unfortunately I haven't tested with the new version :(

giusedroid commented 4 months ago

And yes, the vite config is in the right place, but you should get one if you follow this tutorial: https://docs.amplify.aws/react/start/getting-started/ and https://docs.amplify.aws/react/start/getting-started/setup/

Analect commented 4 months ago

Thanks. I'll give it a go.

Analect commented 3 months ago

@giusedroid ... I have this almost working, but I'm stuck on something related to Auth. I think it might be related to me using slightly more up-to-date versions of the amplify and react libraries (see package.json lower down). I think my problem, `aws-amplify does not provide an export named 'Auth'`, might be related to this SO answer, which admittedly pertains to Vue, but could be hinting at a related problem.

So I don't think the import of Auth below works any longer. Any thoughts on how I might fix this? Sorry ... I'm not very well schooled in these front-end frameworks. Thanks.

import { Amplify, Auth } from 'aws-amplify';
...

  useEffect(() => {
    Auth.currentCredentials().then(setCreds)
  })

image

{
  "name": "amplify-serverless-rag",
  "private": true,
  "version": "0.0.0",
  "type": "module",
  "scripts": {
    "dev": "vite",
    "build": "vite build",
    "lint": "eslint . --ext js,jsx --report-unused-disable-directives --max-warnings 0",
    "preview": "vite preview"
  },
  "dependencies": {
    "@aws-amplify/ui-react": "^6.1.6",
    "@aws-amplify/ui-react-storage": "^3.0.16",
    "@aws-crypto/sha256-js": "^5.2.0",
    "@microsoft/fetch-event-source": "^2.0.1",
    "@smithy/signature-v4": "^2.2.0",
    "aws-amplify": "^6.0.20",
    "loader-utils": "^3.2.1",
    "react": "^18.2.0",
    "react-dom": "^18.2.0"
  },
  "devDependencies": {
    "@types/react": "^18.2.64",
    "@types/react-dom": "^18.2.21",
    "@vitejs/plugin-react": "^4.2.1",
    "eslint": "^8.57.0",
    "eslint-plugin-react": "^7.34.0",
    "eslint-plugin-react-hooks": "^4.6.0",
    "eslint-plugin-react-refresh": "^0.4.5",
    "vite": "^5.1.6"
  }
}
giusedroid commented 3 months ago

@Analect do you want to have a call say this afternoon? I'm free at 2pm GMT, can send you call details via LinkedIn?

giusedroid commented 3 months ago

Some progress porting this to v6.

import { useState, useEffect } from 'react'

// (1) import components from Amplify UI
import {
  Authenticator,
  withAuthenticator,
  Flex,
  Button,
  Input,
  Divider,
  useTheme
}
from '@aws-amplify/ui-react';

// (2) import the StorageManager component 
import { StorageManager } from '@aws-amplify/ui-react-storage';

import { fetchAuthSession } from 'aws-amplify/auth';

// (3) get basic styles from Amplify UI
import '@aws-amplify/ui-react/styles.css';

import { Amplify } from 'aws-amplify';
// import {something} from 'aws-amplify/auth'
import amplifyconfig from './amplifyconfiguration.json';
Amplify.configure(amplifyconfig);

import { fetchEventSource } from "@microsoft/fetch-event-source";
import { SignatureV4 } from "@smithy/signature-v4";
import { Sha256 } from "@aws-crypto/sha256-js";

function Header({ signOut, user }) {

  const { tokens } = useTheme();

  return (
    <Flex
        backgroundColor={tokens.colors.purple[60]}
        direction="row"
        justifyContent="space-between"
    >
        <h3>Hello {user.signInDetails.loginId}!</h3>
        <Button 
        colorTheme="overlay"
        onClick={signOut}>Sign out</Button>
    </Flex>
  );
}

function GenAI({user}) {

  const { tokens } = useTheme();

  const [creds, setCreds] = useState({})

  const [searchQuery, setSearchQuery] = useState();
  const [chat, setChat] = useState([]);

  useEffect( () => {
    console.log("Genai component");
    console.log(user);

    // fetchAuthSession().then(setCreds);

    // Auth.currentCredentials().then(setCreds)
  })

  const streamData = async () => {

    setChat([]);

    const sigv4 = new SignatureV4({
      service: "lambda",
      region: creds.identityId.split(":")[0],
      credentials: creds,
      sha256: Sha256
    });

    const apiUrl = new URL("YOUR ENDPOINT HERE");

    const query = document.getElementById("searchQuery").value;

    setSearchQuery(query);

    let body = JSON.stringify({
      query: query,
      // Can use any Bedrock available models
      model: "anthropic.claude-instant-v1",
      // This is required for the @microsoft/fetch-event-source library to understand the streaming response
      streamingFormat: "fetch-event-source"
    });

    let signed = await sigv4.sign({
      body,
      method: "POST",
      hostname: apiUrl.hostname,
      path: apiUrl.pathname.toString(),
      protocol: apiUrl.protocol,
      headers: {
        "Content-Type": "application/json",
        host: apiUrl.hostname
      }
    });

    await fetchEventSource(apiUrl.origin, {
      method: signed.method,
      headers: signed.headers,
      body,
      onopen(res) {
        if (res.ok && res.status === 200) {
          console.log("Connection made ", res);
        } else if (
          res.status >= 400 &&
          res.status < 500 &&
          res.status !== 429
        ) {
          console.log("Client-side error ", res);
        }
      },
      onmessage(event) {
        setChat((data) => [...data, event.data]);
        // Important to set the data this way, otherwise old data may be overwritten if the stream is too fast
      },
      onclose() {
        console.log("Connection closed by the server");
      },
      onerror(err) {
        console.log("There was an error from server", err);
      }
    });
  }

  return (
    <>
      <div>
        <Input
          id="searchQuery"
          placeholder="ask a question"
          fontWeight={tokens.fontWeights.bold}
          fontSize={tokens.fontSizes.xl}
          padding="xl"
          textColor={tokens.colors.purple[60]}
          border={`1px solid ${tokens.colors.purple[60]}`}
        />
      </div>
      <div>
        <Button  variant="primary" onClick={() => streamData()}>Submit Question</Button>
      </div>
      <div className="qa_container">
        <div>
          <b>Question:</b> {searchQuery}
        </div>
        <div>
          <b>Response:</b> {chat.join("")}
        </div>
      </div>
    </>
  )
}

function MainComponent({ signOut, user }) {
  return (
    <Flex direction="column">
      <Header user={user} signOut={signOut} backgroundcolor="purple.80"/>
      <Divider
        label="RAG Source - LanceDB backed by S3"
        size="large"
        orientation="horizontal" />
      <Flex 
        direction="column" 
        justifyContent="space-around"
        alignItems="center"
        >
          <StorageManager
            acceptedFileTypes={[
              // you can list file extensions:
              '.pdf',
              // or MIME types:
            ]}
            accessLevel="private"
            maxFileCount={5}
            // Size is in bytes
            maxFileSize={10000000}
          />
      </Flex>
      <Divider
        label="LLM Prompt - Amazon Bedrock"
        size="large"
        orientation="horizontal" />
      <Flex
        direction="column" 
        justifyContent="space-around"
        alignItems="center"
      >
        <GenAI user={user}/>
      </Flex>
    </Flex>
  )
}

// export default function App() {
//   return (

//     <Authenticator>
//       {({signOut, user}) => <MainComponent user={user} signOut={signOut} />}
//     </Authenticator>

//   );
// }

export default withAuthenticator(MainComponent)
giusedroid commented 3 months ago
import { useState, useEffect } from 'react'

// (1) import components from Amplify UI
import {
  Authenticator,
  withAuthenticator,
  Flex,
  Button,
  Input,
  Divider,
  useTheme
}
from '@aws-amplify/ui-react';

// (2) import the StorageManager component 
import { StorageManager } from '@aws-amplify/ui-react-storage';

import { fetchAuthSession } from 'aws-amplify/auth';

// (3) get basic styles from Amplify UI
import '@aws-amplify/ui-react/styles.css';

// amplify v6
import { Amplify } from 'aws-amplify';
import amplifyconfig from './amplifyconfiguration.json';
Amplify.configure(amplifyconfig);

import { fetchEventSource } from "@microsoft/fetch-event-source";
import { SignatureV4 } from "@smithy/signature-v4";
import { Sha256 } from "@aws-crypto/sha256-js";

function Header({ signOut, user }) {

  const { tokens } = useTheme();

  return (
    <Flex
        backgroundColor={tokens.colors.purple[60]}
        direction="row"
        justifyContent="space-between"
    >
        <h3>Hello {user.signInDetails.loginId}!</h3>
        <Button 
        colorTheme="overlay"
        onClick={signOut}>Sign out</Button>
    </Flex>
  );
}

function GenAI({user}) {

  const { tokens } = useTheme();

  const [creds, setCreds] = useState({})

  const [searchQuery, setSearchQuery] = useState();
  const [chat, setChat] = useState([]);

  useEffect(() => {

    async function getSession(){

      const {credentials, identityId} = await fetchAuthSession();

      const {
        accessKeyId,
        secretAccessKey,
        sessionToken
      } = credentials;

      setCreds({
        accessKeyId,
        secretAccessKey,
        sessionToken,
        identityId
      });

      console.log(credentials);

    }

    getSession();

  }, [])

  const streamData = async () => {

    setChat([]);

    const sigv4 = new SignatureV4({
      service: "lambda",
      region: creds.identityId.split(":")[0],
      credentials: creds,
      sha256: Sha256
    });

    const apiUrl = new URL("YOUR URL HERE");

    const query = document.getElementById("searchQuery").value;

    setSearchQuery(query);

    let body = JSON.stringify({
      query: query,
      // Can use any Bedrock available models
      model: "anthropic.claude-instant-v1",
      // This is required for the @microsoft/fetch-event-source library to understand the streaming response
      streamingFormat: "fetch-event-source"
    });

    let signed = await sigv4.sign({
      body,
      method: "POST",
      hostname: apiUrl.hostname,
      path: apiUrl.pathname.toString(),
      protocol: apiUrl.protocol,
      headers: {
        "Content-Type": "application/json",
        host: apiUrl.hostname
      }
    });

    await fetchEventSource(apiUrl.origin, {
      method: signed.method,
      headers: signed.headers,
      body,
      onopen(res) {
        if (res.ok && res.status === 200) {
          console.log("Connection made ", res);
        } else if (
          res.status >= 400 &&
          res.status < 500 &&
          res.status !== 429
        ) {
          console.log("Client-side error ", res);
        }
      },
      onmessage(event) {
        setChat((data) => [...data, event.data]);
        // Important to set the data this way, otherwise old data may be overwritten if the stream is too fast
      },
      onclose() {
        console.log("Connection closed by the server");
      },
      onerror(err) {
        console.log("There was an error from server", err);
      }
    });
  }

  return (
    <>
      <div>
        <Input
          id="searchQuery"
          placeholder="ask a question"
          fontWeight={tokens.fontWeights.bold}
          fontSize={tokens.fontSizes.xl}
          padding="xl"
          textColor={tokens.colors.purple[60]}
          border={`1px solid ${tokens.colors.purple[60]}`}
        />
      </div>
      <div>
        <Button  variant="primary" onClick={() => streamData()}>Submit Question</Button>
      </div>
      <div className="qa_container">
        <div>
          <b>Question:</b> {searchQuery}
        </div>
        <div>
          <b>Response:</b> {chat.join("")}
        </div>
      </div>
    </>
  )
}

function MainComponent({ signOut, user }) {
  return (
    <Flex direction="column">
      <Header user={user} signOut={signOut} backgroundcolor="purple.80"/>
      <Divider
        label="RAG Source - LanceDB backed by S3"
        size="large"
        orientation="horizontal" />
      <Flex 
        direction="column" 
        justifyContent="space-around"
        alignItems="center"
        >
          <StorageManager
            acceptedFileTypes={[
              // you can list file extensions:
              '.pdf',
              // or MIME types:
            ]}
            accessLevel="private"
            maxFileCount={5}
            // Size is in bytes
            maxFileSize={10000000}
          />
      </Flex>
      <Divider
        label="LLM Prompt - Amazon Bedrock"
        size="large"
        orientation="horizontal" />
      <Flex
        direction="column" 
        justifyContent="space-around"
        alignItems="center"
      >
        <GenAI user={user}/>
      </Flex>
    </Flex>
  )
}

export default withAuthenticator(MainComponent)

We should be back in business, now onto the deployment of the S3 + lambda

giusedroid commented 3 months ago

Added a select to choose a model, provided that you create a file called appconfig.json in ./src, as follows.

appconfig.json

{
    "models":[
    { "label": "amazon.titan-tg1-large", "value": "amazon.titan-tg1-large" },
    { "label": "amazon.titan-text-lite-v1", "value": "amazon.titan-text-lite-v1" },
    { "label": "amazon.titan-text-express-v1", "value": "amazon.titan-text-express-v1" },
    { "label": "ai21.j2-grande-instruct", "value": "ai21.j2-grande-instruct" },
    { "label": "ai21.j2-jumbo-instruct", "value": "ai21.j2-jumbo-instruct" },
    { "label": "ai21.j2-mid", "value": "ai21.j2-mid" },
    { "label": "ai21.j2-mid-v1", "value": "ai21.j2-mid-v1" },
    { "label": "ai21.j2-ultra", "value": "ai21.j2-ultra" },
    { "label": "ai21.j2-ultra-v1", "value": "ai21.j2-ultra-v1" },
    { "label": "anthropic.claude-instant-v1", "value": "anthropic.claude-instant-v1" },
    { "label": "anthropic.claude-v1", "value": "anthropic.claude-v1" },
    { "label": "anthropic.claude-v2", "value": "anthropic.claude-v2" },
    { "label": "cohere.command-text-v14", "value": "cohere.command-text-v14" },
    { "label": "cohere.command-light-text-v14", "value": "cohere.command-light-text-v14" }
  ]

}

App.jsx

import { useState, useEffect } from 'react'

// (1) import components from Amplify UI
import {
  withAuthenticator,
  Flex,
  Button,
  Input,
  Divider,
  SelectField,
  useTheme
}
from '@aws-amplify/ui-react';

// (2) import the StorageManager component 
import { StorageManager } from '@aws-amplify/ui-react-storage';

import { fetchAuthSession } from 'aws-amplify/auth';

// (3) get basic styles from Amplify UI
import '@aws-amplify/ui-react/styles.css';

// amplify v6
import { Amplify } from 'aws-amplify';
import amplifyconfig from './amplifyconfiguration.json';
Amplify.configure(amplifyconfig);

import {models} from './appconfig.json';

import { fetchEventSource } from "@microsoft/fetch-event-source";
import { SignatureV4 } from "@smithy/signature-v4";
import { Sha256 } from "@aws-crypto/sha256-js";

function Header({ signOut, user }) {

  const { tokens } = useTheme();

  return (
    <Flex
        backgroundColor={tokens.colors.purple[60]}
        direction="row"
        justifyContent="space-between"
    >
        <h3>Hello {user?.signInDetails?.loginId ? user.signInDetails.loginId : "uh?!"}!</h3>
        <Button 
        colorTheme="overlay"
        onClick={signOut}>Sign out</Button>
    </Flex>
  );
}

function GenAI({user, models}) {

  const { tokens } = useTheme();

  const [creds, setCreds] = useState({})

  const [searchQuery, setSearchQuery] = useState();
  const [chat, setChat] = useState([]);
  const [model, setModel] = useState("anthropic.claude-instant-v1");

  useEffect(() => {

    async function getSession(){

      const {credentials, identityId} = await fetchAuthSession();

      const {
        accessKeyId,
        secretAccessKey,
        sessionToken
      } = credentials;

      setCreds({
        accessKeyId,
        secretAccessKey,
        sessionToken,
        identityId
      });

      console.log(credentials);

    }

    getSession();

  }, [])

  const streamData = async () => {

    setChat([]);

    const sigv4 = new SignatureV4({
      service: "lambda",
      region: creds.identityId.split(":")[0],
      credentials: creds,
      sha256: Sha256
    });

    const apiUrl = new URL("YOUR URL HERE");

    const query = document.getElementById("searchQuery").value;

    setSearchQuery(query);

    let body = JSON.stringify({
      query: query,
      // Can use any Bedrock available models
      model: model,
      // This is required for the @microsoft/fetch-event-source library to understand the streaming response
      streamingFormat: "fetch-event-source"
    });

    let signed = await sigv4.sign({
      body,
      method: "POST",
      hostname: apiUrl.hostname,
      path: apiUrl.pathname.toString(),
      protocol: apiUrl.protocol,
      headers: {
        "Content-Type": "application/json",
        host: apiUrl.hostname
      }
    });

    console.log(signed);

    await fetchEventSource(apiUrl.origin, {
      method: signed.method,
      headers: signed.headers,
      body,
      onopen(res) {
        if (res.ok && res.status === 200) {
          console.log("Connection made ", res);
        } else if (
          res.status >= 400 &&
          res.status < 500 &&
          res.status !== 429
        ) {
          console.log("Client-side error ", res);
        }
      },
      onmessage(event) {
        setChat((data) => [...data, event.data]);
        // Important to set the data this way, otherwise old data may be overwritten if the stream is too fast
      },
      onclose() {
        console.log("Connection closed by the server");
      },
      onerror(err) {
        console.log("There was an error from server", err);
      }
    });
  }

  return (
    <>
      <div>

        <SelectField
          defaultValue={model} // model state is a plain string, not an object
          onChange={(event) => {
            console.log(event.target.value);
            setModel(event.target.value);
          }}
          options={models.map(x => x.value)}
          label="Select LLM"
        >
        </SelectField>

        <Input
          id="searchQuery"
          placeholder="ask a question"
          fontWeight={tokens.fontWeights.bold}
          fontSize={tokens.fontSizes.xl}
          padding="xl"
          textColor={tokens.colors.purple[60]}
          border={`1px solid ${tokens.colors.purple[60]}`}
        />
      </div>
      <div>
        <Button variant="primary" onClick={() => streamData()}>Submit Question</Button>
      </div>
      <div className="qa_container">
        <div>
          <b>Question:</b> {searchQuery}
        </div>
        <div>
          <b>Response:</b> {chat.join("")}
        </div>
      </div>
    </>
  )
}

function MainComponent({ signOut, user }) {
  return (
    <Flex direction="column">
      <Header user={user} signOut={signOut} backgroundcolor="purple.80"/>
      <Divider
        label="RAG Source - LanceDB backed by S3"
        size="large"
        orientation="horizontal" />
      <Flex 
        direction="column" 
        justifyContent="space-around"
        alignItems="center"
        >
          <StorageManager
            acceptedFileTypes={[
              // you can list file extensions:
              '.pdf',
              // or MIME types:
            ]}
            accessLevel="private"
            maxFileCount={5}
            // Size is in bytes
            maxFileSize={10000000}
          />
      </Flex>
      <Divider
        label="LLM Prompt - Amazon Bedrock"
        size="large"
        orientation="horizontal" />
      <Flex
        direction="column" 
        justifyContent="space-around"
        alignItems="center"
      >
        <GenAI user={user} models={models}/>
      </Flex>
    </Flex>
  )
}

export default withAuthenticator(MainComponent)
Analect commented 3 months ago

@giusedroid ... that's really great. Thanks. I'll have to switch to using Oregon to avail of these LLMs. In terms of embeddings, I think the set-up defaults to amazon.titan-embed-text-v1, is that correct? The list below is the config that gets output by that other chatbot-llm tool. In the other architecture you are working on, does it handle different embedding types, and if so, have you thought about how that is handled in LanceDB ... is there a separate table per user-embedding pair?

"embeddingsModels": [
      {
        "provider": "sagemaker",
        "name": "intfloat/multilingual-e5-large",
        "dimensions": 1024
      },
      {
        "provider": "sagemaker",
        "name": "sentence-transformers/all-MiniLM-L6-v2",
        "dimensions": 384
      },
      {
        "provider": "bedrock",
        "name": "amazon.titan-embed-text-v1",
        "dimensions": 1536,
        "default": true
      },
      {
        "provider": "bedrock",
        "name": "amazon.titan-embed-image-v1",
        "dimensions": 1024
      },
      {
        "provider": "bedrock",
        "name": "cohere.embed-english-v3",
        "dimensions": 1024
      },
      {
        "provider": "bedrock",
        "name": "cohere.embed-multilingual-v3",
        "dimensions": 1024
      },
      {
        "provider": "openai",
        "name": "text-embedding-ada-002",
        "dimensions": 1536
      }
    ],
giusedroid commented 3 months ago

Yeah, that's how I'd suggest pairing query, embedding model, and target search space.

My understanding is that there should be some sort of transformation that maps one space to another, but I am not sure what level of access to the vector APIs we have with LanceDB.

To search across spaces with different underlying embeddings, you'd have to embed the query with the same embedding model that built each space. So for each destination space, you'd have to keep track of its embedding model.
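A minimal sketch of that bookkeeping (all names here are illustrative, not from the repo): keep a per-table record of the embedding model and dimension that produced its vectors, and refuse to run a query embedded with a different model against it.

```javascript
// Hypothetical registry pairing each LanceDB table with the embedding
// model (and vector dimension) that populated it.
const tableRegistry = {
  "user-a/docs":   { model: "amazon.titan-embed-text-v1",  dimensions: 1536 },
  "user-a/images": { model: "amazon.titan-embed-image-v1", dimensions: 1024 },
};

// Before querying a table, check the query vector was produced by the
// same model, with the expected dimensionality.
function checkQueryEmbedding(tableName, queryVector, modelUsed) {
  const entry = tableRegistry[tableName];
  if (!entry) {
    throw new Error(`Unknown table: ${tableName}`);
  }
  if (entry.model !== modelUsed) {
    throw new Error(`Table ${tableName} expects ${entry.model}, got ${modelUsed}`);
  }
  if (queryVector.length !== entry.dimensions) {
    throw new Error(`Dimension mismatch: expected ${entry.dimensions}, got ${queryVector.length}`);
  }
  return true;
}
```

This is just the lookup discipline; where the registry lives (DynamoDB, a JSON file next to the table, table metadata) is a separate design choice.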

Assuming that a transformation A between such spaces existed, you'd compute something like

q' = q^T A

with q' the query in the new vector space. This should be less expensive than re-embedding the query. I've seen some approaches like that in the literature. I'll post the link as soon as I can.
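As a toy illustration of that idea (the matrix here is made up; learning a real cross-space map A is the hard part):

```javascript
// Toy sketch: map a query vector q from a source embedding space into a
// target space via a linear transformation A, i.e. q' = q^T A.
// A has shape (sourceDim x targetDim); the example matrix is illustrative only.
function transformQuery(q, A) {
  const targetDim = A[0].length;
  const out = new Array(targetDim).fill(0);
  for (let j = 0; j < targetDim; j++) {
    for (let i = 0; i < q.length; i++) {
      out[j] += q[i] * A[i][j];
    }
  }
  return out;
}

// Example: map a 3-d query into a 2-d space.
const A = [
  [1, 0],
  [0, 1],
  [1, 1],
];
const qPrime = transformQuery([2, 3, 4], A); // -> [6, 7]
```

One matrix-vector product per query is cheap compared with another embedding-model invocation, which is the appeal of this approach.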

Analect commented 3 months ago

@giusedroid ... could you point me in the right direction when you have a moment? I recall you mentioning something about CORS, which I think I'm hitting.

(screenshot attached)

I updated the SAM config to bring in the document-processing Lambda function and associated S3 bucket, per your original repo. That appears to have worked fine.

I then ran amplify import storage (which allows you to specify a pre-existing bucket), linked it to the bucket created above, and ran amplify push.

In App.jsx, I had initially updated the config section to include the bucket, but realised after amplify push that this got added to amplifyconfiguration.json ... so I commented it back out again.

Amplify.configure({
  ...Amplify.getConfig(),
  Storage: {
    S3: {
      region: 'eu-central-1',
      bucket: 'sam-amplify-XXXX'
    }
  }
});

Lower down, there's this StorageManager part ... which, from my understanding (and the docs), handles the file upload on my behalf.

<StorageManager
            acceptedFileTypes={[
              // you can list file extensions:
              '.pdf',
              // or MIME types:
            ]}
            accessLevel="private"
            maxFileCount={5}
            // Size is in bytes
            maxFileSize={10000000}
          />

Do you think the issue might be my decision to use amplify import storage, since amplify add storage might have created additional policies granting users permission to upload to that bucket?

giusedroid commented 3 months ago

@Analect try this out: https://stackoverflow.com/questions/31911898/configure-cors-response-headers-on-aws-lambda for CORS
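For a plain Lambda-backed endpoint, the fix from that thread usually boils down to returning the CORS headers explicitly from the function. A minimal sketch (the wildcard origin is a placeholder you'd tighten to your app's domain in production):

```javascript
// Minimal Lambda handler sketch that returns CORS headers explicitly.
// "*" is a placeholder allowed origin; in production, echo back only
// trusted origins. In a real deployment this would be exported as the
// Lambda entry point (exports.handler / export const handler).
async function handler(event) {
  return {
    statusCode: 200,
    headers: {
      "Access-Control-Allow-Origin": "*",
      "Access-Control-Allow-Headers": "Content-Type,Authorization",
      "Access-Control-Allow-Methods": "OPTIONS,POST",
    },
    body: JSON.stringify({ ok: true }),
  };
}
```

Note that the browser's preflight OPTIONS request must also receive these headers, either from the same function or from the API Gateway / Function URL CORS configuration.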

giusedroid commented 3 months ago

@Analect if you import the storage you should see it populated in amplifyconfiguration.json

giusedroid commented 3 months ago

You'll have to have a look at the authenticated role for your Identity Pool in Cognito. It should include permissions for your users to upload to S3.

https://docs.aws.amazon.com/cognito/latest/developerguide/identity-pools.html https://docs.aws.amazon.com/cognito/latest/developerguide/role-based-access-control.html

Analect commented 3 months ago

Thanks. Let me dig into this. Looking at https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_examples_s3_cognito-bucket.html ... is this how you are handling permissioning in your new system, where users' document stores and vector DBs are isolated by Cognito-username folders? That would presumably require modifying any Lambda functions referencing S3 ... like here, to append a Cognito username to a URL? Is that done by appending user metadata to a function call to Lambda?

giusedroid commented 3 months ago

I'd suggest you create a new Amplify app from scratch using Amplify Studio, add storage with read permission for authenticated users, and copy the generated policy, mutatis mutandis ;) https://docs.amplify.aws/javascript/build-a-backend/storage/set-up-storage/

That policy will use session tags identifying the user via cognito:sub and will create a path structure on S3 so that only user A can access the s3://your-bucket/A/ subtree :)

giusedroid commented 3 months ago

That presumably would require modifying any Lambda functions referencing s3 ... like here to append a cognito username to a url? Is that done by appending user meta-data to a function call to Lambda?

Not necessarily. When the document is uploaded, you'll receive the full path where it resides on S3 via the event that triggers the Lambda function. You can create an additional path under the same user, so that the permissions will still apply.

upload file to --> s3://your-bucket/user-data/${cognito:sub}/private/uploads/$(unknown)

the Lambda function is triggered with an event including the bucket and key

in LanceDB you create a table under

s3://your-bucket/user-data/${cognito:sub}/private/embeddings/
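A sketch of that wiring, following the path layout above (function and field names beyond the standard S3 event shape are illustrative): the S3 event carries the bucket and the user-scoped key, and the embeddings table URI is derived by swapping the uploads segment.

```javascript
// Derive the per-user LanceDB table URI from the S3 upload event.
// Assumes the layout described above:
//   upload:     user-data/<cognito:sub>/private/uploads/<file>
//   embeddings: user-data/<cognito:sub>/private/embeddings/
function embeddingsUriFromEvent(event) {
  const record = event.Records[0];
  const bucket = record.s3.bucket.name;
  // S3 event keys arrive URL-encoded (spaces become "+")
  const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, " "));
  // ["user-data", sub, "private", "uploads", file]
  const [root, sub, scope] = key.split("/");
  return `s3://${bucket}/${root}/${sub}/${scope}/embeddings/`;
}

// Example event, trimmed to the fields used here:
const event = {
  Records: [{
    s3: {
      bucket: { name: "your-bucket" },
      object: { key: "user-data/abc123/private/uploads/doc.pdf" },
    },
  }],
};
// embeddingsUriFromEvent(event)
//   -> "s3://your-bucket/user-data/abc123/private/embeddings/"
```

Because the derived path stays under the same ${cognito:sub} prefix, the Cognito-scoped bucket policy keeps applying to the embeddings without any extra per-user plumbing in the Lambda.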
giusedroid commented 3 months ago

Like so: (screenshot attached)

giusedroid commented 3 months ago

then go to Cognito Identity Pools, find your app's identity pool, tap User Access, and tap the IAM role associated with the authenticated role (screenshot attached)

copy the private policy for your authenticated identities (screenshot attached)