This is a fullstack example of how to build a RAG (Retrieval Augmented Generation) app with Cloudflare. It uses Cloudflare Workers, Pages, D1, KV, R2, AI Gateway and Workers AI.
https://github.com/user-attachments/assets/cbaa0380-7ad6-448d-ad44-e83772a9cf3f
Features:
Make sure you have Node.js, pnpm and the wrangler CLI installed.
Install dependencies:
pnpm install # or npm install
Deploy necessary primitives:
./setup.sh
Then, in wrangler.toml, set d1_databases.database_id to your D1 database ID and kv_namespaces.rate_limiter to your rate limiter KV namespace ID.
Then, create a .dev.vars file with your API keys:
CLOUDFLARE_ACCOUNT_ID=your-cloudflare-account-id # Required
GROQ_API_KEY=your-groq-api-key # Optional
OPENAI_API_KEY=your-openai-api-key # Optional
ANTHROPIC_API_KEY=your-anthropic-api-key # Optional
If you don't have the optional keys, /api/stream will fall back to Workers AI.
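The fallback described above can be pictured with a minimal sketch. It is not the project's actual code: the ProviderEnv type, the Provider names and the resolveProvider helper are hypothetical, and only the key names come from .dev.vars. The assumption is that a request names a provider and that Workers AI is used whenever the matching key is missing or blank.

```typescript
// Hypothetical environment shape; the key names mirror the .dev.vars file.
interface ProviderEnv {
  GROQ_API_KEY?: string;
  OPENAI_API_KEY?: string;
  ANTHROPIC_API_KEY?: string;
}

// Hypothetical provider identifiers, not taken from the project.
type Provider = "groq" | "openai" | "anthropic" | "workers-ai";

// Returns the requested provider when its API key is set,
// otherwise falls back to Workers AI.
function resolveProvider(requested: Provider, env: ProviderEnv): Provider {
  if (requested === "workers-ai") return "workers-ai";
  const keys: Record<Exclude<Provider, "workers-ai">, string | undefined> = {
    groq: env.GROQ_API_KEY,
    openai: env.OPENAI_API_KEY,
    anthropic: env.ANTHROPIC_API_KEY,
  };
  const key = keys[requested];
  return key !== undefined && key.trim() !== "" ? requested : "workers-ai";
}
```

With no keys configured, resolveProvider("openai", {}) resolves to "workers-ai", which matches the behaviour the README describes for /api/stream.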
Run the dev server:
npm run dev
And access the app at http://localhost:5173/.
With the necessary primitives set up, first set up the secrets:
npx wrangler secret put CLOUDFLARE_ACCOUNT_ID
npx wrangler secret put GROQ_API_KEY
npx wrangler secret put OPENAI_API_KEY
npx wrangler secret put ANTHROPIC_API_KEY
Then, deploy your app to Cloudflare Pages:
npm run deploy
This project combines classical full-text search (sparse) against Cloudflare D1 with embedding search against Vectorize (dense) into a hybrid search. This gives the best of both worlds and provides the most applicable context to the LLM.
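One common way to merge a sparse and a dense result list is reciprocal rank fusion (RRF). The sketch below is an illustration only: this README does not say which fusion method the project uses, and the fuseRankings helper, the constant k = 60 and the document IDs are all assumptions.

```typescript
// Reciprocal rank fusion: every list contributes 1 / (k + rank) to each
// document it contains, and documents are ordered by their summed score.
// Assumes each ranking lists a document ID at most once.
function fuseRankings(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1]) // highest fused score first
    .map(([id]) => id);
}

// Hypothetical result IDs standing in for D1 full-text and Vectorize hits.
const sparseHits = ["doc-a", "doc-b", "doc-c"];
const denseHits = ["doc-b", "doc-d", "doc-a"];
const fused = fuseRankings([sparseHits, denseHits]);
console.log(fused); // [ 'doc-b', 'doc-a', 'doc-d', 'doc-c' ]
```

Documents that appear in both lists (doc-a and doc-b here) rise to the top, which is the intuition behind combining sparse and dense retrieval.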
The way it works is this:
This project is licensed under the terms of the MIT License.
If you need help building AI applications, please reach out to me on Twitter or via my website. Happy to help!