rgarfield11 / text_chunker_ex

A library for semantically coherent text chunking
MIT License

User - Text Chunker Website - Interactive Testing Environment #5

Open rgarfield11 opened 3 months ago

rgarfield11 commented 3 months ago

Background

This library is written, but it would be really cool if a website existed where users could try it out first.

Acceptance Criteria

Scenario: Users can perform text chunking on the website

Given I am a user on the Text Chunker Website

rgarfield11 commented 3 months ago

Based on the code and context provided, the Text Chunker is a library written in Elixir, designed for splitting text into chunks based on certain parameters such as chunk size, overlap, and format. To create an interactive testing environment for this on a website, we would need to do the following:

  1. Develop a web-based front-end through which users can input text and set chunking parameters.
  2. Set up a back-end service, likely leveraging the Phoenix Framework, to receive data from the front-end and use the Chunker.TextChunker module to process the text.
  3. Display the results back to the user in an easily digestible format on the website.
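
To make the chunk size and overlap parameters above concrete, here is a minimal JavaScript sketch of a naive fixed-size chunker. It is an illustration only and not the library's algorithm: the library aims for semantically coherent chunks, while this hypothetical `naiveChunk` function cuts purely by character count. It also assumes the overlap must be smaller than the chunk size.

```javascript
// Naive fixed-size chunker, for illustration only. This is NOT how the
// Text Chunker library works; it cuts purely by character count.
function naiveChunk(text, chunkSize, chunkOverlap) {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError('chunkSize must be a positive integer');
  }
  if (!Number.isInteger(chunkOverlap) || chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new RangeError('chunkOverlap must be an integer in [0, chunkSize)');
  }
  const step = chunkSize - chunkOverlap; // how far each chunk's start advances
  const chunks = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this chunk reached the end
  }
  return chunks;
}

console.log(naiveChunk('abcdefghij', 4, 2));
// → [ 'abcd', 'cdef', 'efgh', 'ghij' ]
```

Each chunk repeats the last two characters of the previous one, which is what the overlap setting controls.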

Since Elixir is not a common language for frontend development, we'd likely use JavaScript for the frontend. Our Elixir application would provide an API endpoint to process the chunking requests.

Here is an example of how the Elixir backend might provide an API endpoint using Phoenix Framework, assuming this application grows to include Phoenix:

# First, add Phoenix to deps in mix.exs
defp deps do
  [
    # ... existing deps
    {:phoenix, "~> 1.5.7"},
    # JSON encoder used by Phoenix's json/2 (Jason is Phoenix's default)
    {:jason, "~> 1.0"}
  ]
end

# lib/text_chunker_web/controllers/chunk_controller.ex
defmodule TextChunkerWeb.ChunkController do
  use TextChunkerWeb, :controller
  alias Chunker.TextChunker

  def create(conn, %{"text" => text, "options" => options}) do
    # Parse chunking options and pass them to the TextChunker
    opts = parse_options(options)
    chunked_data = TextChunker.split(text, opts)

    conn
    |> put_status(:ok)
    |> json(chunked_data)
  end

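  # Keys accepted from the client, mapped to the atoms passed to TextChunker.split/2
  @allowed_options %{"chunk_size" => :chunk_size, "chunk_overlap" => :chunk_overlap}

  # Build the keyword list for TextChunker.split/2 from untrusted input:
  # unknown keys and values that are not non-negative integers are dropped.
  defp parse_options(options) when is_map(options) do
    for {key, value} <- options, Map.has_key?(@allowed_options, key),
        is_integer(value) and value >= 0 do
      {Map.fetch!(@allowed_options, key), value}
    end
  end

  defp parse_options(_options), do: []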
end

# lib/text_chunker_web/router.ex
defmodule TextChunkerWeb.Router do
  use TextChunkerWeb, :router

  pipeline :api do
    plug :accepts, ["json"]
  end

  scope "/api", TextChunkerWeb do
    pipe_through :api
    post "/chunks", ChunkController, :create
  end
end

For the frontend, you could use any modern framework such as React, Vue.js, or Angular. However, a plain HTML page with JavaScript (using the Fetch API) is enough for basic interaction:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Text Chunker</title>
</head>
<body>
    <h1>Text Chunker</h1>
    <textarea id="textInput" rows="10" cols="50" placeholder="Enter your text here"></textarea>
    <br>
    <input type="number" id="chunkSize" placeholder="Chunk Size">
    <input type="number" id="chunkOverlap" placeholder="Chunk Overlap">
    <button id="chunkButton">Chunk Text</button>
    <div id="chunkOutput"></div>

    <script>
        document.getElementById('chunkButton').addEventListener('click', function() {
            const text = document.getElementById('textInput').value;
            const chunkSize = document.getElementById('chunkSize').value;
            const chunkOverlap = document.getElementById('chunkOverlap').value;

            fetch('/api/chunks', {
                method: 'POST',
                headers: {
                    'Content-Type': 'application/json',
                },
                body: JSON.stringify({
                    text: text,
                    options: {
                        chunk_size: parseInt(chunkSize, 10),
                        chunk_overlap: parseInt(chunkOverlap, 10)
                    }
                })
            })
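            .then(response => {
                // Surface HTTP errors instead of parsing an error page as JSON
                if (!response.ok) {
                    throw new Error('Request failed with status ' + response.status);
                }
                return response.json();
            })
            .then(data => {
                document.getElementById('chunkOutput').innerText = JSON.stringify(data, null, 2);
            })
            .catch((error) => {
                console.error('Error:', error);
                document.getElementById('chunkOutput').innerText = 'Error: ' + error.message;
            });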
        });
    </script>
</body>
</html>
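
One gap in the page above is that an empty number field makes `parseInt` return `NaN`, which `JSON.stringify` serializes as `null`. A small validation step before the `fetch` call could catch this. The sketch below is one possible shape for it; `parseChunkOptions` is a hypothetical helper, and the rule that the overlap must be smaller than the chunk size is an assumption about what the backend accepts.

```javascript
// Hypothetical helper: turns the raw <input> strings into a validated
// options object, or an error message that can be shown to the user.
function parseChunkOptions(rawSize, rawOverlap) {
  // Accept only strings made of digits, so '' and '12abc' are rejected
  const isWholeNumber = (s) => /^\d+$/.test(s.trim());
  if (!isWholeNumber(rawSize) || !isWholeNumber(rawOverlap)) {
    return { ok: false, error: 'Chunk size and overlap must be whole numbers.' };
  }
  const chunkSize = Number.parseInt(rawSize, 10);
  const chunkOverlap = Number.parseInt(rawOverlap, 10);
  if (chunkSize === 0) {
    return { ok: false, error: 'Chunk size must be greater than zero.' };
  }
  // Assumption: the backend expects the overlap to be smaller than the chunk size
  if (chunkOverlap >= chunkSize) {
    return { ok: false, error: 'Chunk overlap must be smaller than chunk size.' };
  }
  return { ok: true, options: { chunk_size: chunkSize, chunk_overlap: chunkOverlap } };
}

console.log(parseChunkOptions('100', '20'));
console.log(parseChunkOptions('', '20'));
```

In the click handler, the result's `options` would replace the inline `parseInt` calls, and the `error` string could be written to `chunkOutput` instead of sending the request. Client-side checks like this only improve the user experience; the server still has to validate everything it receives.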

This code covers the core of the acceptance criteria: a user can submit text on the website and see the resulting chunks. To keep response times reasonable for large documents, you may need to optimize the backend, for example by caching results for repeated inputs or by making the chunking algorithm more efficient.
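
As a complement to backend caching, the browser can avoid re-sending identical requests at all. The sketch below shows one way to do that, assuming results depend only on the text and the two options; `withChunkCache` and the injected `requestChunks` function are hypothetical names, not part of the library. A server-side cache in Elixir would follow the same idea.

```javascript
// Hypothetical wrapper: remembers the pending or finished result for each
// (text, options) pair so identical requests are only sent once.
// `requestChunks` is any function returning a promise of the chunks.
function withChunkCache(requestChunks, maxEntries = 50) {
  const cache = new Map(); // Map keeps insertion order, used for eviction
  return function cachedRequest(text, options) {
    const key = JSON.stringify([text, options.chunk_size, options.chunk_overlap]);
    if (cache.has(key)) return cache.get(key);
    if (cache.size >= maxEntries) {
      cache.delete(cache.keys().next().value); // evict the oldest entry
    }
    const pending = Promise.resolve(requestChunks(text, options));
    cache.set(key, pending);
    // Do not keep failed requests, so that a later retry can succeed
    pending.catch(() => {
      if (cache.get(key) === pending) cache.delete(key);
    });
    return pending;
  };
}
```

To use it with the page above, the `fetch` call would be moved into a function that takes `(text, options)` and returns the parsed JSON, and that function would be passed to `withChunkCache` once at page load.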

Lastly, before deploying this interactive testing environment to production, thorough testing must be done to ensure reliability and security of the processing, especially with user-provided input data.

created by stuart.page+demos@revelry.co using Prodops