irthomasthomas / undecidability


FastUI/README.md at main · pydantic/FastUI #674

**Open** · irthomasthomas opened 7 months ago

irthomasthomas commented 7 months ago

FastUI


Please note: FastUI is still an active work in progress; do not expect it to be complete.

The Principle (short version)

You can see a simple demo of an application built with FastUI here.

FastUI is a new way to build web application user interfaces defined by declarative Python code.

This means:

- If you're a Python developer, you can build responsive web applications using React without writing a single line of JavaScript, or touching npm.
- If you're a frontend developer, you can concentrate on building magical components that are truly reusable, no copy-pasting components for each view.
- For everyone, a true separation of concerns: the backend defines the entire application, while the frontend is free to implement just the user interface.

At its heart, FastUI is a set of matching Pydantic models and TypeScript interfaces that allow you to define a user interface. This interface is validated at build time by TypeScript and pyright/mypy and at runtime by Pydantic.
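As a quick illustration of that runtime validation, here's a minimal sketch (not from the README; it assumes only the `fastui` package and Pydantic v2 are installed):

```python
# Components are plain Pydantic models, so invalid definitions fail fast at runtime.
from pydantic import ValidationError
from fastui import components as c

heading = c.Heading(text='Users', level=2)    # validated on construction
print(heading.model_dump(exclude_none=True))  # the serializable payload sent to the frontend

try:
    c.Heading(text='Users', level='nope')     # wrong type: rejected by Pydantic
except ValidationError as err:
    print(err.errors()[0]['loc'])             # points at the offending field, ('level',)
```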

The Practice — Usage

FastUI is made up of 4 things:

- the `fastui` PyPI package: Pydantic models for UI components, plus some utilities. While it works well with FastAPI, it doesn't depend on FastAPI, and most of it could be used with any Python web framework.
- the `@pydantic/fastui` npm package: a React TypeScript package that lets you reuse the machinery and types of FastUI while building your own components.
- the `@pydantic/fastui-bootstrap` npm package: implementation/customisation of all FastUI components using Bootstrap.
- the `@pydantic/fastui-prebuilt` npm package (available on jsdelivr): a pre-built version of the FastUI React app, so you can use it without installing any npm packages or building anything yourself.

Here's a simple but complete FastAPI application that uses FastUI to show some user profiles:

```python
from datetime import date

from fastapi import FastAPI, HTTPException
from fastapi.responses import HTMLResponse
from fastui import FastUI, AnyComponent, prebuilt_html, components as c
from fastui.components.display import DisplayMode, DisplayLookup
from fastui.events import GoToEvent, BackEvent
from pydantic import BaseModel, Field

app = FastAPI()

class User(BaseModel):
    id: int
    name: str
    dob: date = Field(title='Date of Birth')

# define some users
users = [
    User(id=1, name='John', dob=date(1990, 1, 1)),
    User(id=2, name='Jack', dob=date(1991, 1, 1)),
    User(id=3, name='Jill', dob=date(1992, 1, 1)),
    User(id=4, name='Jane', dob=date(1993, 1, 1)),
]

@app.get("/api/", response_model=FastUI, response_model_exclude_none=True)
def users_table() -> list[AnyComponent]:
    """
    Show a table of four users, `/api` is the endpoint the frontend will connect to
    when a user visits `/` to fetch components to render.
    """
    return [
        c.Page(  # Page provides a basic container for components
            components=[
                c.Heading(text='Users', level=2),  # renders `<h2>Users</h2>`
                c.Table(
                    data=users,
                    # define two columns for the table
                    columns=[
                        # the first is the user's name, rendered as a link to their profile
                        DisplayLookup(field='name', on_click=GoToEvent(url='/user/{id}/')),
                        # the second is the date of birth, rendered as a date
                        DisplayLookup(field='dob', mode=DisplayMode.date),
                    ],
                ),
            ]
        ),
    ]

@app.get("/api/user/{user_id}/", response_model=FastUI, response_model_exclude_none=True)
def user_profile(user_id: int) -> list[AnyComponent]:
    """
    User profile page, the frontend will fetch this when the user visits `/user/{id}/`.
    """
    try:
        user = next(u for u in users if u.id == user_id)
    except StopIteration:
        raise HTTPException(status_code=404, detail="User not found")
    return [
        c.Page(
            components=[
                c.Heading(text=user.name, level=2),
                c.Link(components=[c.Text(text='Back')], on_click=BackEvent()),
                c.Details(data=user),
            ]
        ),
    ]

@app.get('/{path:path}')
async def html_landing() -> HTMLResponse:
    """Simple HTML page which serves the React app, comes last as it matches all paths."""
    return HTMLResponse(prebuilt_html(title='FastUI Demo'))
```

Which renders like this:

*(screenshot of the rendered app)*

Of course, that's a very simple application; the full demo is more complete.
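To try the example above locally, one option is to serve it with uvicorn; a minimal sketch, under the assumption that the code is saved as `demo.py` (the filename and port are illustrative):

```python
# Launch the FastAPI app defined above; `demo` is an assumed module name.
import uvicorn

from demo import app

if __name__ == '__main__':
    # The catch-all route serves the prebuilt frontend, so just open http://127.0.0.1:8000/
    uvicorn.run(app, host='127.0.0.1', port=8000)
```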

Components

FastUI already defines a rich set of components.

All components are listed in the demo app.
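For a feel of how composition works, here's a hedged sketch reusing the names from the example above; `c.Paragraph`, `c.Button`, and their exact fields are assumptions not confirmed by this excerpt:

```python
# Compose a page out of components; c.Paragraph and c.Button are assumed
# component names, not shown in the excerpt above.
from fastui import components as c
from fastui.events import GoToEvent

page = c.Page(
    components=[
        c.Heading(text='About', level=2),
        c.Paragraph(text='Pages are just trees of Pydantic models.'),
        c.Button(text='Back to users', on_click=GoToEvent(url='/')),
    ]
)
print(page.model_dump(exclude_none=True))
```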

Suggested labels

irthomasthomas commented 7 months ago

Related issues

508: microsoft/TaskWeaver: A code-first agent framework for seamlessly planning and executing data analytics tasks.

**Similarity score:** 0.86

- [ ] [microsoft/TaskWeaver: A code-first agent framework for seamlessly planning and executing data analytics tasks.](https://github.com/microsoft/TaskWeaver)

**DESCRIPTION:** TaskWeaver is a code-first agent framework for seamlessly planning and executing data analytics tasks. This innovative framework interprets user requests through code snippets and efficiently coordinates a variety of plugins in the form of functions to execute data analytics tasks in a stateful manner.

## 🆕 News

- 📅 2024-02-01: TaskWeaver now has a plugin `document_retriever` for RAG based on a knowledge base. 📚
- 📅 2024-01-30: TaskWeaver introduces a new plugin-only mode that securely generates calls to specified plugins without producing extraneous code. 🪡
- 📅 2024-01-23: TaskWeaver can now be personalized by transforming your chat histories into enduring experiences. 🎉
- 📅 2024-01-17: TaskWeaver now has a plugin `vision_web_explorer` that can open a web browser and explore websites. 🌐
- 📅 2024-01-15: TaskWeaver now supports streaming ♒ in both the UI and the command line. ✌️

## 💥 Highlights

- **Rich data structures**: TaskWeaver allows you to work with rich data structures in Python, such as DataFrames, instead of dealing with strings.
- **Customized algorithms**: TaskWeaver allows you to encapsulate your own algorithms into plugins and orchestrate them.
- **Incorporating domain-specific knowledge**: TaskWeaver is designed to incorporate domain-specific knowledge easily to improve reliability.
- **Stateful execution**: TaskWeaver is designed to support stateful execution of the generated code for a consistent and smooth user experience.
- **Code verification**: TaskWeaver is designed to verify the generated code before execution. It can detect potential issues in the generated code and provide suggestions to fix them.
- **Easy to use**: TaskWeaver ships with sample plugins, examples, and tutorials to help you get started, and offers an out-of-the-box experience so users can run it immediately after installation.
- **Easy to debug**: TaskWeaver provides detailed and transparent logs to help you understand the entire process, including LLM prompts, code generation, and execution.
- **Security considerations**: TaskWeaver supports basic session management to keep different users' data separate. Code execution is separated into different processes to avoid mutual interference.
- **Easy extension**: TaskWeaver is easy to extend to accomplish more complex tasks with multiple agents as plugins.

## ✨ Quick Start

### 🛠️ Step 1: Installation

TaskWeaver requires Python >= 3.10 and can be installed by running:

```shell
# [optional] create and activate a conda environment
# conda create -n taskweaver python=3.10
# conda activate taskweaver

# clone the repository
git clone https://github.com/microsoft/TaskWeaver.git
cd TaskWeaver

# install the requirements
pip install -r requirements.txt
```

### 🖊️ Step 2: Configure the LLMs

Before running TaskWeaver, you need to provide your LLM configuration. Taking OpenAI as an example, you can configure the `taskweaver_config.json` file as follows:

```json
{
  "llm.api_key": "the api key",
  "llm.model": "the model name, e.g., gpt-4"
}
```

💡 TaskWeaver also supports other LLMs and advanced configurations; please check the documents for more details.

### 🚩 Step 3: Start TaskWeaver

#### ⌨️ Command Line (CLI)

Assuming you are in the cloned TaskWeaver folder:

```shell
python -m taskweaver -p ./project/
```

This will start the TaskWeaver process, which you can interact with through the command line interface. If everything goes well, you will see a prompt like:

```
=========================================================
 (TaskWeaver ASCII-art banner)
=========================================================
TaskWeaver: I am TaskWeaver, an AI assistant. To get started, could you please enter your request?
Human: ___
```

#### 💻 Web UI

TaskWeaver also supports a web UI for demo purposes; please refer to the web UI docs for more details.

#### 📋 Import as a Library

TaskWeaver can be imported as a library to integrate with your existing project; more information can be found in the docs.

## 📖 Documentation

More documentation can be found on the TaskWeaver website.

[URL](https://github.com/microsoft/TaskWeaver)

#### Suggested labels

{ "label-name": "taskweaver", "description": "A code-first agent framework for data analytics tasks.", "repo": "microsoft/TaskWeaver", "confidence": 68.7 }

487: fgmacedo/python-statemachine: Python Finite State Machines made easy.

**Similarity score:** 0.85

- [ ] [fgmacedo/python-statemachine: Python Finite State Machines made easy.](https://github.com/fgmacedo/python-statemachine)

# fgmacedo/python-statemachine: Python Finite State Machines made easy.

Python finite-state machines made easy.

**Python StateMachine**

- **Free software:** MIT license
- **Documentation**

Welcome to python-statemachine, an intuitive and powerful state machine framework designed for a great developer experience.

- 🚀 With StateMachine, you can easily create complex, dynamic systems with clean, readable code.
- 💡 Our framework makes it easy to understand and reason about the different states, events and transitions in your system, so you can focus on building great products.
- 🔒 python-statemachine also provides robust error handling and ensures that your system stays in a valid state at all times.

A few reasons why you may consider using it:

- 📈 python-statemachine is designed to help you build scalable, maintainable systems that can handle any complexity.
- 💪 You can easily create and manage multiple state machines within a single application.
- 🚫 It prevents common mistakes and ensures that your system stays in a valid state at all times.

## Getting started

To install Python State Machine, run this command in your terminal:

```
pip install python-statemachine
```

To generate diagrams from your machines, you'll also need pydot and Graphviz. You can install this library together with the pydot dependency using the extras install option (see the docs for more details):

```
pip install python-statemachine[diagrams]
```

### Define your state machine:

```python
from statemachine import StateMachine, State

class TrafficLightMachine(StateMachine):
    "A traffic light machine"

    green = State(initial=True)
    yellow = State()
    red = State()

    cycle = (
        green.to(yellow)
        | yellow.to(red)
        | red.to(green)
    )

    def before_cycle(self, event: str, source: State, target: State, message: str = ""):
        message = ". " + message if message else ""
        return f"Running {event} from {source.id} to {target.id}{message}"

    def on_enter_red(self):
        print("Don't move.")

    def on_exit_red(self):
        print("Go ahead!")
```

You can now create an instance:

```python
sm = TrafficLightMachine()
```

This state machine can be represented graphically as follows:

```python
img_path = "docs/images/readme_trafficlightmachine.png"
sm._graph().write_png(img_path)
```

## URL

- **Repository:** https://github.com/fgmacedo/python-statemachine

#### Suggested labels

{ "label-name": "state-machine-framework", "description": "A framework for creating complex, dynamic systems with clean, readable code.", "confidence": 85.89 }

625: unsloth/README.md at main · unslothai/unsloth

**Similarity score:** 0.85

- [ ] [unsloth/README.md at main · unslothai/unsloth](https://github.com/unslothai/unsloth/blob/main/README.md?plain=1)

# unsloth/README.md at main · unslothai/unsloth

*(unsloth logo)*

### Finetune Mistral, Gemma, Llama 2-5x faster with 70% less memory!

![](https://i.ibb.co/sJ7RhGG/image-41.png)

## ✨ Finetune for Free

All notebooks are **beginner friendly**! Add your dataset, click "Run All", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face.

| Unsloth supports | Free Notebooks | Performance | Memory use |
|------------------|----------------|-------------|------------|
| **Gemma 7b** | [▶️ Start on Colab](https://colab.research.google.com/drive/10NbwlsRChbma1v55m8LAPYG15uQv6HLo?usp=sharing) | 2.4x faster | 58% less |
| **Mistral 7b** | [▶️ Start on Colab](https://colab.research.google.com/drive/1Dyauq4kTZoLewQ1cApceUQVNcnnNTzg_?usp=sharing) | 2.2x faster | 62% less |
| **Llama-2 7b** | [▶️ Start on Colab](https://colab.research.google.com/drive/1lBzz5KeZJKXjvivbYvmGarix9Ao6Wxe5?usp=sharing) | 2.2x faster | 43% less |
| **TinyLlama** | [▶️ Start on Colab](https://colab.research.google.com/drive/1AZghoNBQaMDgWJpi4RbffGM1h6raLUj9?usp=sharing) | 3.9x faster | 74% less |
| **CodeLlama 34b** A100 | [▶️ Start on Colab](https://colab.research.google.com/drive/1y7A0AxE3y8gdj4AVkl2aZX47Xu3P1wJT?usp=sharing) | 1.9x faster | 27% less |
| **Mistral 7b** 1xT4 | [▶️ Start on Kaggle](https://www.kaggle.com/code/danielhanchen/kaggle-mistral-7b-unsloth-notebook) | 5x faster\* | 62% less |
| **DPO - Zephyr** | [▶️ Start on Colab](https://colab.research.google.com/drive/15vttTpzzVXv_tJwEk-hIcQ0S9FcEWvwP?usp=sharing) | 1.9x faster | 19% less |

- This [conversational notebook](https://colab.research.google.com/drive/1Aau3lgPzeZKQ-98h69CCu1UJcvIBLmy2?usp=sharing) is useful for ShareGPT ChatML / Vicuna templates.
- This [text completion notebook](https://colab.research.google.com/drive/1ef-tab5bhkvWmBOObepl1WgJvfvSzn5Q?usp=sharing) is for raw text. This [DPO notebook](https://colab.research.google.com/drive/15vttTpzzVXv_tJwEk-hIcQ0S9FcEWvwP?usp=sharing) replicates Zephyr.
- \* Kaggle has 2x T4s, but we use 1. Due to overhead, 1x T4 is 5x faster.

## 🦥 Unsloth.ai News

- 📣 [Gemma 7b](https://colab.research.google.com/drive/10NbwlsRChbma1v55m8LAPYG15uQv6HLo?usp=sharing) on 6T tokens now works. And [Gemma 2b notebook](https://colab.research.google.com/drive/15gGm7x_jTm017_Ic8e317tdIpDG53Mtu?usp=sharing)
- 📣 Added [conversational notebooks](https://colab.research.google.com/drive/1ef-tab5bhkvWmBOObepl1WgJvfvSzn5Q?usp=sharing) and [raw text notebooks](https://colab.research.google.com/drive/1bMOKOBzxQWUIGZBs_B0zm8pimuEnZdfM?usp=sharing)
- 📣 [2x faster inference](https://colab.research.google.com/drive/15vttTpzzVXv_tJwEk-hIcQ0S9FcEWvwP?usp=sharing) added for all our models
- 📣 [DPO support](https://colab.research.google.com/drive/15vttTpzzVXv_tJwEk-hIcQ0S9FcEWvwP?usp=sharing) is now included. [More info](#DPO) on DPO
- 📣 We did a [blog](https://huggingface.co/blog/unsloth-trl) with 🤗Hugging Face and are in their official docs! Check out the [SFT docs](https://huggingface.co/docs/trl/main/en/sft_trainer#accelerate-fine-tuning-2x-using-unsloth) and [DPO docs](https://huggingface.co/docs/trl/main/en/dpo_trainer#accelerate-dpo-fine-tuning-using-unsloth)
- 📣 [Download models 4x faster](https://huggingface.co/collections/unsloth/) from 🤗Hugging Face, e.g. `unsloth/mistral-7b-bnb-4bit`

## 🔗 Links and Resources

| Type | Links |
|------|-------|
| 📚 **Wiki & FAQ** | [Read Our Wiki](https://github.com/unslothai/unsloth/wiki) |
| 📜 **Documentation** | [Read The Doc](https://github.com/unslothai/unsloth/tree/main#-documentation) |
| 💾 **Installation** | [unsloth/README.md](https://github.com/unslothai/unsloth/tree/main#installation-instructions) |
| **Twitter (aka X)** | [Follow us on X](https://twitter.com/unslothai) |
| 🥇 **Benchmarking** | [Performance Tables](https://github.com/unslothai/unsloth/tree/main#-performance-benchmarking) |
| 🌐 **Released Models** | [Unsloth Releases](https://huggingface.co/unsloth) |
| ✍️ **Blog** | [Read our Blogs](https://unsloth.ai/blog) |

## ⭐ Key Features

- All kernels written in [OpenAI's Triton](https://openai.com/research/triton) language. **Manual backprop engine**.
- **0% loss in accuracy**: no approximation methods, all exact.
- No change of hardware. Supports NVIDIA GPUs since 2018+, minimum CUDA Capability 7.0 (V100, T4, Titan V, RTX 20/30/40x, A100, H100, L40, etc.). [Check your GPU!](https://developer.nvidia.com/cuda-gpus) GTX 1070 and 1080 work, but are slow.
- Works on **Linux** and **Windows** via WSL.
- Supports 4bit and 16bit QLoRA / LoRA finetuning via [bitsandbytes](https://github.com/TimDettmers/bitsandbytes).
- Open source trains 5x faster; see [Unsloth Pro](https://unsloth.ai/) for **30x faster training**!
- If you trained a model with 🦥Unsloth, you can use this cool sticker!

## 🥇 Performance Benchmarking

For the full list of **reproducible** benchmarking tables, [go to our website](https://unsloth.ai/blog/mistral-benchmark#Benchmark%20tables).

| 1 A100 40GB | 🤗Hugging Face | Flash Attention | 🦥Unsloth Open Source | 🦥[Unsloth Pro](https://unsloth.ai/pricing) |
|-------------|----------------|-----------------|-----------------------|---------------------------------------------|
| Alpaca | 1x | 1.04x | 1.98x | **15.64x** |
| LAION Chip2 | 1x | 0.92x | 1.61x | **20.73x** |
| OASST | 1x | 1.19x | 2.17x | **14.83x** |
| Slim Orca | 1x | 1.18x | 2.22x | **14.82x** |

The benchmarking table below was conducted by [🤗Hugging Face](https://huggingface.co/blog/unsloth-trl).

| Free Colab T4 | Dataset | 🤗Hugging Face | Pytorch 2.1.1 | 🦥Unsloth | 🦥 VRAM reduction |
|---------------|---------|----------------|---------------|-----------|-------------------|
| Llama-2 7b | OASST | 1x | 1.19x | 1.95x | -43.3% |
| Mistral 7b | Alpaca | 1x | 1.07x | 1.56x | -13.7% |
| Tiny Llama 1.1b | Alpaca | 1x | 2.06x | 3.87x | -73.8% |
| DPO with Zephyr | Ultra Chat | 1x | 1.09x | 1.55x | -18.6% |

![](https://i.ibb.co/sJ7RhGG/image-41.png)

[View on GitHub](https://github.com/unslothai/unsloth/blob/main/README.md?plain=1)

#### Suggested labels

666: PygmalionAI/aphrodite-engine: PygmalionAI's large-scale inference engine

**Similarity score:** 0.85

- [ ] [PygmalionAI/aphrodite-engine: PygmalionAI's large-scale inference engine](https://github.com/PygmalionAI/aphrodite-engine)

# PygmalionAI/aphrodite-engine: PygmalionAI's large-scale inference engine

**DESCRIPTION:** "Aphrodite is the official backend engine for PygmalionAI. It is designed to serve as the inference endpoint for the PygmalionAI website, and to allow serving the Pygmalion models to a large number of users with blazing fast speeds (thanks to FasterTransformer and vLLM). Aphrodite builds upon and integrates the exceptional work from various projects. The compute necessary for Aphrodite's development is provided by Arc Compute.

**Features**

- Continuous Batching
- Efficient K/V management with PagedAttention
- Optimized CUDA kernels for improved inference
- Quantization support via GPTQ, GGUF, AWQ, QuIP#, and SqueezeLLM
- Distributed inference
- A variety of sampling methods (Mirostat, Locally Typical Sampling, Tail-Free Sampling, etc.)
- 8-bit KV cache for higher context lengths and throughput

**Quickstart**

```
pip install aphrodite-engine
python -m aphrodite.endpoints.openai.api_server --model PygmalionAI/pygmalion-2-7b
```

**Caution:** If the installation reports CUDA kernel errors, please run `pip install aphrodite-engine==0.4.5` instead.

This will create an OpenAI-compatible API server that can be accessed at port 2242 of the localhost. You can plug the API into a UI that supports Kobold, such as SillyTavern."

**URL:** [https://github.com/PygmalionAI/aphrodite-engine](https://github.com/PygmalionAI/aphrodite-engine)

#### Suggested labels

{'label-name': 'blazing-fast-inference', 'label-description': 'Focuses on high-speed inference performance using technologies like FasterTransformer and vLLM.', 'confidence': 55.95}

134: marker: Convert PDF to markdown quickly with high accuracy

**Similarity score:** 0.84

- [ ] [https://github.com/VikParuchuri/marker#readme](https://github.com/VikParuchuri/marker#readme)

## Marker

Marker converts PDF, EPUB, and MOBI to markdown. It's 10x faster than nougat, more accurate on most documents, and has low hallucination risk.

- Support for a range of PDF documents (optimized for books and scientific papers)
- Removes headers/footers/other artifacts
- Converts most equations to LaTeX
- Formats code blocks and tables
- Support for multiple languages (although most testing is done in English). See `settings.py` for a language list.
- Works on GPU, CPU, or MPS

## How it works

Marker is a pipeline of deep learning models:

- Extract text, OCR if necessary (heuristics, tesseract)
- Detect page layout ([layout segmenter](https://huggingface.co/vikp/layout_segmenter), [column detector](https://huggingface.co/vikp/column_detector))
- Clean and format each block (heuristics, [nougat](https://huggingface.co/facebook/nougat-base))
- Combine blocks and postprocess complete text (heuristics, [pdf_postprocessor](https://huggingface.co/vikp/pdf_postprocessor_t5))

Relying on autoregressive forward passes to generate text is slow and prone to hallucination/repetition. From the nougat paper: "We observed [repetition] in 1.5% of pages in the test set, but the frequency increases for out-of-domain documents." In my anecdotal testing, repetitions happen on 5%+ of out-of-domain (non-arXiv) pages. Nougat is an amazing model, but I wanted a faster and more general-purpose solution. Marker is 10x faster and has low hallucination risk because it only passes equation blocks through an LLM forward pass.

## Examples

| PDF | Type | Marker | Nougat |
|-----|------|--------|--------|
| [Think Python](https://greenteapress.com/thinkpython/thinkpython.pdf) | Textbook | [View](https://github.com/VikParuchuri/marker/blob/master/data/examples/marker/thinkpython.md) | [View](https://github.com/VikParuchuri/marker/blob/master/data/examples/nougat/thinkpython.md) |
| [Think OS](https://greenteapress.com/thinkos/thinkos.pdf) | Textbook | [View](https://github.com/VikParuchuri/marker/blob/master/data/examples/marker/thinkos.md) | [View](https://github.com/VikParuchuri/marker/blob/master/data/examples/nougat/thinkos.md) |
| [Switch Transformers](https://arxiv.org/pdf/2101.03961.pdf) | arXiv paper | [View](https://github.com/VikParuchuri/marker/blob/master/data/examples/marker/switch_transformers.md) | [View](https://github.com/VikParuchuri/marker/blob/master/data/examples/nougat/switch_transformers.md) |
| [Multi-column CNN](https://arxiv.org/pdf/1804.07821.pdf) | arXiv paper | [View](https://github.com/VikParuchuri/marker/blob/master/data/examples/marker/multicolcnn.md) | [View](https://github.com/VikParuchuri/marker/blob/master/data/examples/nougat/multicolcnn.md) |

## Performance

![Benchmark overall](data/images/overall.png)

The above results are with marker and nougat set up so they each take ~3GB of VRAM on an A6000. See [below](#benchmarks) for detailed speed and accuracy benchmarks, and instructions on how to run your own benchmarks.

# Limitations

PDF is a tricky format, so marker will not always work perfectly. Here are some known limitations that are on the roadmap to address:

- Marker will convert fewer equations to LaTeX than nougat. This is because it has to first detect equations, then convert them without hallucination.
- Whitespace and indentations are not always respected.
- Not all lines/spans will be joined properly.
- Only languages similar to English (Spanish, French, German, Russian, etc.) are supported. Languages with different character sets (Chinese, Japanese, Korean, etc.) are not.
- This works best on digital PDFs that won't require a lot of OCR. It's optimized for speed, and limited OCR is used to fix errors.

# Installation

This has been tested on Mac and Linux (Ubuntu and Debian). You'll need Python 3.9+ and [poetry](https://python-poetry.org/docs/#installing-with-the-official-installer).

First, clone the repo:

- `git clone https://github.com/VikParuchuri/marker.git`
- `cd marker`

## Linux

- Install system requirements
  - Optional: Install tesseract 5 by following [these instructions](https://notesalexp.org/tesseract-ocr/html/) or running `scripts/install/tesseract_5_install.sh`.
  - Install ghostscript > 9.55 by following [these instructions](https://ghostscript.readthedocs.io/en/latest/Install.html) or running `scripts/install/ghostscript_install.sh`.
  - Install other requirements with `cat scripts/install/apt-requirements.txt | xargs sudo apt-get install -y`
- Set the tesseract data folder path
  - Find the tesseract data folder `tessdata` with `find / -name tessdata`. Make sure to use the one corresponding to the latest tesseract version if you have multiple.
  - Create a `local.env` file in the root `marker` folder with `TESSDATA_PREFIX=/path/to/tessdata` inside it
- Install python requirements
  - `poetry install`
  - `poetry shell` to activate your poetry venv
- Update pytorch, since poetry doesn't play nicely with it
  - GPU only: run `pip install torch` to install other torch dependencies.
  - CPU only: Uninstall torch, then follow the [CPU install](https://pytorch.org/get-started/locally/) instructions.

## Mac

- Install system requirements from `scripts/install/brew-requirements.txt`
- Set the tesseract data folder path
  - Find the tesseract data folder `tessdata` with `brew list tesseract`
  - Create a `local.env` file in the root `marker` folder with `TESSDATA_PREFIX=/path/to/tessdata` inside it
- Install python requirements
  - `poetry install`
  - `poetry shell` to activate your poetry venv

# Usage

First, some configuration:

- Set your torch device in the `local.env` file. For example, `TORCH_DEVICE=cuda` or `TORCH_DEVICE=mps`. `cpu` is the default.
- If using GPU, set `INFERENCE_RAM` to your GPU VRAM (per GPU). For example, if you have 16 GB of VRAM, set `INFERENCE_RAM=16`.
- Depending on your document types, marker's average memory usage per task can vary slightly. You can configure `VRAM_PER_TASK` to adjust this if you notice tasks failing with GPU out-of-memory errors.
- Inspect the other settings in `marker/settings.py`. You can override any settings in the `local.env` file, or by setting environment variables.
- By default, the final editor model is off. Turn it on with `ENABLE_EDITOR_MODEL`.
- By default, marker will use ocrmypdf for OCR, which is slower than base tesseract but higher quality. You can change this with the `OCR_ENGINE` setting.

## Convert a single file

Run `convert_single.py`, like this:

```
python convert_single.py /path/to/file.pdf /path/to/output.md --parallel_factor 2 --max_pages 10
```

- `--parallel_factor` is how much to increase batch size and parallel OCR workers by. Higher numbers will take more VRAM and CPU, but process faster. Set to 1 by default.
- `--max_pages` is the maximum number of pages to process. Omit this to convert the entire document.

Make sure the `DEFAULT_LANG` setting is set appropriately for your document.

## Convert multiple files

Run `convert.py`, like this:

```
python convert.py /path/to/input/folder /path/to/output/folder --workers 10 --max 10 --metadata_file /path/to/metadata.json --min_length 10000
```

- `--workers` is the number of pdfs to convert at once. This is set to 1 by default, but you can increase it to increase throughput, at the cost of more CPU/GPU usage. Parallelism will not increase beyond `INFERENCE_RAM / VRAM_PER_TASK` if you're using GPU.
- `--max` is the maximum number of pdfs to convert. Omit this to convert all pdfs in the folder.
- `--metadata_file` is an optional path to a json file with metadata about the pdfs. If you provide it, it will be used to set the language for each pdf. If not, `DEFAULT_LANG` will be used. The format is:

```
{
  "pdf1.pdf": {"language": "English"},
  "pdf2.pdf": {"language": "Spanish"},
  ...
}
```

- `--min_length` is the minimum number of characters that need to be extracted from a pdf before it will be considered for processing. If you're processing a lot of pdfs, I recommend setting this to avoid OCRing pdfs that are mostly images (it slows everything down).

## Convert multiple files on multiple GPUs

Run `chunk_convert.sh`, like this:

```
MIN_LENGTH=10000 METADATA_FILE=../pdf_meta.json NUM_DEVICES=4 NUM_WORKERS=15 bash chunk_convert.sh ../pdf_in ../md_out
```

- `METADATA_FILE` is an optional path to a json file with metadata about the pdfs. See above for the format.
- `NUM_DEVICES` is the number of GPUs to use. Should be `2` or greater.
- `NUM_WORKERS` is the number of parallel processes to run on each GPU. Per-GPU parallelism will not increase beyond `INFERENCE_RAM / VRAM_PER_TASK`.
- `MIN_LENGTH` is the minimum number of characters that need to be extracted from a pdf before it will be considered for processing.

# Benchmarks

Benchmarking PDF extraction quality is hard. I've created a test set by finding books and scientific papers that have a pdf version and a latex source. I convert the latex to text, and compare the reference to the output of text extraction methods.

Benchmarks show that marker is 10x faster than nougat, and more accurate outside arXiv (nougat was trained on arXiv data). We show naive text extraction (pulling text out of the pdf with no processing) for comparison.

**Speed**

| Method | Average Score | Time per page | Time per document |
|--------|---------------|---------------|-------------------|
| naive | 0.350727 | 0.00152378 | 0.326524 |
| marker | 0.641062 | 0.360622 | 77.2762 |
| nougat | 0.629211 | 3.77259 | 808.413 |

**Accuracy**

First 3 are non-arXiv books, last 3 are arXiv papers.

| Method | switch_trans.pdf | crowd.pdf | multicolcnn.pdf | thinkos.pdf | thinkdsp.pdf | thinkpython.pdf |
|--------|------------------|-----------|-----------------|-------------|--------------|-----------------|
| naive | 0.244114 | 0.140669 | 0.0868221 | 0.366856 | 0.412521 | 0.468281 |
| marker | 0.482091 | 0.466882 | 0.537062 | 0.754347 | 0.78825 | 0.779536 |
| nougat | 0.696458 | 0.552337 | 0.735099 | 0.655002 | 0.645704 | 0.650282 |

Peak GPU memory usage during the benchmark is `3.3GB` for nougat, and `3.1GB` for marker. Benchmarks were run on an A6000.

**Throughput**

Marker takes about 2GB of VRAM on average per task, so you can convert 24 documents in parallel on an A6000.

![Benchmark results](data/images/per_doc.png)

## Running your own benchmarks

You can benchmark the performance of marker on your machine. First, download the benchmark data [here](https://drive.google.com/file/d/1WiN4K2-jQfwyQMe4wSSurbpz3hxo2fG9/view?usp=drive_link) and unzip. Then run `benchmark.py` like this:

```
python benchmark.py data/pdfs data/references report.json --nougat
```

This will benchmark marker against other text extraction methods. It sets up batch sizes for nougat and marker to use a similar amount of GPU RAM for each. Omit `--nougat` to exclude nougat from the benchmark. I don't recommend running nougat on CPU, since it is very slow.

# Commercial usage

Due to the licensing of the underlying models like layoutlmv3 and nougat, this is only suitable for noncommercial usage. I'm building a version that can be used commercially, by stripping out the dependencies below. If you would like to get early access, email me at marker@vikas.sh.

Here are the non-commercial/restrictive dependencies:

- LayoutLMv3: CC BY-NC-SA 4.0. [Source](https://huggingface.co/microsoft/layoutlmv3-base)
- Nougat: CC-BY-NC. [Source](https://github.com/facebookresearch/nougat)
- PyMuPDF: GPL. [Source](https://pymupdf.readthedocs.io/en/latest/about.html#license-and-copyright)

Other dependencies/datasets are openly licensed (doclaynet, byt5), or used in a way that is compatible with commercial usage (ghostscript).

# Thanks

This work would not have been possible without amazing open source models and datasets, including (but not limited to):

- Nougat from Meta
- Layoutlmv3 from Microsoft
- DocLayNet from IBM
- ByT5 from Google

Thank you to the authors of these models and datasets for making them available to the community!
396: datastax/astra-assistants-api: A backend implementation of the OpenAI beta Assistants API
**Similarity score:** 0.83

- [ ] [datastax/astra-assistants-api: A backend implementation of the OpenAI beta Assistants API](https://github.com/datastax/astra-assistants-api)

Astra Assistant API Service
===========================

A drop-in compatible service for the OpenAI beta Assistants API with support for persistent threads, files, assistants, messages, retrieval, function calling and more, using AstraDB (DataStax's DB-as-a-service offering powered by Apache Cassandra and jvector). Compatible with existing OpenAI apps via the OpenAI SDKs by changing a single line of code.

Getting Started
---------------

1. **Create an Astra DB Vector database**

2. Replace the following code:

```python
client = OpenAI(
    api_key=OPENAI_API_KEY,
)
```

with:

```python
client = OpenAI(
    base_url="https://open-assistant-ai.astra.datastax.com/v1",
    api_key=OPENAI_API_KEY,
    default_headers={
        "astra-api-token": ASTRA_DB_APPLICATION_TOKEN,
    }
)
```

Or, if you have an existing Astra DB, you can pass your db_id in a second header:

```python
client = OpenAI(
    base_url="https://open-assistant-ai.astra.datastax.com/v1",
    api_key=OPENAI_API_KEY,
    default_headers={
        "astra-api-token": ASTRA_DB_APPLICATION_TOKEN,
        "astra-db-id": ASTRA_DB_ID
    }
)
```

3. **Create an assistant**

```python
assistant = client.beta.assistants.create(
    instructions="You are a personal math tutor. When asked a math question, write and run code to answer the question.",
    model="gpt-4-1106-preview",
    tools=[{"type": "retrieval"}]
)
```

By default, the service uses AstraDB as the database/vector store and OpenAI for embeddings and chat completion.

Third party LLM Support
-----------------------

We now support many third party models for both embeddings and completion thanks to litellm. Pass the api key of your service using the `api-key` and `embedding-model` headers.

For AWS Bedrock, you can pass additional custom headers:

```python
client = OpenAI(
    base_url="https://open-assistant-ai.astra.datastax.com/v1",
    api_key="NONE",
    default_headers={
        "astra-api-token": ASTRA_DB_APPLICATION_TOKEN,
        "embedding-model": "amazon.titan-embed-text-v1",
        "LLM-PARAM-aws-access-key-id": BEDROCK_AWS_ACCESS_KEY_ID,
        "LLM-PARAM-aws-secret-access-key": BEDROCK_AWS_SECRET_ACCESS_KEY,
        "LLM-PARAM-aws-region-name": BEDROCK_AWS_REGION,
    }
)
```

and again, specify the custom model for the assistant:

```python
assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions="You are a personal math tutor. Answer questions briefly, in a sentence or less.",
    model="meta.llama2-13b-chat-v1",
)
```

Additional examples, including third party LLMs (bedrock, cohere, perplexity, etc.), can be found under `examples`.

To run the examples using poetry:

1. Create a `.env` file in this directory with your secrets.
2. Run:

```shell
poetry install
poetry run python examples/completion/basic.py
poetry run python examples/retreival/basic.py
poetry run python examples/function-calling/basic.py
```

### Coverage

See our coverage report [here](your-coverage-report-link).

### Roadmap

- Support for other embedding models and LLMs
- Function calling
- Pluggable RAG strategies
- Streaming support

#### Suggested labels

{ "key": "llm-function-calling", "value": "Integration of function calling with Large Language Models (LLMs)" }