ML models latch onto devices
Latched provides easy-to-use pipelines to run ML models on a variety of devices, such as mobile phones, NVIDIA Jetson, Intel CPUs, and other accelerators.
Latched covers both converting models and deploying them (via the Latched Model Manager and the Latched Devices SDKs).
🤖 Supported ML Tasks
📚 Text:
- Small Language Models, for on-device chatbots and text analysis
  - Llama-3.1-8B-Instruct + OmniQuantW3A16 @ iPhone 15 Pro (coming soon)
  - Other models will be supported soon
- Other tasks will be supported soon
🏞️ Vision:
- Object Detection (coming soon)
- Image Classification (coming soon)
- Other tasks will be supported soon
🗣️ Audio:
- Speech to Text, Automatic Speech Recognition (coming soon)
- Other tasks will be supported soon
Supported Frameworks:
🧩 Latched Components
Latched: The Latched Python library provides hardware-aware optimization. With it, you can export your ML models into hardware-optimized formats.
Latched Model Manager: The Latched Model Manager provides a RESTful API to register and run ML models on various devices (a rough sketch follows this list).
Latched Devices SDKs: The Latched Devices SDKs provide libraries to run ML models on various devices.
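As a rough illustration of how these components fit together, the sketch below registers an exported model artifact with the Model Manager over HTTP. The base URL, route, and payload fields here are hypothetical placeholders, not the documented Model Manager API.

```python
import requests

# Hypothetical sketch: register a hardware-optimized model artifact with the
# Latched Model Manager over its RESTful API. The URL, route, and JSON fields
# below are illustrative assumptions, not the actual Model Manager interface.
MANAGER_URL = "http://localhost:8000"

payload = {
    "name": "llama-3.1-8b-instruct",
    "format": "onnx",                    # e.g. onnx, coreml, openvino
    "target_device": "iphone-15-pro",    # device the artifact was optimized for
    "artifact_path": "exports/llama-3.1-8b-instruct.onnx",
}

response = requests.post(f"{MANAGER_URL}/models", json=payload, timeout=30)
response.raise_for_status()
print("Registered:", response.json())
```

On the device side, the Latched Devices SDKs provide the runtime that actually executes the registered model.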
🚀 Getting Started
Installation
- Clone the repository
git clone https://github.com/TBD-Labs-AI/latched.git
cd latched
- Create a virtual environment with Python 3.11.9 and activate it.
conda create -n latched python=3.11.9
conda activate latched
- Install the dependencies with Poetry
pip install poetry
poetry install
- Launch the test script (ONNX export)
python examples/llama-3.1-8B-Instruct-to-onnx/llama_onnx_example.py
How to use Latched
- Export Hugging Face models to the ONNX format (a rough sketch follows below)
- Export Hugging Face models to the OpenVINO format
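The bundled example script wraps this kind of export flow. As a standalone sketch of what exporting a Hugging Face causal LM to ONNX looks like (using Hugging Face Optimum directly, not the Latched API itself):

```python
# Standalone sketch using Hugging Face Optimum, not the Latched API itself:
# load a Hugging Face causal-LM checkpoint, convert it to ONNX, and save it.
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer

model_id = "meta-llama/Llama-3.1-8B-Instruct"  # any causal-LM checkpoint works

# export=True converts the PyTorch weights into an ONNX graph on load
model = ORTModelForCausalLM.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

model.save_pretrained("llama-3.1-8b-instruct-onnx")
tokenizer.save_pretrained("llama-3.1-8b-instruct-onnx")

# The OpenVINO path is analogous via optimum-intel:
#   from optimum.intel import OVModelForCausalLM
#   ov_model = OVModelForCausalLM.from_pretrained(model_id, export=True)
```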
📚 Model Hub
coming soon
Contributing
Do you believe the future of AI is in edge computing? Do you want to make it happen?
Join Latched as a contributor!
If you want to contribute to Latched, please read the CONTRIBUTING.md file.
📅 Milestones
SEP 2024
- [ ] Optimize Phi 3.5 mini model
- [ ] Export Phi 3.5 mini model to
  - [ ] CoreML
  - [ ] TensorFlow Lite
  - [ ] TensorRT
  - [ ] OpenVINO
  - [ ] ONNX
- [ ] Optimize Phi 3.5 mini model for
  - [ ] Apple iPhone 15 Pro
  - [ ] Samsung Galaxy S24
  - [ ] Nvidia Jetson
  - [ ] Intel CPU
  - [ ] Intel Gaudi2
  - [ ] Rebellion ATOM
  - [ ] AWS Inferentia
- [ ] Register Phi 3.5 mini model to Model Manager
- [ ] Create Swift example code to run
  - [ ] Phi 3.5 mini model on Apple iPhone 15 Pro
  - [ ] Phi 3.5 mini model on Samsung Galaxy S24
  - [ ] Phi 3.5 mini model on Nvidia Jetson
  - [ ] Phi 3.5 mini model on Intel CPU
  - [ ] Phi 3.5 mini model on Intel Gaudi2
  - [ ] Phi 3.5 mini model on Rebellion ATOM
  - [ ] Phi 3.5 mini model on AWS Inferentia
- [ ] Release Benchmark Dashboard of Phi 3.5 mini model on each device
🤝 Acknowledgements
This repository uses the following third-party libraries: