vectorch-ai / ScaleLLM

A high-performance inference system for large language models, designed for production environments.
https://docs.vectorch.com/
Apache License 2.0

ScaleLLM Roadmap #84

Open guocuimi opened 3 months ago

guocuimi commented 3 months ago

We're excited to present the features we're currently working on and planning to support in this roadmap document. Your feedback is highly valued, so please don't hesitate to comment or reach out if you have anything you'd like to add or discuss. We're committed to delivering the best possible experience with ScaleLLM.

Q1-Q2 2024

Efficiency

Cache

New Models

New Devices

Usability

New GPU Architecture

Structural Decoding

Quantization

Supported Operating Systems

Misc

omarmhaimdat commented 2 months ago

I think LLaMA 3 should be added as well, and probably should be high priority.

guocuimi commented 2 months ago

> I think LLaMA 3 should be added as well, and probably should be high priority.

Yes, Llama 3 is already supported; please check the latest release: https://github.com/vectorch-ai/ScaleLLM/releases/tag/v0.0.8

omarmhaimdat commented 2 months ago

Wow @guocuimi, thank you for the quick update! You guys rock!