-
# Background
I experimented with a Rust-based exo implementation that used [UniFFI](https://mozilla.github.io/uniffi-rs/latest/) for foreign-language bindings so I could run it from a Swift iOS app…
-
## Goal-State/What/Result
Have separate Tokio runtimes for query/AI computation (inference, embeddings, etc.) and for acceleration refreshes. The goal is that all queries stay fast no matter what e…
-
## Description
I tried to reference the following document directly, tools/pytorch-quantization/pytorch_quantization/calib/histogram.py, and to use HistogramCalibrator.compute_amax() to calculate the max…
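For context, the percentile-style amax that `HistogramCalibrator.compute_amax()` produces can be sketched standalone. The helper below is a hypothetical NumPy re-implementation of the underlying idea (a histogram of absolute values plus a cumulative-fraction cutoff), not the library's actual code; in pytorch_quantization itself the usual pattern is roughly `calibrator.collect(x)` followed by `calibrator.compute_amax("percentile", percentile=99.99)`.

```python
import numpy as np

def percentile_amax(x, num_bins=2048, percentile=99.99):
    """Histogram-based amax: the smallest bin edge that covers `percentile`%
    of the absolute values (standalone sketch of the percentile method)."""
    ax = np.abs(np.asarray(x, dtype=np.float64).ravel())
    hist, edges = np.histogram(ax, bins=num_bins)
    cdf = np.cumsum(hist) / ax.size          # accumulated fraction of values per bin
    idx = int(np.searchsorted(cdf, percentile / 100.0))
    return float(edges[min(idx + 1, num_bins)])

vals = np.linspace(0.0, 100.0, 10_001)         # synthetic activation magnitudes
amax = percentile_amax(vals, percentile=99.0)  # clips the top 1% of magnitudes
```

With a uniform spread of values in [0, 100], a 99th-percentile amax lands just above 99, i.e. the largest 1% of magnitudes would be clipped at quantization time.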
-
Hello! Thank you very much for your excellent work, which enables the distributed running of large models on heterogeneous devices! I was wondering if this project supports Android devices. I am curre…
-
I am using AutoModelForSequenceClassification to do classification with a large model. Can I use this library, and how should I use it?
Additionally, if my output is only one token and I do batch inference, w…
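Classification with AutoModelForSequenceClassification is typically batched by padding the inputs to a common length and taking an argmax over the per-sequence logits, so each sequence in the batch yields exactly one label. The sketch below uses a tiny randomly-initialized BERT config so it runs without downloading a checkpoint; the config sizes and label count are placeholder assumptions, and for real use you would load your model with `from_pretrained`.

```python
import torch
from transformers import BertConfig, AutoModelForSequenceClassification

# Tiny random-weight model so the sketch is self-contained (placeholder sizes);
# in practice: AutoModelForSequenceClassification.from_pretrained("<your-checkpoint>")
cfg = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=2,
                 num_attention_heads=2, intermediate_size=64, num_labels=3)
model = AutoModelForSequenceClassification.from_config(cfg)
model.eval()

# A batch of 4 sequences, 16 tokens each (normally produced by a tokenizer
# called with padding=True); attention_mask marks which positions are real tokens.
input_ids = torch.randint(0, cfg.vocab_size, (4, 16))
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    logits = model(input_ids=input_ids, attention_mask=attention_mask).logits

preds = logits.argmax(dim=-1)  # one class id per sequence in the batch
```

Because the head emits one logit vector per sequence rather than per token, batching is just stacking: `logits` has shape `(batch_size, num_labels)` and `preds` has one entry per input text.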
-
**Describe the bug**
After resuming from sleep, inference on an NVIDIA GPU doesn't work until the system is restarted.
**Steps to reproduce**
Steps to reproduce the behavior:
1. Setup GPU Accelerat…
-
Hi, thanks for your great work! Your code is based on PyTorch Lightning. When I deployed the model on a single machine with multiple GPUs, it started several GLOBAL processes, which is necessary for …
-
Intel, AMD, Qualcomm, etc. are shipping powerful NPUs (40+ TOPS) for inference.
Is there any plan to include functionality in ML.NET to run inference on these models easily from C# offl…
-
Is the lambdalabs/sd-image-variations-diffusers model SD 1.4? Is it possible to use the SD 1.5 model and accelerate generation down to a few steps with Hyper-SD, or is there any other solution that can opti…
-
## 🚀 Feature
Add support for inference on Qualcomm NPU devices
## Motivation
Qualcomm released an AI SDK which includes the ability to run models on its Qualcomm® Hexagon™ NPU; adding this feature wo…