inference-benchmark Search Results

1000+ results
for inference-benchmark


microsoft/T-MAC #32

8gen3 T-MAC cpu performance issue

Hi there, I am using an 8Gen3 (Xiaomi 14 Pro, 68 GB/s bandwidth) and following the "Android Cross Compilation Guidance, Option 1: Use Prebuilt Kernels" guide to test llama-2-7b-4bit token generation performance. It…

AndreaChiChengdu updated 1 month ago
9
huggingface/diffusers #7785

Very Slow first inference with diffusers 0.27.X

### Describe the bug Hello diffusers team! I have faced an annoying issue since I upgraded diffusers to 0.27.X. The first call (and only the first) of pipeline(...) now takes a lot of time …

nesscube updated 1 month ago
6
scikit-hep/pylhe #181

docs: Add list of pylhe citations

We should get docs up in general, but @lukasheinrich pointed out that we should probably be tracking pylhe citations as well. This is what I have just from `https://www.google.com/search?q=pylhe+si…

matthewfeickert updated 1 year ago
4
nod-ai/SHARK-Studio #428

Unnecessary memory usage with SharkInference

The easiest place to see the excess memory usage is with stable_diffusion, e.g. `python shark/examples/shark_inference/stable_diffusion/main.py --precision=fp32`, where we see ~24 GB of mem…

qedawkins updated 1 year ago
1
SYSTRAN/faster-whisper #598

Incorporating flash-attention2 [SOLVED] and subsequent testi…

Hello all. Just thought I'd post a question about Flash Attention 2 here: [https://github.com/Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention) Apparently it's making big…

BBC-Esq updated 3 months ago
19
blei-lab/edward #464

Edward Roadmap

Following the [2017 TensorFlow Dev Summit](https://events.withgoogle.com/tensorflow-dev-summit/#content), here is an outline of Edward going forward at least for Spring 2017. Of course, comments are …

dustinvtran updated 4 years ago
5
pytorch/serve #1692

TorchServe How to Curl Multiple Images Properly

I am using TorchServe to potentially serve a model from MMOCR (https://github.com/open-mmlab/mmocr), and I have several questions: 1. I tried to do inference on hundreds of images together using batc…

Hegelim updated 2 months ago
8
bigscience-workshop/Megatron-DeepSpeed #163

[Tensorboard] Log text prediction in evaluation

A very useful tool for understanding model performance beyond the loss value: actually show what the predictions are. It'd be very useful to be able to "see" the output of the model during eva…

thomasw21 updated 2 years ago
14
dask/dask-ml #697

Include Bayesian sampling in Hyperband implementation

The paper "[BOHB: Robust and Efficient Hyperparameter Optimization at Scale][1]" includes an interesting parallelization technique for Bayesian sampling in a Hyperband implementation. In Section 4.2 t…

stsievert updated 2 years ago
6
PaddlePaddle/PaddleClas #1114

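PaddleClas is too slow at recognizing a single image

paddlepaddle 2.1.1 (CPU version), paddleclas. The model and data used are those given in https://github.com/PaddlePaddle/PaddleClas/blob/release/2.2/docs/zh_CN/tutorials/quick_start_recognition.md. Recognizing a single image takes: 1. On Win10 (PC): Inference: 2525…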

davidwkx updated 7 months ago
6
