int8-inference Search Results

1000+ results for int8-inference

Best match

pytorch/ao #789

'FakeTensor' object has no attribute 'layout_type'

Getting a big whopper of an error trying to apply the optimizations shown in the [directions for CogVideoX](https://huggingface.co/THUDM/CogVideoX-2b) ``` -----------------------------------------…

WAS-PlaiLabs updated 4 days ago
4
valdivj/Deepstream-IGN-Maker-YOLO #1

yolov3.weights file missing

Hi there, I cloned this repo to run the deepstream_ignition_IP_File_rtsp_yolo.py file, but it turns out that the yolo.weights file is missing. Can you please upload it or share the link to it...

utkarsh23kushwaha updated 1 year ago
6
huggingface/optimum-quanto #232

Why latency of quantized model is even more than unquantized…

Based on my understanding, a quantized model (e.g., INT8 version) should run faster than an FP32 model, since the hardware has a specific acceleration unit for INT8 data computation. But the experimen…

ZhangYuef updated 1 month ago
3
kserve/kserve #1523

Need details on KFS V2 based GRPC method for inference

We have a few questions specific to the KFS V2 based GRPC method for inference. 1. Is KFS V2 also meant to support tabular data based payloads, or only tensor based ML/DL workloads? 2. As KFS V2 …

vegoutha updated 3 years ago
9
Stability-AI/StableLM #17

GPU support Table & VRAM usage

It would be great to get the instructions to run the 3B model locally on a gaming GPU (e.g. 3090/4090 with 24GB VRAM). ### Confirmed GPUs From this thread | GPU Model | VRAM (GB) | Tuned-3b | T…

enricoros updated 1 year ago
34
open-mmlab/mmpose #9

Roadmap of MMPose

We keep this issue open to collect feature requests from users and hear your voice. Our monthly release plan is also available here. You can either: 1. Suggest a new feature by leaving a comment. …

hellock updated 1 year ago
76
HazyResearch/ThunderKittens #23

[Feature Request] GEMM benchmarks and FP8 Support

I really like the simplicity of TK and think it could be broadly applicable to kernel authoring beyond attention. Has there been any benchmarking done of pure GEMM operations? If so, an example would …

jwfromm updated 3 months ago
7
Samsung/ONE #7598

[luci-interpreter] Benchmark modes from tflite-micro

This is continuation of #5080, but with more specific goals. ### Goal To measure luci-interpreter performance and memory consumption for models delivered with tflite-micro: https://github.com/tens…

binarman updated 2 years ago
31
mlfoundations/open_clip #835

Triton error in int8-support

I am following the int8 tutorial at https://github.com/mlfoundations/open_clip?tab=readme-ov-file#int8-support but I cannot make it work with the latest version of open clip. Installing the require…

aleablu updated 1 month ago
2
pytorch/pytorch #131768

DISABLED test_retrace_export_while_loop_simple_cpu_float32 (…

Platforms: linux This test was disabled because it is failing in CI. See [recent examples](https://hud.pytorch.org/flakytest?name=test_retrace_export_while_loop_simple_cpu_float32&suite=TestHOPCPU&li…

pytorch-bot[bot] updated 3 weeks ago
4
