-
I want to use INT8 matmul, and the code/output is as follows:
### Code
```
import bitblas
import torch

bitblas.set_log_level("Debug")

matmul_config = bitblas.MatmulConfig(
    M=16,  # M dime…
```
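Since the config above is cut off, here is a minimal sketch of what a complete INT8 BitBLAS matmul setup might look like; the N/K shapes, layout, and int32 accumulator are assumptions, not the original poster's values:

```
import bitblas
import torch

# Hypothetical INT8 x INT8 -> INT32 configuration; N, K, and the
# layout are assumed, since the original snippet ends at M=16.
matmul_config = bitblas.MatmulConfig(
    M=16,
    N=1024,
    K=1024,
    A_dtype="int8",       # activation dtype
    W_dtype="int8",       # weight dtype
    accum_dtype="int32",  # accumulate in int32 to avoid overflow
    out_dtype="int32",
    layout="nt",          # A row-major, W transposed
)
matmul = bitblas.Matmul(config=matmul_config)

# Random int8 operands on the GPU; BitBLAS expects the weight to be
# transformed into its internal layout before the call.
a = torch.randint(-8, 8, (16, 1024), dtype=torch.int8).cuda()
w = torch.randint(-8, 8, (1024, 1024), dtype=torch.int8).cuda()
w_t = matmul.transform_weight(w)
c = matmul(a, w_t)
```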
-
### Description & Motivation
Both the Fabric and Trainer strategies are designed to have a single plugin enabled from the beginning to the end of the program.
This has been fine historically, ho…
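To make the constraint concrete, a minimal sketch assuming the Lightning 2.x API: the precision plugin is chosen once when the `Trainer` is constructed and remains the single active plugin for the whole run.

```
from lightning import Trainer
from lightning.pytorch.plugins import MixedPrecision

# The single plugin passed here stays enabled from the beginning
# to the end of the program; there is no supported way to swap it
# once training has started.
trainer = Trainer(plugins=[MixedPrecision(precision="16-mixed", device="cuda")])
```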
-
I've tried different releases of forge (cu121 with torch21; cu121 with torch231; cu124 with torch24), but I get an error when loading the flux1-dev-fp8 model.
I also tried changing GPU Weights or swap loca…
-
Hi,
Has anyone tried OpenMM at floating-point precision lower than FP32? Can one still run simulations in FP16 or FP8? Which operations could ideally be moved to lower precision?
Thanks!
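For context, precision in OpenMM is selected per platform, and the CUDA/OpenCL platforms only expose "single", "mixed", and "double"; a minimal sketch of how that is set today (FP16/FP8 would require changes inside the kernels themselves, not just a property):

```
from openmm import Platform

# The CUDA platform accepts a "Precision" property with the values
# "single", "mixed", or "double"; nothing below FP32 is exposed.
platform = Platform.getPlatformByName("CUDA")
properties = {"Precision": "mixed"}
# The properties dict is then passed when building the Simulation,
# e.g. Simulation(topology, system, integrator, platform, properties).
```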
-
### System Info
- GPU name: L40s
- CUDA: 12.1
```
Wed Jun 5 16:27:21 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14 …
```
-
Please add an option to adjust GPU Weights, since my GPU only has 6 GB of VRAM.
My laptop RTX 3060 can run normal FP8 within 100-150 sec,
but it takes super long with NF4 (my GPU runs at 99% all the time an…
-
I'm currently testing Llama2 70B on DGX-A100 and DGX-H100. I'm running the gptManagerBenchmark as described [here](https://github.com/NVIDIA/TensorRT-LLM/tree/release/0.5.0/benchmarks/cpp) and compari…
-
I'm having this error, which I assume indicates that I don't have enough VRAM. However, I'm able to run the FP8 version of flux-dev and this exact same model in the Forge webui with no issues at all, s…
-
I can't use the FP8 transformer on my 3090 Ti (24 GB). I tried the PyTorch nightly (2.5.0) and the latest release (2.4.0); same error every time:
![2024-08-28_12-14-24](https://github.com/user-attachments/assets/…
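Not part of the original report, but a likely explanation: PyTorch's FP8 dtypes exist in recent builds, while hardware-accelerated FP8 matmul requires compute capability 8.9 (Ada/Hopper), and a 3090 Ti is SM 8.6. A quick check, as a sketch:

```
import torch

# torch.float8_e4m3fn exists since PyTorch 2.1, but the FP8 matmul
# kernels behind it (torch._scaled_mm) require compute capability
# >= 8.9 (Ada/Hopper). A 3090 Ti reports SM 8.6.
major, minor = torch.cuda.get_device_capability()
print("compute capability:", (major, minor))
print("hardware FP8 matmul supported:", (major, minor) >= (8, 9))
```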