-
### Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits
### What would your feature do?
In both UIs, A1111 and Forge, in **Opti…
-
### Your current environment
I am trying out FP8 support on AMD GPUs (MI250, MI300), but the vLLM library does not yet seem to support FP8 quantization on AMD GPUs. Is there any timeline for when thi…
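For reference, this is roughly how FP8 quantization is requested in vLLM on hardware that already supports it (currently NVIDIA Ada/Hopper); a minimal sketch, with the model name only a placeholder:

```python
from vllm import LLM, SamplingParams

# Minimal sketch: ask vLLM to quantize weights to FP8 on the fly.
# The model name is a placeholder; pick any supported checkpoint.
llm = LLM(model="meta-llama/Llama-2-7b-hf", quantization="fp8")
outputs = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```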
-
Hi,
I'd like to test FP8 on an RTX 4090. I can find BF16 MMA atoms such as SM80_16x8x8_F32BF16BF16F32_TN in cutlass/include/cute/arch/mma_sm80.hpp, but I can't find corresponding FP8 atoms like SM80_1…
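One likely reason no such SM80 atoms exist: FP8 tensor-core MMA instructions were only introduced with SM89 (Ada, which includes the RTX 4090) and SM90 (Hopper), so SM80 (A100) has no FP8 MMA path. A quick Python-side sanity check, as a sketch assuming PyTorch ≥ 2.1 (which exposes FP8 storage dtypes):

```python
import torch

# Check whether this GPU has an FP8 tensor-core path: SM89 (Ada) or newer.
major, minor = torch.cuda.get_device_capability()
print(f"sm_{major}{minor}: FP8 MMA {'available' if (major, minor) >= (8, 9) else 'unavailable'}")

# PyTorch >= 2.1 exposes FP8 as a storage dtype regardless of the GPU.
x = torch.randn(16, 16, device="cuda")
x_fp8 = x.to(torch.float8_e4m3fn)           # cast to FP8 E4M3
print((x_fp8.float() - x).abs().max())      # rough quantization error
```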
-
Just FYI, I think this is failing because of a LoRA with only certain blocks trained:
```
  File "flux-fp8-api/flux_pipeline.py", line 163, in load_lora
    self.model = lora_loading.apply_lora_to_…
```
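For context, a defensive merge loop of the general shape below would tolerate partially trained LoRAs; this is a hypothetical sketch (the function name and key layout are assumptions, not the flux-fp8-api code):

```python
import torch

def merge_lora(base_sd, lora_sd, alpha=1.0):
    """Hypothetical sketch, not the flux-fp8-api implementation: fold LoRA
    deltas into base weights, skipping blocks the LoRA never trained."""
    for name, weight in base_sd.items():
        a = lora_sd.get(f"{name}.lora_A")  # assumed key naming convention
        b = lora_sd.get(f"{name}.lora_B")
        if a is None or b is None:
            continue  # this block has no trained LoRA weights; leave it as-is
        base_sd[name] = weight + alpha * (b @ a).to(weight.dtype)
    return base_sd
```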
-
- CPU architecture: x86_64
- GPU: NVIDIA H100
- Libraries:
  - TensorRT-LLM: v0.11.0
  - TensorRT: 10.1.0
  - ModelOpt: 0.13.1
  - CUDA: 12.3
- NVIDIA driver version: 535.129.03
Hello, I'm e…
-
### System Info
GPU Name: 8 × H20
TensorRT-LLM: 0.11.0
NVIDIA-SMI: 535.154.05, Driver Version: 535.154.05, CUDA Version: 12.4
### Who can help?
_No response_
### Information
- [x] The official exam…
-
### 🚀 The feature, motivation and pitch
Hi, the code runs fine; it's just that the generated comments and names are a bit confusing.
Say we have a function with some torch ops at the beginning…
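To illustrate the kind of auto-generated names in question, here is a small sketch using torch.fx tracing (the function is just an example):

```python
import torch
import torch.fx

def f(x):
    y = torch.relu(x)
    return y + y.sum()

# Tracing auto-names every intermediate node (relu, sum_1, add, ...),
# and those names flow into the generated code and its comments.
gm = torch.fx.symbolic_trace(f)
print(gm.code)
```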
-
Thanks for open-sourcing FA3, good job! I am wondering about the FP8 feature.
**Compatibility**: Are the NVIDIA L40 and A100 GPUs compatible with the Flash Attention 3 FP8 feature?
**Performance…
-
When I try to use the `fp8_model_init` feature, it doesn't seem compatible with DDP; it throws an error:
`RuntimeError: Modules with uninitialized parameters can't be used with "DistributedDataParal…`
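For reference, the pattern being described looks roughly like this (a sketch assuming Transformer Engine's PyTorch API; the layer sizes are placeholders):

```python
import torch
import torch.distributed as dist
import transformer_engine.pytorch as te

# Sketch of the reported pattern; assumes launch via torchrun so a process
# group can be initialized. fp8_model_init keeps module weights in FP8.
dist.init_process_group(backend="nccl")
with te.fp8_model_init(enabled=True):
    layer = te.Linear(1024, 1024)

# This wrap is the step where the "uninitialized parameters" error is raised.
model = torch.nn.parallel.DistributedDataParallel(layer)
```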
-
It would be helpful to add a documentation category covering quantized models, which account for a significant share of the interest in Flux,
with emphasis on FP8, GGUF Q8, bnb-NF4, and more.