-
Flux LoRA training is bleeding concepts: if you train on many people at the same time, they all get mixed together, and the unique tokens assigned to each character are ignored. I think the problem is that it is training only…
-
Hello @mgoin, it's a pleasant surprise to discover this project. Thank you for your contributions to BitBLAS. We have recently added support for FP8 Matmul, hoping it will help this project.
-
Please see this commit that Comfy pushed earlier today, which fixes the issue where some Flux LoRAs are very weak when used along with fp8. It would be great if Forge were similarly updated so there is co…
-
Hello!
I am trying to implement multi-stage training with fp8 autocast. However, when I load the checkpoint from the first training stage using torch's `load_state_dict(...)`, the loss quickly explodes.
Are th…
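For reference, here is a minimal sketch of the resume pattern being described (the model architecture and file name are hypothetical placeholders): save both the model and optimizer state at the end of stage one, then restore both at the start of stage two. Keeping `strict=True` in `load_state_dict` surfaces any missing keys, such as fp8 scaling buffers, instead of silently reinitializing them, which is one common cause of an exploding loss after resuming.

```python
import torch
import torch.nn as nn

# Hypothetical stage-one model: any nn.Module works the same way.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# End of stage one: save model AND optimizer state together.
torch.save(
    {"model": model.state_dict(), "optim": optimizer.state_dict()},
    "stage1.pt",
)

# Start of stage two: rebuild the same architecture, then restore.
model2 = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
optimizer2 = torch.optim.AdamW(model2.parameters(), lr=1e-3)

ckpt = torch.load("stage1.pt")
# strict=True (the default) raises if any keys -- e.g. fp8 scaling
# buffers registered on the module -- are missing or unexpected.
model2.load_state_dict(ckpt["model"], strict=True)
optimizer2.load_state_dict(ckpt["optim"])

# Sanity check: parameters match exactly after the restore.
for p1, p2 in zip(model.parameters(), model2.parameters()):
    assert torch.equal(p1, p2)
```

If the loss still explodes with a strict restore, comparing the two state dicts' key sets before loading is a quick way to spot precision-related buffers that were never saved.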
-
# Release Manager
@cp5555
# Endgame
- [x] Code freeze: Feb. 9th, 2024
- [x] Bug Bash date: Feb. 12th, 2024
- [x] Release date: Feb. 23rd, 2024
# Main Features
## MS-AMP O3 Optimization
-…
-
Using hires fix produces vertical/horizontal banding artifacts. I have tested with different hires upscalers; some are less noticeable, but the banding is still visible depending on the image. It is also more prominent if we…
-
Here is the development roadmap for 2024 Q3. Contributions and feedback are welcome.
## Server API
- [ ] Add APIs for using the inference engine in a single script without launching a separate se…
-
This issue is to track the new design required for flash-attention in the bottom-up optimization pipeline.
## Status
Most of the optimization passes have been finished and checked in to llvm-targ…
-
https://github.com/triton-lang/triton/blob/95623038c75463286aa5d4a44782ba7492cc1afa/python/triton/language/semantic.py#L761C1-L763C1
How do I resolve this?
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A…