-
Here is my understanding of the current state of things and what I think we should do to make our lower-bit kernels more performant at both small and large batch sizes. I'm making this an RFC …
-
### Checklist
- [X] I've checked that there is no other issue about this feature request.
- [X] This issue contains only one feature request.
- [X] The title of this issue accurately describes the fe…
-
Since ba01ad37, LoRAs loaded in 8-bit alongside the Q8_0 GGUF generate at poor quality. Loading the LoRA in 16-bit appears to fix this issue, but there are subtle differences in the generations from rounding…
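The rounding effect described above can be reproduced in isolation with a toy Q8_0-style round trip. This is only a sketch of per-block absmax int8 quantization; the actual llama.cpp kernel layout differs in detail:

```python
import numpy as np

def q8_0_roundtrip(w, block=32):
    """Quantize to Q8_0-style blocks (per-block fp scale + int8) and back.
    Illustrative sketch only, not the real llama.cpp implementation."""
    w = w.reshape(-1, block)
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    scale[scale == 0] = 1.0  # avoid division by zero on all-zero blocks
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return (q * scale).reshape(-1)

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
err = np.abs(q8_0_roundtrip(w) - w).max()
print(f"max round-trip error: {err:.5f}")  # small but nonzero
```

The error is bounded by half the per-block scale, which is why 16-bit LoRA weights sidestep it but still differ subtly from 8-bit runs.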
-
While testing the --load-in-low-bit feature with the vLLM for CPU example, I noticed that the model is not optimized based on this option.
I found that it needs to pass the load_in_low_bit ar…
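A minimal sketch of the fix being described: the CLI flag has to be forwarded into model loading rather than silently dropped. Argument and parameter names here are hypothetical and may not match the actual example script:

```python
import argparse

def build_parser():
    # Hypothetical sketch: the real script's argument names may differ.
    p = argparse.ArgumentParser()
    p.add_argument("--load-in-low-bit", dest="load_in_low_bit",
                   default=None, help="e.g. sym_int4, fp8")
    return p

args = build_parser().parse_args(["--load-in-low-bit", "sym_int4"])
# The fix: forward the parsed value into the model-loading kwargs
# instead of ignoring it after parsing.
model_kwargs = {"load_in_low_bit": args.load_in_low_bit}
print(model_kwargs)
```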
-
Every time I run the test, it loads the original model and converts it to lower bit.
If we load a 34B model on 4 ARC cards, it takes a long time to convert the model and also needs a huge number o…
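The convert-once, reuse-later workflow being asked for can be sketched generically: quantize on the first run, then load the cached low-bit artifact on later runs. All names, the 4-bit scheme, and the cache layout below are illustrative assumptions, not the framework's actual mechanism:

```python
import numpy as np
from pathlib import Path

def load_converted(weight_path, cache_dir="converted_cache"):
    """Sketch: quantize fp32 weights to 4-bit symmetric absmax once,
    cache the result, and reuse the cache on subsequent runs."""
    cache = Path(cache_dir) / (Path(weight_path).stem + ".npz")
    if cache.exists():
        d = np.load(cache)            # fast path: skip reconversion
        return d["q"], d["scale"]
    w = np.load(weight_path)          # slow path: original fp32 weights
    scale = max(np.abs(w).max() / 7.0, 1e-8)
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    cache.parent.mkdir(parents=True, exist_ok=True)
    np.savez(cache, q=q, scale=scale)
    return q, scale
```

With something like this, only the first launch pays the conversion cost; every later launch reads the much smaller int8 artifact.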
-
I tried many UNet settings, such as dev-fp16 with automatic Diffusion in Low Bits, dev-fp16 with fp8e4m3fn in Low Bits, and dev-fp8_e4m3fn with automatic Diffusion in Low Bits, but for every single UNet settin…
-
Dear IPEX Team,
I was wondering if there is a way to save a model that has been optimised and quantised in its new state, for future loading, for HF/PyTorch models.
I noticed there was a method i…
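Independent of whatever framework-specific save method the snippet above is referring to, the quantised state can always be persisted generically: serialize the int8 tensors plus their scales yourself and restore them on load. This is a framework-agnostic sketch, not the IPEX API:

```python
import numpy as np

def save_quantized(state, path):
    """state: {name: (int8 ndarray, float scale)} -> one .npz on disk."""
    flat = {}
    for name, (q, s) in state.items():
        flat[f"{name}.q"] = q
        flat[f"{name}.scale"] = np.float32(s)
    np.savez(path, **flat)

def load_quantized(path):
    """Inverse of save_quantized: rebuild the {name: (q, scale)} dict."""
    d = np.load(path)
    names = {k[:-2] for k in d.files if k.endswith(".q")}
    return {n: (d[f"{n}.q"], float(d[f"{n}.scale"])) for n in names}
```

Dequantizing at load time (`q * scale`) then reproduces the optimised weights without redoing the quantisation pass.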
-
```
What steps will reproduce the problem?
1.just running a benchmark
2.
3.
What is the expected output? What do you see instead?
Computed 2274.76 PMKs/s total.
#1: 'CUDA-Device #1 'GeForce 8400 GS''…