PygmalionAI aphrodite-engine issues

PygmalionAI / aphrodite-engine

PygmalionAI's large-scale inference engine

https://pygmalion.chat

GNU Affero General Public License v3.0

660 stars 80 forks

issues

[Usage]: OOM crash following Offline Inference setup

#494 eedmond opened 18 hours ago
3
[Bug]: SnowStorm-v1.15-4x8B: Watchdog caught collective operation timeout: WorkNCCL(SeqNum=1, OpType=BROADCAST, NumelIn=128, NumelOut=128, Timeout(ms)=600000)

#493 josephrocca opened 1 day ago
0
[Feature]: An alternative to `max_tokens` which defaults to `minimum(max_tokens, remaining_tokens)`

#492 josephrocca opened 2 days ago
0
[Bug]: /metrics Endpoint Returns 404

#491 adsf0427 opened 3 days ago
0
Fix gguf for mixtral.

#490 sgsdxzy opened 3 days ago
0
[Bug]: unable use all the vram in wsl cuda environment

#489 sorasoras closed 4 days ago
0
[Misc]: INT8 kv quant seems removed.

#488 sorasoras opened 4 days ago
0
[Feature]: WARNING: Model is quantized. Forcing float16 datatype

#487 sorasoras opened 4 days ago
4
[Usage]: Higher Context Length.

#486 Abulhanan opened 4 days ago
2
[Bug]: [rank0]: KeyError: 'input_ids'

#485 ChuanhongLi closed 3 days ago
2
[Bug]: Moe's no longer working

#484 puppetm4st3r opened 5 days ago
1
[Installation]: Docker runs out of CPU swap size on 8 GPUs. How to lower swap_space to be less than 4GB per GPU?

#483 elabz opened 6 days ago
1
[Bug]: Cannot load Mixtral GGUF model?

#482 Nero10578 opened 6 days ago
13
[0.5.4] Release Candidate

#481 AlpinDale opened 1 week ago
0
[Bug]: Fails to start with error UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf8 in position 0: invalid start byte

#480 Nero10578 closed 6 days ago
2
Update kobold lite version

#479 Pyroserenus opened 1 week ago
0
[New Model]: Phi3ForCausalLM

#478 sparsh35 opened 1 week ago
0
[Bug]: Running aphrodite throws ImportError

#477 reuank opened 1 week ago
2
[Feature]: request for support DeepseekV2ForCausalLM.

#476 kk3dmax opened 1 week ago
0
[Bug]: Int8 k/v cache calibrate don't work with QWen model?

#475 bash99 opened 1 week ago
0
Try to fix gguf.

#474 sgsdxzy opened 1 week ago
0
[Bug]: Cannot load llama-3 gguf based models

#473 EugeoSynthesisThirtyTwo closed 1 day ago
1
[Bug]: torch._dynamo.exc.BackendCompilerFailed with command-r-plus

#472 heungson opened 2 weeks ago
3
[Bug]: Cannot load 70b exl2 5bpw model across 4 GPUs.

#471 Ph0rk0z opened 2 weeks ago
11
[Bug]: GPUExecutor throwing 'TypeError: 'type' object is not subscriptable' on 0.5.3

#470 xyzkpsf closed 2 weeks ago
2
Fix quants installation on ROCM

#469 Naomiusearch closed 2 weeks ago
1
[Bug]: Flash attention cannot be used on v0.5.3

#468 Nero10578 opened 2 weeks ago
7
Bump `torch` to 2.3.0

#467 AlpinDale closed 3 weeks ago
0
Fix Navi support

#466 Naomiusearch closed 3 weeks ago
1
Dockerfile: permission update, configurable build jobs, torch 2.3.0

#465 theobjectivedad closed 3 weeks ago
5
[Usage]: Lora Adapter Parameter while inferencing

#464 alokgupta1996 closed 3 weeks ago
1
[Feature]: Exllamav2 Q4 cache

#463 Anthonyg5005 opened 3 weeks ago
2
fix: lora errors

#462 AlpinDale closed 3 weeks ago
0
[Bug]: LoRA fails to load

#461 kubernetes-bad closed 3 weeks ago
1
[Bug]: LoRA broken when TP>1

#460 kubernetes-bad opened 3 weeks ago
0
Fix recursion errors with large amounts of blocks

#459 thomas-xin closed 2 weeks ago
3
[Bug]: PermissionError: [Errno 13] Permission denied: '/app/aphrodite-engine/.triton'

#458 theobjectivedad closed 3 weeks ago
3
Fixed REVISION variable not being passed on.

#456 houmie closed 3 weeks ago
0
Fix to https://github.com/PygmalionAI/aphrodite-engine/issues/318

#455 houmie closed 3 weeks ago
3
Refactor: Quantization

#454 AlpinDale closed 3 weeks ago
1
[Installation]: Installing from source does not work. undefined symbol: _ZN3c104cuda14ExchangeDeviceEa

#453 Nero10578 closed 2 weeks ago
8
[Usage]: What to set to get acceptable performance on Pascal GPUs? (Non-P100)

#452 Nero10578 closed 2 weeks ago
2
[Installation]: Upload Aphrodite v0.5.2 On Pypi.org

#451 Abulhanan closed 2 weeks ago
3
feat: add batch tokenization endpoint along with option for no token ids

#450 ahme-dev closed 2 weeks ago
0
Fix minor bugs in outlines and lmfe.

#449 sgsdxzy closed 3 weeks ago
1
[Installation]: ValueError: 17 is not a valid GGMLQuantizationType

#448 Abulhanan closed 1 month ago
21
[Performance]: Memory Usage Fix for gguf.

#447 Abulhanan closed 1 month ago
3
[Usage]: Please provide the environment variable that closes the KoboldAI Lite page.

#445 online2311 opened 1 month ago
0
fix: restore backwards compatibility with sm_60 (P100 and GP100)

#444 AlpinDale closed 1 month ago
1
Fix out-of-range token crash in OpenAI endpoint

#443 50h100a closed 1 month ago
0