PygmalionAI / aphrodite-engine

Large-scale LLM inference engine
https://aphrodite.pygmalion.chat
GNU Affero General Public License v3.0

[Bug]: .\gguf_to_torch.py broken along with direct load GGUF #804

Open sorasoras opened 1 week ago

sorasoras commented 1 week ago

Your current environment

The output of `python env.py`:

```text
Collecting environment information...
W:\windows_cuda\aphrodite-engine\aphrodite\common\logger.py:66: SyntaxWarning: invalid escape sequence '\<'
  message = message.replace("{", "{{").replace("}", "}}").replace("<", "\<")
PyTorch version: 2.4.1+cu124
Is debug build: False
CUDA used to build PyTorch: 12.4
ROCM used to build PyTorch: N/A
OS: Microsoft Windows 11 Pro
GCC version: (MinGW-W64 x86_64-ucrt-posix-seh, built by Brecht Sanders, r8) 13.2.0
Clang version: Could not collect
CMake version: version 3.29.2
Libc version: N/A
Python version: 3.12.5 (tags/v3.12.5:ff3bc82, Aug 6 2024, 20:45:27) [MSC v.1940 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-11-10.0.22631-SP0
Is CUDA available: True
CUDA runtime version: 12.4.99
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: Tesla P40
Nvidia driver version: 551.61
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture=9
CurrentClockSpeed=4201
DeviceID=CPU0
Family=107
L2CacheSize=16384
L2CacheSpeed=
Manufacturer=AuthenticAMD
MaxClockSpeed=4201
Name=AMD Ryzen 9 7950X3D 16-Core Processor
ProcessorType=3
Revision=24834

Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] pyzmq==26.2.0
[pip3] torch==2.4.1+cu124
[pip3] transformers==4.45.2
[pip3] triton==3.1.0
[conda] Could not collect
ROCM Version: Could not collect
Neuron SDK Version: N/A
Aphrodite Version: 0.6.3.post1
Aphrodite Build Flags:
CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
Could not collect
```

🐛 Describe the bug

```text
> .\examples\gguf_to_torch.py --input .\sakura-14b-qwen2.5-v1.0-iq4xs.gguf --output .\out\ --unquantized-path .\Qwen2.5-14b-instruct\
Traceback (most recent call last):
  File "W:\windows_cuda\aphrodite-engine\examples\gguf_to_torch.py", line 5, in <module>
    from aphrodite.modeling.hf_downloader import convert_gguf_to_state_dict
ModuleNotFoundError: No module named 'aphrodite.modeling.hf_downloader'
```

AlpinDale commented 1 week ago

That script is outdated, I forgot to remove it. We've been doing implicit conversion of GGUF models for a very long time.
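For context, a minimal sketch of what "implicit conversion" normally looks like: the GGUF file is passed directly as the model path instead of being converted up front. The entry point and flags below are assumptions based on Aphrodite's OpenAI-compatible server (the GGUF filename is the one from this report), not commands taken from this thread:

```shell
# Hypothetical invocation: pass the local GGUF file straight to the engine
# as the model path and let it be converted implicitly at load time.
python -m aphrodite.endpoints.openai.api_server \
    --model .\sakura-14b-qwen2.5-v1.0-iq4xs.gguf
```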

sorasoras commented 1 week ago

> That script is outdated, I forgot to remove it. We've been doing implicit conversion of GGUF models for a very long time.

The problem is that implicit GGUF loading is also broken at the moment.