ggerganov / llama.cpp

LLM inference in C/C++
MIT License

Fix tensor groups for encoder-decoder models in gguf-dump.py #8090

Closed fairydreaming closed 3 months ago

fairydreaming commented 3 months ago

t5-small-dump.txt

This PR corrects the tensor groups for encoder-decoder models such as the T5 and FLAN-T5 families. A separate tensor group is created for each enc.blk.[bid] and dec.blk.[bid] block, and additional tensor groups are created for the remaining non-blk enc and dec tensors. Example:

## Tensors Overview ~61M Elements

Total number of elements in all tensors: 60506880 Elements

- [Decoder Block 0 Tensor Group - ~4M Elements](#dec_blk_0)
- [Decoder Block 1 Tensor Group - ~4M Elements](#dec_blk_1)
- [Decoder Block 2 Tensor Group - ~4M Elements](#dec_blk_2)
- [Decoder Block 3 Tensor Group - ~4M Elements](#dec_blk_3)
- [Decoder Block 4 Tensor Group - ~4M Elements](#dec_blk_4)
- [Decoder Block 5 Tensor Group - ~4M Elements](#dec_blk_5)
- [Decoder Tensor Group - 512 Elements](#dec)
- [Encoder Block 0 Tensor Group - ~3M Elements](#enc_blk_0)
- [Encoder Block 1 Tensor Group - ~3M Elements](#enc_blk_1)
- [Encoder Block 2 Tensor Group - ~3M Elements](#enc_blk_2)
- [Encoder Block 3 Tensor Group - ~3M Elements](#enc_blk_3)
- [Encoder Block 4 Tensor Group - ~3M Elements](#enc_blk_4)
- [Encoder Block 5 Tensor Group - ~3M Elements](#enc_blk_5)
- [Encoder Tensor Group - 512 Elements](#enc)
- [Base Tensor Group - ~16M Elements](#base)

The attached file contains the complete output of the command `python3 gguf-py/scripts/gguf-dump.py --markdown /mnt/md0/models/t5-small.gguf`.
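For readers who want to see the grouping rule in isolation, here is a minimal sketch of how tensor names could be mapped to the groups listed above. It is not the code from this PR: the helper names `tensor_group_key` and `group_tensors` are hypothetical, and the sample tensor names are illustrative assumptions rather than values taken from the attached dump.

```python
from collections import defaultdict


def tensor_group_key(name: str) -> str:
    """Return the group a tensor name falls into (hypothetical helper)."""
    parts = name.split(".")
    if parts[0] in ("enc", "dec"):
        # enc.blk.<bid>.* and dec.blk.<bid>.* get one group per block.
        if len(parts) > 2 and parts[1] == "blk":
            return ".".join(parts[:3])
        # Remaining non-blk enc/dec tensors share one group per side.
        return parts[0]
    # Decoder-only models use blk.<bid>.* without an enc/dec prefix.
    if parts[0] == "blk" and len(parts) > 1:
        return ".".join(parts[:2])
    # Everything else (for example embeddings) goes to the base group.
    return "base"


def group_tensors(names):
    """Bucket tensor names by group key, preserving input order."""
    groups = defaultdict(list)
    for name in names:
        groups[tensor_group_key(name)].append(name)
    return dict(groups)


# Illustrative names only; real names come from the GGUF file.
sample_names = [
    "token_embd.weight",
    "enc.blk.0.attn_q.weight",
    "enc.output_norm.weight",
    "dec.blk.0.attn_q.weight",
    "dec.blk.0.cross_attn_q.weight",
    "dec.output_norm.weight",
]
sample_groups = group_tensors(sample_names)
```

With these sample names, the two `dec.blk.0` tensors end up in the same group, while `enc.output_norm.weight` and `dec.output_norm.weight` land in the separate `enc` and `dec` groups, which mirrors the layout of the overview list.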