geroldmeisinger opened 1 month ago
THUDM/cogvlm2-llama3-chat-19B-int4: size 12.5 GB, VRAM 14868 MiB
system: Linux, RTX 4060 Ti 16GB
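For reference, a minimal sketch for reading the same per-GPU figure that nvidia-smi reports, assuming the nvidia-ml-py package (import name `pynvml`) is installed:

```python
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # GPU 0
mem = pynvml.nvmlDeviceGetMemoryInfo(handle)   # values are in bytes
print(f"VRAM used: {mem.used / 1024**2:.0f} MiB / {mem.total / 1024**2:.0f} MiB")
pynvml.nvmlShutdown()
```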
VRAM usage differs depending on the system and the settings used.
different how? can we get a range, minimum requirements, or sensible categories? to me it would be important to know whether there is ANY chance to run the model, even if it is very slow.
Different OSes can change VRAM usage, different CUDA versions can change it, and different driver versions can as well. AMD cards will use a different amount of VRAM than NVIDIA ones, and the same goes for Intel GPUs.
If this were built to run on only one platform, one driver version, one CUDA version, etc., then it would be easy to track.
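One way to get a comparable number on your own setup is to track the allocator-level peak from inside the process. A minimal sketch, assuming PyTorch with CUDA available; note that nvidia-smi will typically report more, since it also counts the CUDA context itself:

```python
import torch

torch.cuda.reset_peak_memory_stats()

# stand-in workload; in practice, run the actual model inference here
x = torch.randn(4096, 4096, device="cuda")
y = x @ x

torch.cuda.synchronize()
print(f"peak allocated: {torch.cuda.max_memory_allocated() / 1024**2:.0f} MiB")
```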
which still leaves the question of the "minimum requirement"
thanks for the explanation!
hello and thank you for the great tool. it would be nice to know some (approximate) metadata about models before downloading, to avoid downloading huge files which in the end won't work (the downloads end up in ~/.cache/huggingface/hub).
system: Linux, RTX 3060 12GB
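The download size, at least, can be checked up front without fetching anything, by querying the Hub metadata. A minimal sketch, assuming the huggingface_hub package; the repo id is the one from this thread:

```python
from huggingface_hub import HfApi

info = HfApi().model_info("THUDM/cogvlm2-llama3-chat-19B-int4", files_metadata=True)
total = sum(f.size or 0 for f in info.siblings)  # bytes across all repo files
print(f"download size: {total / 1024**3:.1f} GiB")
```

This only covers the size on disk, not the VRAM needed at runtime, which is the harder part discussed above.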
btw I have to set
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
(I also document them here: https://github.com/jhc13/taggui/discussions/169)
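For reference, the same setting can be applied from Python rather than the shell; it has to be in place before the first CUDA allocation, so the safe pattern is to set it before importing torch:

```python
import os

# must be set before torch initializes its CUDA caching allocator
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import torch  # imported after setting the env var on purpose
```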