celll1 / tagmane


A CPU benchmark, and request for running VLM on GPU #1

Open · tofu-bar opened this issue 3 hours ago

tofu-bar commented 3 hours ago

I was trying to find a suitable batch size for CPU inference on my PC. I could run batch sizes 1-10, but I expect a GPU could handle much larger batches, and faster. What do you think?

I used the default model, which should be fairly heavy: wd-eval02-large-tagger-v3

Here are some benchmark results showing CPU utilization at batch sizes 1-4 and 10, plus idle. I couldn't run batch size 100. Relative runtimes (batch 1 = 100%) were roughly: 1: 100%, 2: 80%, 3: 60%, 4: 60%, 10: 180%.

[Six screenshots of CPU utilization, taken 2024-10-20: batch sizes 1, 2, 3, 4, 10, and idle]
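
For reference, a minimal Python sketch of this kind of batch-size benchmark using ONNX Runtime. This is not tagmane's actual code; the model path, the NHWC 448x448x3 float32 input shape (typical of WD v3 taggers), and the dynamic batch dimension are all assumptions:

```python
# Minimal batch-size benchmark sketch (not tagmane's actual code).
# Assumptions: the tagger is an ONNX model with a dynamic batch dimension
# and an NHWC float32 input of 448x448x3, as is typical for WD v3 taggers.
import time

import numpy as np
import onnxruntime as ort

MODEL_PATH = "model.onnx"  # hypothetical path to the tagger model

session = ort.InferenceSession(MODEL_PATH, providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

for batch in (1, 2, 3, 4, 10):
    x = np.random.rand(batch, 448, 448, 3).astype(np.float32)  # dummy images
    session.run(None, {input_name: x})  # warm-up, excluded from timing
    start = time.perf_counter()
    session.run(None, {input_name: x})
    elapsed = time.perf_counter() - start
    print(f"batch={batch}: {elapsed:.2f}s total, {elapsed / batch:.3f}s/image")
```

Comparing seconds per image across batch sizes makes it easy to spot the point where larger batches stop helping on CPU.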
celll1 commented 2 hours ago

I've included batch settings in anticipation of future inference via DirectML, but it hasn't been implemented yet. I'll work on it after the code is refactored.
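
For context, if inference goes through ONNX Runtime, enabling DirectML is typically a matter of requesting the DML execution provider when creating the session (via the onnxruntime-directml package on Windows). A hedged Python sketch; tagmane's actual integration may look different:

```python
# Hedged sketch: create an ONNX Runtime session that prefers DirectML (GPU)
# and falls back to CPU. Requires the onnxruntime-directml package on Windows.
import onnxruntime as ort

session = ort.InferenceSession(
    "model.onnx",  # hypothetical model path
    providers=["DmlExecutionProvider", "CPUExecutionProvider"],
)
# Shows which providers were actually registered, in priority order.
print(session.get_providers())
```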