huggingface / huggingface.js

Utilities to use the Hugging Face Hub API
https://hf.co/docs/huggingface.js
MIT License

feat: simple heuristic for `isTensorrtEngine` #690

Open 0xSage opened 4 months ago

0xSage commented 4 months ago

Problem

isTensorrtModel Rules

  1. At least one file ending in .engine.
  2. (Optional) At least one file named config.json. Caveat: by design, model builders can rename this file, so it is not a reliable signal on its own.
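
The two rules above could be sketched roughly as follows. This is only an illustration: `checkTensorrtModel` and the `TensorrtCheck` shape are hypothetical names, not part of huggingface.js, and the function assumes the caller already has the list of file paths in the repo.

```typescript
// Hypothetical helper, not an existing huggingface.js API.
interface TensorrtCheck {
  /** Rule 1: at least one file ending in .engine. */
  isTensorrt: boolean;
  /** Rule 2 (optional): a file named config.json exists. */
  hasConfig: boolean;
}

function checkTensorrtModel(filePaths: string[]): TensorrtCheck {
  // Take the last path segment so nested files are matched by name.
  const baseName = (p: string): string => p.split("/").pop() ?? p;

  const isTensorrt = filePaths.some((p) => p.toLowerCase().endsWith(".engine"));
  // config.json can be renamed by model builders, so it is reported
  // separately and never decides the result on its own.
  const hasConfig = filePaths.some((p) => baseName(p) === "config.json");

  return { isTensorrt, hasConfig };
}
```

Keeping `hasConfig` as a separate field reflects the caveat in rule 2: a missing config.json does not disqualify a repo.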

Engine compatibility rules

For context, TensorRT models are specific to:

  1. GPU architecture, e.g. models compiled for Ada will only run on Ada.
  2. TRT-LLM release, e.g. models compiled with release v0.9.0 need to run on v0.9.0.
  3. OS (optional). As of v0.9.0, models are cross-OS compatible, but we're still testing this as it could be flaky.
  4. Number of GPUs, i.e. GPU topology. This can be detected by counting the engine files.
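
As a rough sketch of item 4, the GPU count could be inferred by counting engine files. `inferGpuCount` is a hypothetical name, and the one-engine-file-per-GPU mapping is an assumption taken from the description above rather than something verified against TRT-LLM.

```typescript
// Hypothetical helper: assumes each GPU in the topology has exactly
// one .engine file in the repo, as suggested in item 4 above.
function inferGpuCount(filePaths: string[]): number {
  return filePaths.filter((p) => p.toLowerCase().endsWith(".engine")).length;
}
```

A result of 0 would simply mean no engine files were found, i.e. the repo does not match the TensorRT heuristic at all.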

Unfortunately, afaik config.json and other metadata files do not record the hardware or build-time configuration once the models are built, so model authors will have to specify this info themselves.

^ We'll update this info as it changes, and as we learn more 😄 .

Naming

julien-c commented 4 months ago

Hi @0xSage! I suggest only detecting the .engine files for now; they seem to be the more popular format on the Hub right now (450 models contain a .engine file, vs. 0 model repos containing a .plans file).

We can auto-tag those repos with a tensorrt tag; I think that'd be the easiest!
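
For illustration, the auto-tagging suggestion might look something like the sketch below. `withTensorrtTag` is a hypothetical function, not how the Hub actually computes tags; it applies only the .engine rule and leaves repos without engine files unchanged.

```typescript
const TENSORRT_TAG = "tensorrt";

// Hypothetical sketch: returns a new tag list with "tensorrt" appended
// when the repo contains at least one .engine file.
function withTensorrtTag(existingTags: string[], filePaths: string[]): string[] {
  const hasEngine = filePaths.some((p) => p.toLowerCase().endsWith(".engine"));
  if (!hasEngine || existingTags.includes(TENSORRT_TAG)) {
    // Nothing to add: return a copy so the input array is never mutated.
    return [...existingTags];
  }
  return [...existingTags, TENSORRT_TAG];
}
```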