elixir-nx / bumblebee

Pre-trained Neural Network models in Axon (+ 🤗 Models integration)
Apache License 2.0

Give users a way to tell if Bumblebee supports a model #248

Closed a-alhusaini closed 1 year ago

a-alhusaini commented 1 year ago

It is hard for beginners to tell which models Bumblebee supports.

I propose a feature that lets you paste the URL of a Hugging Face model and get a response telling you whether Bumblebee supports that model or not.

Here is an example of where this could be useful.

Looking at ./lib/bumblebee.ex, I see that LlamaForCausalLM is a supported model architecture. But does that mean I can run codellama?

I can't tell unless I try, and trying isn't something most people would be willing to do. Having a script that tells you whether a model is supported would be a huge help for beginners.
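For context, the manual check today amounts to attempting the load and seeing whether it succeeds. A minimal sketch, assuming the `codellama/CodeLlama-7b-hf` repository id (the repo name is an illustration, not taken from this thread):

```elixir
# Sketch: attempt to load the model from the Hugging Face Hub and report
# the outcome. Requires the :bumblebee dependency and network access.
case Bumblebee.load_model({:hf, "codellama/CodeLlama-7b-hf"}) do
  {:ok, %{model: _model, params: _params}} ->
    IO.puts("Bumblebee can load this model")

  {:error, reason} ->
    IO.puts("Not supported: #{inspect(reason)}")
end
```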

josevalim commented 1 year ago

This is a good idea, although I am not sure how feasible it is. If we can do this, there is also the question of whether we can automate more of the process.

jonatanklosko commented 1 year ago

I think the first step is more documentation around what files are usually in a hf/transformers repository, how to check a particular model's architecture, and notes on tokenizers. I will work on this :)

Technically we could have a script, but there are enough edge cases that I'm not sure it would be particularly useful. Just to name a few:

jonatanklosko commented 1 year ago

See #257.

> I can't tell unless I try. Which isn't something most people would be willing to do.

This is exactly it: the tool will ask the user for a repo and "try" for them by calling Bumblebee.load_whatever. This way we don't have any separate logic inspecting repository contents; we just rely on the loading itself and show the same error messages.
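The "try it for the user" approach could be sketched like this, assuming a placeholder repo id and using only the standard loaders (the exact set of loaders the tool calls is an assumption):

```elixir
# Sketch: attempt each loader against the repo and surface the errors
# verbatim, with no separate repository-inspection logic.
repo = {:hf, "some-org/some-model"}

loaders = [
  model: fn -> Bumblebee.load_model(repo) end,
  tokenizer: fn -> Bumblebee.load_tokenizer(repo) end
]

for {name, load} <- loaders do
  case load.() do
    {:ok, _info} -> IO.puts("#{name}: ok")
    {:error, message} -> IO.puts("#{name}: #{inspect(message)}")
  end
end
```

The design benefit, as noted above, is that the tool and a regular `Bumblebee.load_*` call can never disagree, because they are the same code path.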

jonatanklosko commented 1 year ago

And with regard to the edge cases I listed above:

> many repositories only have the "slow" tokenizer files, but in most cases it's possible to find a different repository with the corresponding "fast" tokenizer file (but we can't really tell)

The new error messages detect this and inform the user accordingly.
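In that situation the usual workaround is to load the tokenizer from a different repository than the model. A hedged sketch with hypothetical repo names (both ids are placeholders, not real repositories):

```elixir
# Sketch: the model repo lacks tokenizer.json (the "fast" tokenizer file),
# so we fetch the tokenizer from another repo that ships it.
{:ok, model_info} =
  Bumblebee.load_model({:hf, "some-org/model-without-fast-tokenizer"})

{:ok, tokenizer} =
  Bumblebee.load_tokenizer({:hf, "another-org/repo-with-tokenizer-json"})
```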

> Stable Diffusion is a couple models in individual directories, so the repo structure is different

The script accepts an optional subdirectory, so it's less of a concern.
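For example, each Stable Diffusion component lives in its own subdirectory, which Bumblebee's repository spec can point at via the `:subdir` option. A sketch, assuming the `CompVis/stable-diffusion-v1-4` repository (the repo id and params filename are assumptions for illustration):

```elixir
# Sketch: load just the UNet component from its subdirectory of a
# Stable Diffusion repository.
{:ok, unet} =
  Bumblebee.load_model(
    {:hf, "CompVis/stable-diffusion-v1-4", subdir: "unet"},
    params_filename: "diffusion_pytorch_model.bin"
  )
```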

> some repositories (in particular Llama) require authentication to access the files, so we couldn't determine support either

This will fail, but the error will mention authentication and gated access, which is enough information to start with. If they are dedicated enough to request access, they can then check whether the model is supported in some other way, as now outlined in the README.
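Once access is granted, a Hugging Face access token can be passed through the repository spec's `:auth_token` option. A sketch (the repo id is an example of a gated repo, and the token value is a placeholder):

```elixir
# Sketch: load a gated model by supplying a Hugging Face access token.
# Replace "hf_..." with a real token from your Hugging Face account.
{:ok, model_info} =
  Bumblebee.load_model(
    {:hf, "meta-llama/Llama-2-7b-hf", auth_token: "hf_..."}
  )
```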