rustformers / llm

[Unmaintained, see README] An ecosystem of Rust libraries for working with large language models
https://docs.rs/llm/latest/llm/
Apache License 2.0

Cannot Compile llm-base on main #405

Closed (carllippert closed 10 months ago)

carllippert commented 10 months ago

I'm trying to run the base example on macOS, but I cannot build against the git main branch in Cargo.

 Compiling llm-base v0.2.0-dev (https://github.com/rustformers/llm?rev=39eb341aeda6a3ff0240421e54df2707ae8743fc#39eb341a)
error[E0308]: mismatched types
   --> /Users/ccaarrll/.cargo/git/checkouts/llm-d8a8bbe144aa0546/39eb341/crates/llm-base/src/tokenizer/huggingface.rs:25:21
    |
25  |             .decode(vec![idx as u32], true)
    |              ------ ^^^^^^^^^^^^^^^^ expected `&[u32]`, found `Vec<u32>`
    |              |
    |              arguments to this method are incorrect
    |
    = note: expected reference `&[u32]`
                  found struct `Vec<u32>`
note: method defined here
   --> /Users/ccaarrll/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.13.4/src/tokenizer/mod.rs:814:12
    |
814 |     pub fn decode(&self, ids: &[u32], skip_special_tokens: bool) -> Result<String> {
    |            ^^^^^^

error[E0308]: mismatched types
   --> /Users/ccaarrll/.cargo/git/checkouts/llm-d8a8bbe144aa0546/39eb341/crates/llm-base/src/tokenizer/huggingface.rs:70:21
    |
70  |             .decode(tokens, skip_special_tokens)
    |              ------ ^^^^^^ expected `&[u32]`, found `Vec<u32>`
    |              |
    |              arguments to this method are incorrect
    |
    = note: expected reference `&[u32]`
                  found struct `Vec<u32>`
note: method defined here
   --> /Users/ccaarrll/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.13.4/src/tokenizer/mod.rs:814:12
    |
814 |     pub fn decode(&self, ids: &[u32], skip_special_tokens: bool) -> Result<String> {
    |            ^^^^^^
help: consider borrowing here
    |
70  |             .decode(&tokens, skip_special_tokens)
    |                     +

For more information about this error, try `rustc --explain E0308`.
error: could not compile `llm-base` (lib) due to 2 previous errors
warning: build failed, waiting for other jobs to finish...

The project builds and runs with this Cargo setup:

llm = "0.1.1"

It fails on every attempt to depend on "main".

I tried multiple variations, including pinning older revisions of main and including or excluding "features".

clarkmcc commented 10 months ago

Can you try putting this in your Cargo.toml file instead of llm = "0.1.1"? The tag you're using is pretty old, and this project seems to move quickly, so you might try working off the main branch; that's what I'm doing. Plus, the main branch has the metal feature, so you get GPU acceleration on macOS basically for free.

I can't tell from your message if you've tried this particular import already so sorry if you already have. I've confirmed that I'm able to build with this right now.

llm = { git = "https://github.com/rustformers/llm", branch = "main", features = ["metal"] }

Edit: scratch that. I ran cargo clean and I was able to reproduce this issue. Sounds like it snuck into the latest main branch.

carllippert commented 10 months ago

I have tried about 5 revs from the last few weeks to see if I could get it to work.

clarkmcc commented 10 months ago

It looks like tokenizers-0.13.4 changes the signature of this decode function. The weird part is that llm-base depends on tokenizers-0.13.3, which has the signature the code expects.
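
A minimal sketch of the mismatch described above, using a hypothetical stand-in decode function rather than the real tokenizers API. It assumes only what the compiler output shows: the parameter is ids: &[u32], so call sites that pass an owned Vec<u32> need a borrow. The "special token" rule in the body is invented for illustration.

```rust
// Hypothetical stand-in for a decoder that takes a slice, mirroring the
// `ids: &[u32]` parameter shape shown in the compiler output.
// Treating id 0 as a "special" token is an assumption made for this sketch.
fn decode(ids: &[u32], skip_special_tokens: bool) -> String {
    ids.iter()
        .filter(|&&id| !(skip_special_tokens && id == 0))
        .map(|id| id.to_string())
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    let tokens: Vec<u32> = vec![0, 7, 42];
    let idx: usize = 5;

    // `decode(tokens, true)` would fail with E0308:
    // expected `&[u32]`, found `Vec<u32>`.
    // Borrowing the vector lets `&Vec<u32>` coerce to `&[u32]`.
    let all = decode(&tokens, true);

    // For a single id, a borrowed array avoids allocating a Vec at all.
    let one = decode(&[idx as u32], false);

    println!("{all}"); // prints "7 42"
    println!("{one}"); // prints "5"
}
```

The same two changes (borrowing the vector, or passing a borrowed array for a single id) are what the compiler's "consider borrowing here" hint points at for the two call sites in huggingface.rs.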

Just for kicks and giggles, try adding this to your Cargo.toml and see if the build works:

[dependencies]
tokenizers = "=0.13.3"
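
A sketch of why an exact requirement could matter here, assuming llm-base declares its dependency as a plain "0.13.3" (not confirmed in this thread). Cargo treats a bare version as a caret requirement, so for a 0.x release it accepts any later patch of the same minor version, while "=0.13.3" accepts only that one release. The helper functions below are hypothetical illustrations of that rule, not Cargo's resolver, and they ignore pre-release tags.

```rust
/// A (major, minor, patch) version triple.
type Version = (u64, u64, u64);

/// Parse a string such as "0.13.4" into (0, 13, 4).
/// Returns None for anything that is not exactly three numeric parts.
fn parse(v: &str) -> Option<Version> {
    let mut parts = v.split('.').map(|p| p.parse::<u64>().ok());
    let major = parts.next()??;
    let minor = parts.next()??;
    let patch = parts.next()??;
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// Caret rule: the candidate must be at least `req` and must not change
/// the left-most non-zero component of `req`.
fn caret_matches(req: Version, candidate: Version) -> bool {
    if candidate < req {
        return false;
    }
    match req {
        (0, 0, _) => candidate == req,
        (0, minor, _) => candidate.0 == 0 && candidate.1 == minor,
        (major, _, _) => candidate.0 == major,
    }
}

/// Exact rule ("=x.y.z"): only the identical version matches.
fn exact_matches(req: Version, candidate: Version) -> bool {
    candidate == req
}

fn main() {
    let req = parse("0.13.3").expect("valid version");
    let newer = parse("0.13.4").expect("valid version");

    println!("caret accepts 0.13.4: {}", caret_matches(req, newer)); // true
    println!("exact accepts 0.13.4: {}", exact_matches(req, newer)); // false
}
```

Under that assumption, a fresh resolution (for example after the lock file is regenerated) is free to pick 0.13.4, which would be consistent with the build only failing after a clean rebuild.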

If that doesn't work, open the Cargo.lock file, find the version of the tokenizers crate, and let me know what it is.

[[package]]
name = "tokenizers"
version = "0.13.3" # <- this guy

clarkmcc commented 10 months ago

Okay, I opened a PR to fix the issue. In the meantime, you can add this to your Cargo.toml:

llm = { git = "https://github.com/clarkmcc/llm", branch = "upgrade-tokenizers-crate", features = ["metal"] }