huggingface / tokenizers

💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
https://huggingface.co/docs/tokenizers
Apache License 2.0
8.69k stars 747 forks source link

Rust tokenizer fails! #1398

Closed arunpatro closed 6 months ago

arunpatro commented 7 months ago

I installed tokenizers 0.15.0 and the this code fails:

use tokenizers::tokenizer::{Result, Tokenizer};

fn main() -> Result<()> {
        let tokenizer = Tokenizer::from_pretrained("bert-base-cased", None)?;

        let encoding = tokenizer.encode("Hey there!", false)?;
        println!("{:?}", encoding.get_tokens());
    Ok(())
}
  --> src/main.rs:31:36
   |
31 |         let tokenizer = Tokenizer::from_pretrained("bert-base-cased", None)?;
   |                                    ^^^^^^^^^^^^^^^
   |                                    |
   |                                    function or associated item not found in `Tokenizer`
   |                                    help: there is an associated function with a similar name: `from_file`

This is mentioned in the docs, how do I resolve this?

ArthurZucker commented 7 months ago

This function depends on #[cfg(feature = "http")], there were no changes related to this in the latest release (a dry project without enabling http also fails on 0.14.1. The following should be added to the Cargo.toml:

[dependencies]
tokenizers = { version = "0.15.0", features = ["http"] }

this works for me

github-actions[bot] commented 6 months ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.