openvinotoolkit / openvino_tokenizers

OpenVINO Tokenizers extension
Apache License 2.0

tokenizer for SimianLuo/LCM_Dreamshaper_v7 #38

Closed yangsu2022 closed 8 months ago

yangsu2022 commented 8 months ago

Hello, I failed to convert the LCM tokenizer. Running convert_tokenizer SimianLuo/LCM_Dreamshaper_v7 -o output_lcm fails with:

OSError: SimianLuo/LCM_Dreamshaper_v7 does not appear to have a file named config.json. Checkout 'https://huggingface.co/SimianLuo/LCM_Dreamshaper_v7/main' for available files.

However, the repository does contain a tokenizer_config.json file.

Could you take a look? Thanks

apaniukov commented 8 months ago

Hi,

The issue is that the tokenizer files are located in a tokenizer subfolder, not in the repository root. I created a PR that adds a --subfolder option to the CLI tool. In the meantime, you can use this Python code to get the tokenizer model:

from transformers import AutoTokenizer
from openvino import save_model
from openvino_tokenizers import convert_tokenizer

# Load the Hugging Face tokenizer from the repo's "tokenizer" subfolder.
hf_tokenizer = AutoTokenizer.from_pretrained("SimianLuo/LCM_Dreamshaper_v7", subfolder="tokenizer")
# Convert it to an OpenVINO model and save it as IR.
ov_tokenizer = convert_tokenizer(hf_tokenizer)
save_model(ov_tokenizer, "output_lcm/openvino_tokenizer.xml")
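
For other repositories with a similar layout, it may help to check which subfolder holds the tokenizer files before choosing the subfolder value. The sketch below is only an illustration: find_tokenizer_subfolders is a hypothetical helper, not part of openvino_tokenizers or transformers, and the file listing is made up for the example. In practice the listing could come from something like huggingface_hub.list_repo_files.

```python
from pathlib import PurePosixPath


def find_tokenizer_subfolders(repo_files, marker="tokenizer_config.json"):
    """Return the sorted subfolders ('' for the repo root) that contain `marker`.

    Hypothetical helper for illustration; not part of any library.
    """
    subfolders = set()
    for name in repo_files:
        path = PurePosixPath(name)
        if path.name == marker:
            parent = str(path.parent)
            # PurePosixPath reports the root of a relative path as ".".
            subfolders.add("" if parent == "." else parent)
    return sorted(subfolders)


# Illustrative listing only, not the actual contents of any repository.
example_files = [
    "model_index.json",
    "tokenizer/tokenizer_config.json",
    "tokenizer/vocab.json",
    "unet/config.json",
]

print(find_tokenizer_subfolders(example_files))
```

An empty string in the result would mean the tokenizer files sit in the repository root, in which case no subfolder argument should be needed.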