modelscope / data-juicer

A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Apache License 2.0

How to use ‘hf_model’ #457

Open · abchbx opened this issue 3 weeks ago

abchbx commented 3 weeks ago

Before Asking

Search before asking

Question

I tried to use 'hf_model', but it failed with a download error. How can I resolve this?

错误提示:OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like Qwen/Qwen2.5-1.5B-Instruct is not the path to a directory containing a file named config.json. Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.

Additional

No response

HYLcool commented 2 weeks ago

Hi @abchbx, thanks for your interest and for using Data-Juicer!

Since this model comes from the Hugging Face Hub, it needs to be downloaded from Hugging Face when the operator runs. Because the Hub is hosted overseas, you may need a proxy/VPN to reach it; please look up the relevant setup yourself.

Alternatively, you can find a Hugging Face mirror site that you can access, download the model from it manually, and then set hf_model to the local model directory.
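
For reference, a minimal sketch of the manual-download route, assuming the huggingface_hub Python package and the commonly used hf-mirror.com endpoint (substitute whichever mirror you can actually reach):

```python
# Sketch: download the model via a Hugging Face mirror, then point
# `hf_model` at the local copy. The mirror URL is an assumption;
# replace it with the one you actually use.
import os

# The endpoint must be set before importing huggingface_hub,
# because the library reads HF_ENDPOINT at import time.
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"

from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="Qwen/Qwen2.5-1.5B-Instruct",
    local_dir="/root/models/Qwen2.5-1.5B-Instruct",
)
print("Model downloaded to:", local_dir)
```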

abchbx commented 2 weeks ago

Thanks for the reply, but I configured the HF-mirror mirror site and it still doesn't work. If I download the model manually, how should I fill in the parameter?

HYLcool commented 2 weeks ago

> Thanks for the reply, but I configured the HF-mirror mirror site and it still doesn't work. If I download the model manually, how should I fill in the parameter?

For example, if you download the model to the local directory /root/models/Qwen2.5-1.5B-Instruct, you only need to replace the hf_model parameter in your config file with the path to that directory:

```yaml
- generate_instruction_mapper:
      hf_model: '/root/models/Qwen2.5-1.5B-Instruct'
      ...
```
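
To double-check that the manually downloaded copy is complete before running the Data-Juicer config, you can try loading it directly with transformers (a quick sketch; the path simply follows the example above):

```python
# Sketch: verify that the local model directory loads without any
# network access. Adjust the path to your own download location.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "/root/models/Qwen2.5-1.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir)
print(type(model).__name__)  # expected to be a Qwen2-style causal LM class
```

If this loads cleanly, the same path set as hf_model should work offline.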