A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Before Asking
[X] I have read the README carefully.
[X] I have pulled the latest code of main branch to run again and the problem still existed.
Search before asking
[X] I have searched the Data-Juicer issues and found no similar questions.
Question
`pip install py-data-juicer` fails to install.
` Cython.Compiler.Errors.CompileError: simhash/simhash.pyx
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for simhash-py
Running setup.py clean for simhash-py
Failed to build kenlm simhash-py
ERROR: Could not build wheels for kenlm, simhash-py, which is required to install pyproject.toml-based projects`
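I am on Windows 11 and have already installed gcc, cmake, and the Visual Studio 2022 MSVC140/143 build tools. I am not sure whether the kenlm and simhash-py build errors are caused by having the wrong gcc or cmake versions.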
gcc (x86_64-win32-seh-rev3, Built by MinGW-W64 project) 12.1.0
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
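To make the environment details above easier to reproduce, here is a minimal sketch (not part of Data-Juicer) that prints the platform and where each build tool resolves on PATH. The function name `describe_toolchain` and the default tool list are illustrative assumptions; `cl` is the MSVC compiler executable, which is normally only on PATH inside a Visual Studio developer prompt.

```python
import platform
import shutil
import sys


def describe_toolchain(tools=("gcc", "cmake", "cl")):
    """Return basic platform info plus the resolved path of each tool.

    A tool that is not found on PATH maps to None.
    The tool names are assumptions chosen for this report.
    """
    info = {
        "os": platform.system(),
        "os_release": platform.release(),
        "python": sys.version.split()[0],
    }
    for tool in tools:
        # shutil.which returns the full path of the executable, or None
        info[tool] = shutil.which(tool)
    return info


if __name__ == "__main__":
    for key, value in describe_toolchain().items():
        print(f"{key}: {value}")
```

Running it in the same shell used for `pip install` shows whether the compilers are visible to the build at all; it does not by itself show which compiler the kenlm or simhash-py builds actually pick.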
The complete error output is in the full error log.
Additional