modelscope / data-juicer

A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Apache License 2.0

[Bug]: Failed building wheel for simhash-py #130

Closed ZengJin123 closed 9 months ago

ZengJin123 commented 11 months ago

Before Reporting

Search before reporting

OS

Ubuntu

Installation Method

git clone pip

Data-Juicer Version

Latest version

Python Version

3.8.13

Describe the bug

The following error appears when running pip install -e [.tools]:

image

To Reproduce

pip install -e [.tools]

Configs

No response

Logs

No response

Screenshots

No response

Additional

No response

zhijianma commented 11 months ago

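Hello, and thank you for using Data-Juicer. Your screenshot shows that your Python version is 3.11. In that version, the longintrepr.h header was moved into the cpython subdirectory, so the simhash-py build cannot find it. We recommend installing and switching to Python 3.8. We also plan to release a simhash package that does not depend on the Python version, so please stay tuned.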
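To confirm which interpreter pip is actually building against, a quick check along these lines may help. It is only a sketch: the function names are hypothetical, and the 3.11 cutoff is taken from the explanation above rather than from simhash-py's own documentation.

```python
import sys


def longintrepr_moved(version_info=sys.version_info):
    """Return True if this interpreter is 3.11 or newer, where
    longintrepr.h lives in the cpython/ include subdirectory."""
    return tuple(version_info[:2]) >= (3, 11)


def build_advice(version_info=sys.version_info):
    """Give a short hint about whether the simhash-py build is likely to fail.

    This is a heuristic based only on the Python version, not a real build test.
    """
    major, minor = version_info[0], version_info[1]
    if longintrepr_moved(version_info):
        return (f"Python {major}.{minor}: simhash-py will likely fail to build; "
                "switch to Python 3.8 or use simhash-pybind")
    return f"Python {major}.{minor}: not affected by the longintrepr.h move"


if __name__ == "__main__":
    print(build_advice())
```

Run it with the same interpreter that backs the pip command (for example python -m pip rather than a bare pip), since an environment mismatch like the one in this issue, 3.8.13 reported versus 3.11 in the screenshot, is easy to miss.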

github-actions[bot] commented 10 months ago

This issue is marked as stale because there has been no activity for 21 days. Remove stale label or add new comments or this issue will be closed in 3 day.

HYLcool commented 10 months ago

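@ZengJin123 Hello, we have now released a simhash library with better Python version compatibility, named simhash-pybind, and the simhash dependency on the latest branch has been switched to it. Please pull the latest code and try again, or uninstall the existing simhash-py library and install the new one manually: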

pip uninstall simhash-py
pip install simhash-pybind
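After running the two commands above, one way to verify the result is to ask the interpreter which distributions it can see. The following is a minimal sketch using the standard library's importlib.metadata (available from Python 3.8); the helper names and the status strings are made up for illustration.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def simhash_status(lookup=installed_version):
    """Summarize which simhash distribution is present.

    `lookup` is injectable so the logic can be exercised without
    touching the real environment.
    """
    if lookup("simhash-py") is not None:
        return "old"       # simhash-py is still installed; uninstall it
    if lookup("simhash-pybind") is not None:
        return "ok"        # only the replacement library is installed
    return "missing"       # neither library is installed


if __name__ == "__main__":
    print(simhash_status())
```

A result of "old" suggests the uninstall step did not take effect in the active environment, while "missing" suggests pip installed into a different interpreter than the one running the check.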

github-actions[bot] commented 9 months ago

This issue is marked as stale because there has been no activity for 21 days. Remove stale label or add new comments or this issue will be closed in 3 day.

github-actions[bot] commented 9 months ago

Close this stale issue.