IBM / data-prep-kit

Open source project for data preparation of LLM application builders
https://ibm.github.io/data-prep-kit/
Apache License 2.0
288 stars, 128 forks

fdedup (fuzzy dedup) is not available to install with new install method #768

Open santoshborse opened 1 week ago

santoshborse commented 1 week ago

Search before asking

Component

Transforms/universal/fdedup

What happened + What you expected to happen

!pip install data-prep-toolkit-transforms[fdedup]==0.2.2.dev2 reports that the fdedup extra does not exist, and !pip install data-prep-toolkit-transforms[all]==0.2.2.dev2 does not install fdedup either.

fdedup is missing because it is not listed under all here: https://github.com/IBM/data-prep-kit/blob/dev/transforms/pyproject.toml#L25
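One way to confirm which extras an installed build actually declares is to read the package metadata. The snippet below is a minimal sketch, not part of the project: the helper name declared_extras is hypothetical, and it assumes the data-prep-toolkit-transforms distribution is already installed in the current environment. It uses only the standard library importlib.metadata module.

```python
from importlib.metadata import PackageNotFoundError, metadata


def declared_extras(dist_name: str) -> list[str]:
    """Return the extras an installed distribution declares.

    Hypothetical helper for illustration; returns an empty list when the
    distribution is not installed in the current environment.
    """
    try:
        meta = metadata(dist_name)
    except PackageNotFoundError:
        return []
    # Each declared extra appears as one "Provides-Extra" metadata field.
    return sorted(meta.get_all("Provides-Extra") or [])


if __name__ == "__main__":
    extras = declared_extras("data-prep-toolkit-transforms")
    print("declared extras:", extras)
    print("fdedup declared:", "fdedup" in extras)
```

If fdedup does not appear in the printed list, the extra is not declared in the published package metadata, which would be consistent with the pyproject.toml observation above. Extra names may be normalized by the packaging tools, so the comparison may need adjusting.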

Reproduction script

!pip install data-prep-toolkit-transforms[fdedup]==0.2.2.dev2

Anything else

No response

OS

Ubuntu

Python

3.10.x

Are you willing to submit a PR?

touma-I commented 1 week ago

@santoshborse, can you submit a PR to fix this? I saw that you did not check the box above, so I just want to make sure you will do it. Thanks.