IBM / data-prep-kit

Open source project for data preparation of LLM application builders
https://ibm.github.io/data-prep-kit/
Apache License 2.0

[Bug] unable to install release 0.2.1 on windows (native) #644

Open sujee opened 1 month ago

sujee commented 1 month ago

Component

Other

What happened + What you expected to happen

This issue is reported by a workshop user.

The error occurs when trying to install the following packages:

pip install data-prep-toolkit-transforms==0.2.1 data-prep-toolkit-transforms-ray==0.2.1

The install fails with the following error: https://gist.github.com/Alirezasedd/b9949fee563facb28aae34082517e151

Basically, the required deepsearch-glm release is not found:

ERROR: Could not find a version that satisfies the requirement deepsearch-glm==0.21.0 (from data-prep-toolkit-transforms) (from versions: 0.1.0, 0.2.1, 0.2.2, 0.2.3, 0.3.0)
ERROR: No matching distribution found for deepsearch-glm==0.21.0

This was reported on Windows (native install with Anaconda). I am able to install it successfully on Ubuntu 24.04.
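
One way to read the error above: the "from versions:" list shows the releases pip considered installable in that environment, and the pinned 0.21.0 is not among them. The following is a minimal shell sketch of that check. `version_listed` is a hypothetical helper written for illustration only; it is not part of pip or data-prep-kit.

```shell
# Hypothetical helper: succeeds if version $1 appears in the
# "(from versions: ...)" list of the pip error line passed as $2.
version_listed() {
  wanted="$1"
  line="$2"
  # Keep only the text between "from versions: " and the closing ")".
  list="${line##*from versions: }"
  list="${list%%)*}"
  # Match whole entries of the comma-separated list.
  case ", $list," in
    *", $wanted,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Error line copied from the report above.
err='ERROR: Could not find a version that satisfies the requirement deepsearch-glm==0.21.0 (from data-prep-toolkit-transforms) (from versions: 0.1.0, 0.2.1, 0.2.2, 0.2.3, 0.3.0)'

if version_listed 0.21.0 "$err"; then
  echo "0.21.0 is among the versions pip found"
else
  echo "0.21.0 is not among the versions pip found"
fi
```

For the error line in this report the check takes the second branch, which matches the "No matching distribution found" message.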

Reproduction script

pip install data-prep-toolkit-transforms==0.2.1 data-prep-toolkit-transforms-ray==0.2.1

Anything else

No response

OS

Other

Python

3.11.x

shivdeep-singh-ibm commented 1 month ago

You can follow these steps to use data-prep-kit on Windows.

dataprep toolkit transforms on Windows

We can run data prep transform recipes on Windows by using WSL.

  1. Install WSL on Windows

wsl --install Ubuntu-24.04

NOTE: The steps below can also be run on Linux (Red Hat/Fedora or Ubuntu).

  2. Install Miniconda for Python

Execute these commands in a WSL or Linux shell.

wget https://repo.anaconda.com/miniconda/Miniconda3-py310_24.7.1-0-Linux-x86_64.sh
chmod a+x Miniconda3-py310_24.7.1-0-Linux-x86_64.sh
./Miniconda3-py310_24.7.1-0-Linux-x86_64.sh

To activate conda's base environment in your current shell session:

# Activate conda for the current shell
# (assumes Miniconda was installed to the default location, ~/miniconda3)
eval "$("$HOME/miniconda3/bin/conda" shell.bash hook)"

  3. Create a Python 3.11 environment using Miniconda

conda create -n data-prep-kit-1 -y python=3.11

# activate the new conda environment
conda activate data-prep-kit-1
# make sure the environment has switched to data-prep-kit-1

# Check the Python version
python --version
# should print: Python 3.11.x

# Install JupyterLab
pip3 install jupyterlab

  4. Add gcc/g++ using Miniconda

# Install C/C++ toolchains
conda install gcc_linux-64
conda install gxx_linux-64

The setup should now be ready.

  5. Activate the conda environment

conda activate data-prep-kit-1
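
Before running the install, it can help to confirm that the activated environment really provides Python 3.11. The following is a minimal sketch of such a check. `is_python_311` is a hypothetical helper name used for illustration only, and the check assumes `python` on the PATH is the interpreter of the active conda environment.

```shell
# Hypothetical helper: succeeds if the given `python --version` output
# reports a 3.11.x interpreter.
is_python_311() {
  case "$1" in
    "Python 3.11."*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage inside the activated environment.
if command -v python >/dev/null 2>&1 && is_python_311 "$(python --version 2>&1)"; then
  echo "python version ok"
else
  echo "expected Python 3.11.x; is data-prep-kit-1 active?"
fi
```

Once the packages are installed, `pip show data-prep-toolkit-transforms` reports the installed version.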

Then you can try your commands:

pip install data-prep-toolkit-transforms==0.2.1 data-prep-toolkit-transforms-ray==0.2.1

dolfim-ibm commented 4 weeks ago

@sujee were you able to verify if this is now resolved?

sujee commented 4 weeks ago

@dolfim-ibm @shivdeep-singh-ibm

There was some confusion about this. I was asking about a native Windows install, not WSL2 on Windows.

I am hoping that with the new docling release integration, we can install on Windows natively (currently being tested).

dolfim-ibm commented 4 weeks ago

Yes, I was also speaking about native Windows support, which has been available since this PR was merged: https://github.com/IBM/data-prep-kit/pull/723.