IBM / data-prep-kit

Open source project for data preparation of LLM application builders
https://ibm.github.io/data-prep-kit/
Apache License 2.0
316 stars, 135 forks

Update all transforms to use single package library with [extra] #735

Closed. touma-I closed this 4 weeks ago

touma-I commented 1 month ago

Why are these changes needed?

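Remove the pyproject.toml files for both the ray and python runtimes, and update the makefiles and dependencies so that each transform uses the single library package with the appropriate [extra].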
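
For readers unfamiliar with extras, here is a minimal sketch of how a requirement such as `example-lib[ray]` selects optional dependencies on top of a package's base dependencies. The package name `example-lib`, the extras `ray` and `spark`, and every dependency name in the tables are hypothetical placeholders, not the project's actual metadata, and the parser handles only the simple `name[extra1,extra2]` form.

```python
import re

# Hypothetical metadata for a single library package. These names are
# illustrative assumptions and are not taken from the repository.
BASE_DEPENDENCIES = ["shared-dep-a", "shared-dep-b"]
OPTIONAL_DEPENDENCIES = {
    "ray": ["ray-runtime-dep"],
    "spark": ["spark-runtime-dep"],
}

# Matches "name" or "name[extra1, extra2]" (no version specifiers).
_REQ = re.compile(r"^\s*([A-Za-z0-9._-]+)\s*(?:\[([^\]]*)\])?\s*$")


def parse_requirement(req):
    """Split 'name[extra1,extra2]' into (name, [extras])."""
    match = _REQ.match(req)
    if match is None:
        raise ValueError(f"not a simple requirement: {req!r}")
    name = match.group(1)
    extras = match.group(2) or ""
    return name, [e.strip() for e in extras.split(",") if e.strip()]


def resolve(req):
    """Return base dependencies plus those added by each requested extra."""
    _, extras = parse_requirement(req)
    deps = list(BASE_DEPENDENCIES)
    for extra in extras:
        if extra not in OPTIONAL_DEPENDENCIES:
            raise KeyError(f"unknown extra: {extra}")
        deps.extend(OPTIONAL_DEPENDENCIES[extra])
    return deps
```

Under these assumptions, a transform that needs only the base runtime would depend on `example-lib`, while one that needs the ray runtime would depend on `example-lib[ray]`, so a single package definition can replace separate per-runtime project files.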

Related issue number (if any).

revit13 commented 1 month ago

LGTM

revit13 commented 1 month ago

Minor comment: Please consider renaming WHEEL_FILE_NAME to DPK_WHEEL_FILE_NAME

touma-I commented 4 weeks ago

> Minor comment: Please consider renaming WHEEL_FILE_NAME to DPK_WHEEL_FILE_NAME

Thanks @revit13. Good catch! Done.

touma-I commented 4 weeks ago

@touma-I to create issues on

  1. USE_REPO_LIB_SRC support around wheel build/use
  2. HAP .gitignore a kfp
  3. Try to address caching of library wheels for transform images
  4. Address spark base image build (download) time.

Thanks @daw3rd. Greatly appreciate how you keep me out of trouble :-)