IBM / data-prep-kit

Open source project for data preparation for LLM application builders
https://ibm.github.io/data-prep-kit/
Apache License 2.0
307 stars, 134 forks

add dpk_connector to dpk #637

Closed: hmtbr closed this 1 month ago

hmtbr commented 1 month ago

Why are these changes needed?

We want DPK to cover the full life cycle of data acquisition and data processing. This PR adds the core library for acquiring data from websites.

Related issue number (if any).

#633

touma-I commented 1 month ago

Can this be scaled up by implementing it as a transform?

No. The plan is to release this as an open source package from which we can create additional transforms. @daw3rd, please remove your request for changes. Thanks.

hmtbr commented 1 month ago

@daw3rd Thank you for reviewing this. I added overview.md to describe example usage. Could you take a look at the documentation? Let me know if it needs further updates.

hmtbr commented 1 month ago

In order to make the repo feel a bit more cohesive, should we rename this top-level directory to data-connector-lib to align with data-processing-lib?

Renamed the top-level folder to data-connector-lib.

hmtbr commented 1 month ago

Thank you, @daw3rd!