IBM/data-prep-kit
Open source project for data preparation of LLM application builders
https://ibm.github.io/data-prep-kit/
Apache License 2.0
307 stars, 134 forks
issues
bump connector version
#769
hmtbr
closed
2 weeks ago
0
fdedup (fuzzy dedup) is not available to install with new install method
#768
santoshborse
opened
2 weeks ago
1
Problem with converting pdf files in the intro example when using release 0.2.2.dev2
#767
shahrokhDaijavad
closed
1 week ago
5
Bump streamlit from 1.36.0 to 1.37.0 in /transforms/code/code_profiler/python
#766
dependabot[bot]
closed
2 weeks ago
0
Create new dev2 pre-releases for both transforms and library with latest from docling 2.0
#765
touma-I
closed
2 weeks ago
1
[Bug] CICD Workflow test-image fails on repo_level_order
#764
touma-I
opened
3 weeks ago
1
Update RAG and Intro examples to use release 0.2.2.dev2 (after the pypi release)
#763
shahrokhDaijavad
opened
3 weeks ago
1
Create new dev2 releases for both transforms and library
#762
touma-I
closed
3 weeks ago
0
[Feature] Create new pre-release 0.2.2.dev2 with recent code for the transformers
#761
touma-I
closed
1 week ago
4
[Bug] error installing 'data-prep-toolkit-transforms[ray]==0.2.2.dev1' using testpypi
#759
sujee
closed
2 weeks ago
2
fix multilock with default parameters
#757
dolfim-ibm
closed
3 weeks ago
0
Update pdf2parquet to Docling v2
#756
dolfim-ibm
closed
3 weeks ago
0
Update resources.md
#755
shahrokhDaijavad
closed
3 weeks ago
0
Template for single transform notebook examples
#754
shahrokhDaijavad
opened
3 weeks ago
25
Uniform documentation and example Notebooks for all transforms!
#753
shahrokhDaijavad
opened
3 weeks ago
15
Update README file by adding links to the RAG and fine-tuning examples.
#752
shahrokhDaijavad
closed
2 days ago
1
Build a new transform to automate crawling and then convert to parquet
#751
shahrokhDaijavad
closed
2 days ago
8
Add the IBM Developer blog and Discord channel link to the Resources.md page
#750
shahrokhDaijavad
closed
3 weeks ago
0
Fixing code sample-notebook
#749
santoshborse
closed
2 weeks ago
0
Update sample-notebook.ipynb
#748
santoshborse
closed
3 weeks ago
0
Increase recursion limit and add error handling for deep recursion of…
#747
pankajskku
closed
1 week ago
1
Added exception handling to code quality transform
#746
Param-S
opened
3 weeks ago
0
updating RAG example to use IBM granite model
#745
sujee
closed
3 weeks ago
4
fixed URLs and fixed ray download error
#744
sujee
closed
3 weeks ago
2
Fix Dockerfile users for dpk, ray and spark
#743
touma-I
closed
3 weeks ago
0
Investigate issue with hap failure
#742
touma-I
closed
4 weeks ago
1
Add a notebook demonstrating the use of DPK connector for RAG
#740
Qiragg
opened
4 weeks ago
4
[Feature] Example notebook for DPK connector
#739
Qiragg
closed
5 days ago
2
allow the user to customize crawler settings
#738
hmtbr
closed
3 weeks ago
6
[Feature] [Connector] Customization of the crawler settings
#737
hmtbr
closed
3 weeks ago
0
[Bug] Add data-connector-lib to the make directory
#736
touma-I
opened
4 weeks ago
0
Update all transforms to use single package library with [extra]
#735
touma-I
closed
3 weeks ago
4
[Bug] HAP kfp test failing
#734
touma-I
opened
4 weeks ago
3
Update release number following release cutoff for http connector
#733
touma-I
closed
4 weeks ago
0
New and first release cut for connector library
#732
touma-I
closed
1 month ago
0
[Bug] Remove/fix individual pyproject.toml in data-processing-lib/python and data-process-lib/ray
#731
touma-I
opened
1 month ago
0
Update README.md of the intro example for the typo
#730
shahrokhDaijavad
closed
1 month ago
0
[Feature] [Connector] Support path focus with domain/subdomain focus
#729
hmtbr
opened
1 month ago
0
docs: update README.md
#728
eltociear
closed
1 month ago
0
fix link to pdf2parquet readme.md
#727
touma-I
closed
1 month ago
0
Multiple fixes for semantic order transform
#726
shivdeep-singh-ibm
closed
1 month ago
0
implement subdomain focus feature in data-prep-connector
#725
hmtbr
closed
1 month ago
9
[Feature] [Connector] Apply subdomain focus based on the seed url
#724
hmtbr
closed
1 month ago
2
Update Docling to 1.20.0
#723
dolfim-ibm
closed
1 month ago
3
[Bug] one of the created Ray actors die during docid transform
#722
sujee
opened
1 month ago
1
Fix metadata logging even when actors crash
#721
shivdeep-singh-ibm
closed
3 weeks ago
0
Fix 'IndexError: list index out of range' in header_cleanser
#720
takuyagt
closed
1 month ago
0
[Bug] docid ray transformation errors when running on colab (release 0.2.2dev1)
#719
sujee
opened
1 month ago
5
Intro example 1
#718
sujee
closed
1 month ago
5
Update README.md by adding html2parquet KFP and Code_profiler transform to the table
#717
shahrokhDaijavad
opened
1 month ago
0