code-kern-ai / refinery

The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
https://www.kern.ai
Apache License 2.0

Add further requirements to execution environments #166

Closed: FelixKirsch closed this issue 4 months ago

FelixKirsch commented 1 year ago

Is your feature request related to a problem? Please describe.

I want to import the following libraries in the execution environments:

langdetect==1.0.9
phonenumbers==8.13.0
python-levenshtein==0.20.8
textblob==0.17.1
textstat==0.7.3

Describe the solution you'd like

Add the requirements to the exec-env parent image.
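To check whether an execution environment already satisfies the requested pins, something along these lines could be run inside the container. This is only a sketch: the helper names `parse_pins` and `check_pins` are made up for illustration and are not part of refinery, and the pins are copied from the request above.

```python
from importlib import metadata

# Pins copied from the request above.
REQUESTED = """
langdetect==1.0.9
phonenumbers==8.13.0
python-levenshtein==0.20.8
textblob==0.17.1
textstat==0.7.3
"""


def parse_pins(text):
    """Turn 'name==version' lines into a {name: version} dict."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name.strip()] = version.strip()
    return pins


def check_pins(pins):
    """Map each package to 'ok', 'missing', or 'found <installed version>'."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if installed == wanted else f"found {installed}"
    return report


if __name__ == "__main__":
    for name, status in check_pins(parse_pins(REQUESTED)).items():
        print(f"{name}: {status}")
```

The check only reads installed package metadata, so it does not import the libraries themselves.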

jhoetter commented 1 year ago

langdetect==1.0.9
nltk==3.7
phonenumbers==8.13.0
python-levenshtein==0.20.8
textblob==0.17.1
textstat==0.7.3
translate==3.6.1
spacy==3.4.2
quantulum3==0.7.11
LeXmo==0.1.4

LeonardPuettmann commented 1 year ago

fuzzywuzzy==0.18.0
langdetect==1.0.9
nltk==3.7
phonenumbers==8.13.0
python-levenshtein==0.20.8
textblob==0.17.1
textstat==0.7.3
translate==3.6.1
spacy==3.4.2
quantulum3==0.7.11
LeXmo==0.1.4

JWittmeyer commented 1 year ago

The current requirements are included. For new ones, reopen the issue.

divyanshukatiyar commented 1 year ago

better_profanity==0.7.0
flashtext==2.7

LeonardPuettmann commented 1 year ago

stemming==1.0.1

JWittmeyer commented 1 year ago

nltk.downloader words stopwords wordnet omw-1.4 brown punkt
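The line above reads like the argument list for NLTK's command-line downloader (`python -m nltk.downloader ...`). A small sketch of how that command could be assembled and run from Python follows. The helper names are hypothetical, and how the exec-env image actually invokes the downloader is not shown in this thread.

```python
import subprocess
import sys

# Resource names copied from the comment above.
NLTK_RESOURCES = ["words", "stopwords", "wordnet", "omw-1.4", "brown", "punkt"]


def build_download_command(resources, python=sys.executable):
    """Return the argv list for `python -m nltk.downloader <resources...>`."""
    return [python, "-m", "nltk.downloader", *resources]


def download(resources):
    """Run the downloader; needs nltk installed and network access."""
    subprocess.run(build_download_command(resources), check=True)


if __name__ == "__main__":
    # Only print the command here; call download(NLTK_RESOURCES) to run it.
    print(" ".join(build_download_command(NLTK_RESOURCES, python="python")))
```

Building the argument list separately from running it makes the command easy to inspect before anything is downloaded.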

JWittmeyer commented 1 year ago

The current requirements are included in base image v1.7.0. For new ones, reopen the issue.

LeonardPuettmann commented 1 year ago

openai==0.25.0

LeonardPuettmann commented 1 year ago

bayesian-optimization==1.4.2
google-search-results==2.4.1
vaderSentiment==3.3.2

JWittmeyer commented 1 year ago

Included in image 1.8.0. Reopen with new requirements.

LeonardPuettmannKern commented 1 year ago

textacy==0.12.0

JWittmeyer commented 1 year ago

Included in image 1.8.1. Reopen with new requirements.

divyanshu404 commented 1 year ago

scikit-optimize==0.9.0

JWittmeyer commented 1 year ago

Included in exec env image 1.8.2. Reopen with new requirements.

JWittmeyer commented 1 year ago

holidays==0.21.13

JWittmeyer commented 1 year ago

sumy==0.11.0

LeonardPuettmannKern commented 1 year ago

Please update OpenAI from openai==0.25.0 to openai==0.27.7. Otherwise, GPT-3.5-Turbo and GPT-4 won't be usable in refinery. Thanks! :)
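A brick or heuristic that depends on the newer client could guard against the older pin with a version check like the one below. This is a sketch only: the function names are hypothetical, the minimum of 0.27.7 is simply the version requested above, and the parser assumes plain dotted release numbers.

```python
from importlib import metadata


def version_tuple(version):
    """Convert a plain dotted release such as '0.27.7' into (0, 27, 7)."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum="0.27.7"):
    """Compare numerically, so '0.27.10' counts as newer than '0.27.7'."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_openai_version():
    """Return the installed openai version string, or None if it is absent."""
    try:
        return metadata.version("openai")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_openai_version()
    if found is None:
        print("openai is not installed")
    else:
        print(found, "ok" if meets_minimum(found) else "too old")
```

Comparing integer tuples rather than raw strings avoids ordering mistakes once a version component reaches two digits.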

FelixKirschKern commented 1 year ago

Included in exec env image 1.11.0. Reopen with new requirements.

SvenjaKern commented 1 year ago

Please include knowledge==0.3

SvenjaKern commented 10 months ago

PyPDF2==3.0.1

FelixKirschKern commented 10 months ago

@SvenjaKern I added pypdf==3.15.5 instead. pypdf and PyPDF2 have the same maintainer, and he wants the community to switch to pypdf (https://stackoverflow.com/questions/63199763/maintained-alternatives-to-pypdf2, first answer). He states that the two should be similar. PyPDF2 also raised a Dependabot alert, which suggested switching to pypdf as the fix.

Please check if the new brick also works with pypdf. If that's the case, you can simply close the issue. Otherwise, I will add pypdf2 instead.
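One way to make a brick tolerant of either package is an import fallback like the sketch below. The helper name `load_pdf_reader` is hypothetical. It relies on both pypdf and PyPDF2 3.x exposing a `PdfReader` class, and it has not been tested against the brick in question.

```python
import importlib


def load_pdf_reader(candidates=("pypdf", "PyPDF2")):
    """Return the first available PdfReader class, or None if none is installed."""
    for module_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        reader = getattr(module, "PdfReader", None)
        if reader is not None:
            return reader
    return None


if __name__ == "__main__":
    PdfReader = load_pdf_reader()
    if PdfReader is None:
        print("neither pypdf nor PyPDF2 is installed")
    else:
        print("using PdfReader from", PdfReader.__module__)
```

Listing pypdf first means the fallback to PyPDF2 is only used in environments where pypdf is missing.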

LeonardPuettmannKern commented 9 months ago

@FelixKirschKern Please add tiktoken==0.4.0 and remove pypdf==3.15.5 if it was added, as it's not needed. Thanks!

JWittmeyer commented 9 months ago

Also increase psycopg2-binary to psycopg2-binary==2.9.7 and urllib3 to 1.26.17 where possible.

FelixKirschKern commented 9 months ago

Added tiktoken==0.4.0, removed pypdf==3.15.5, and updated urllib3.

JWittmeyer commented 8 months ago

Increase psycopg2-binary version to psycopg2-binary==2.9.7 (or maybe even higher).

Add spacy ja support to exec env (add dockerfile run line for: RUN pip3 install spacy[ja]). Also for gateway, but probably not base image.
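After the image is rebuilt, a quick check along these lines could confirm that Japanese support is usable. It is a sketch under one assumption: that the `ja` extra of spaCy provides the `sudachipy` and `sudachidict_core` modules. The helper names are hypothetical.

```python
import importlib.util

# Assumption: the `ja` extra pulls in these two importable modules.
JA_MODULES = ("sudachipy", "sudachidict_core")


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def japanese_ready():
    """True when spaCy and the assumed Japanese dependencies are all present."""
    return not missing_modules(("spacy", *JA_MODULES))


if __name__ == "__main__":
    missing = missing_modules(("spacy", *JA_MODULES))
    if missing:
        print("missing:", ", ".join(missing))
    else:
        import spacy

        # A blank pipeline only needs the tokenizer, not a trained model.
        nlp = spacy.blank("ja")
        print([token.text for token in nlp("これはテストです")])
```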

FelixKirschKern commented 4 months ago

Updated psycopg2-binary and added language support for Japanese.