neomatrix369 / nlp_profiler

A simple NLP library allows profiling datasets with one or more text columns. When given a dataset and a column name containing text data, NLP Profiler will return either high-level insights or low-level/granular statistical information about the text in that column.
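For context, a minimal usage sketch (assuming the apply_text_profiling entry point described in the project README; the DataFrame and column name below are made up for illustration):

import pandas as pd
from nlp_profiler.core import apply_text_profiling  # entry point as documented in the README

df = pd.DataFrame({'text': ['This is a short sentence.',
                            'Another, slightly longer, example sentence!']})
profiled_df = apply_text_profiling(df, 'text')   # returns the original column plus granular/high-level stats columns
print(profiled_df.columns)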

Can't install from pip (from conda environment) #57

Closed tyoc213 closed 3 years ago

tyoc213 commented 3 years ago

pip install nlp_profiler shows this:

$ pip install nlp_profiler
Collecting nlp_profiler
  Using cached nlp_profiler-0.0.2-py2.py3-none-any.whl (39 kB)
Requirement already satisfied: nltk>=3.5 in /home/tyoc213/miniconda3/envs/fastai/lib/python3.8/site-packages (from nlp_profiler) (3.5)
Requirement already satisfied: tqdm>=4.46.0 in /home/tyoc213/miniconda3/envs/fastai/lib/python3.8/site-packages (from nlp_profiler) (4.48.2)
Requirement already satisfied: requests>=2.23.0 in /home/tyoc213/miniconda3/envs/fastai/lib/python3.8/site-packages (from nlp_profiler) (2.24.0)
Requirement already satisfied: ipython>=7.12.0 in /home/tyoc213/miniconda3/envs/fastai/lib/python3.8/site-packages (from nlp_profiler) (7.18.1)
Collecting language-tool-python>=2.3.1
  Using cached language_tool_python-2.4.7-py3-none-any.whl (30 kB)
Requirement already satisfied: pandas in /home/tyoc213/miniconda3/envs/fastai/lib/python3.8/site-packages (from nlp_profiler) (1.1.1)
Collecting swifter>=1.0.3
  Using cached swifter-1.0.7.tar.gz (633 kB)
Collecting textblob>=0.15.3
  Using cached textblob-0.15.3-py2.py3-none-any.whl (636 kB)
ERROR: Could not find a version that satisfies the requirement en-core-web-sm (from nlp_profiler) (from versions: none)
ERROR: No matching distribution found for en-core-web-sm (from nlp_profiler)

To Reproduce: there is no dataframe to share because the package can't be installed.

Version information:

From conda:

$ conda list --export 
# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: linux-64
_libgcc_mutex=0.1=main
absl-py=0.11.0=pypi_0
adal=1.2.4=pypi_0
alabaster=0.7.12=pypi_0
appdirs=1.4.4=pypi_0
argon2-cffi=20.1.0=pypi_0
astroid=2.4.2=py38_0
asttokens=2.0.4=pypi_0
attrs=20.1.0=pypi_0
audioread=2.1.8=pypi_0
azure-cognitiveservices-search-imagesearch=2.0.0=pypi_0
azure-common=1.1.25=pypi_0
babel=2.8.0=pypi_0
backcall=0.2.0=pypi_0
birdseye=0.8.4=pypi_0
black=20.8b1=pypi_0
blas=1.0=mkl
bleach=3.1.5=pypi_0
blis=0.4.1=pypi_0
ca-certificates=2020.10.14=0
cached-property=1.5.2=pypi_0
catalogue=1.0.0=pypi_0
certifi=2020.6.20=pyhd3eb1b0_3
cffi=1.14.2=pypi_0
cfgv=3.2.0=pypi_0
chardet=3.0.4=pypi_0
cheap-repr=0.4.4=pypi_0
click=7.1.2=pypi_0
click-plugins=1.1.1=pypi_0
cligj=0.7.1=pypi_0
colorednoise=1.1.1=pypi_0
commonmark=0.9.1=pypi_0
configparser=5.0.1=pypi_0
coverage=5.2.1=pypi_0
cryptography=3.1=pypi_0
cudatoolkit=11.0.221=h6bb024c_0
cycler=0.10.0=pypi_0
cymem=2.0.3=pypi_0
dataclasses=0.6=pypi_0
decorator=4.4.2=pypi_0
defusedxml=0.6.0=pypi_0
dill=0.3.3=pypi_0
distlib=0.3.1=pypi_0
docker-pycreds=0.4.0=pypi_0
docutils=0.16=pypi_0
einops=0.3.0=pypi_0
entrypoints=0.3=pypi_0
executing=0.5.3=pypi_0
fastai=2.1.8=dev_0
fastai-xla-extensions=0.0.1=dev_0
fastbook=0.0.11=pypi_0
fastcore=1.3.11=dev_0
fastprogress=1.0.0=pypi_0
fastscript=1.0.0=pypi_0
filelock=3.0.12=pypi_0
fiona=1.8.18=pypi_0
flask=1.1.2=pypi_0
flask-humanize=0.3.0=pypi_0
freetype=2.10.2=h5ab3b9f_0
future=0.18.2=pypi_0
geopandas=0.8.1=pypi_0
gitdb=4.0.5=pypi_0
gitpython=3.1.11=pypi_0
heartrate=0.2.1=pypi_0
humanize=3.1.0=pypi_0
identify=1.4.30=pypi_0
idna=2.10=pypi_0
imagesize=1.2.0=pypi_0
iniconfig=1.0.1=pypi_0
intel-openmp=2020.2=254
ipykernel=5.3.4=pypi_0
ipython=7.18.1=pypi_0
ipython-genutils=0.2.0=pypi_0
ipywidgets=7.5.1=pypi_0
isodate=0.6.0=pypi_0
isort=5.4.2=py38_0
itsdangerous=1.1.0=pypi_0
jedi=0.17.2=pypi_0
jinja2=2.11.2=pypi_0
joblib=0.16.0=pypi_0
jpeg=9b=h024ee3a_2
jsonschema=3.2.0=pypi_0
jupyter=1.0.0=pypi_0
jupyter-client=6.1.7=pypi_0
jupyter-console=6.2.0=pypi_0
jupyter-core=4.6.3=pypi_0
jupyter-notebook-gist=0.5.0=pypi_0
kaggle=1.5.9=pypi_0
kiwisolver=1.2.0=pypi_0
lazy-object-proxy=1.4.3=py38h7b6447c_0
lcms2=2.11=h396b838_0
ld_impl_linux-64=2.33.1=h53a641e_7
libedit=3.1.20191231=h14c3975_1
libffi=3.3=he6710b0_2
libgcc-ng=9.1.0=hdf63c60_0
libpng=1.6.37=hbc83047_0
librosa=0.8.0=pypi_0
libstdcxx-ng=9.1.0=hdf63c60_0
libtiff=4.1.0=h2733197_1
libuv=1.40.0=h7b6447c_0
littleutils=0.2.2=pypi_0
livereload=2.6.3=pypi_0
llvmlite=0.34.0=pypi_0
lunr=0.5.8=pypi_0
lz4-c=1.9.2=he6710b0_1
markdown=3.2.2=pypi_0
markupsafe=1.1.1=pypi_0
matplotlib=3.3.1=pypi_0
mccabe=0.6.1=py38_1
memory-profiler=0.58.0=pypi_0
mir-eval=0.6=pypi_0
mistune=0.8.4=pypi_0
mkautodoc=0.1.0=pypi_0
mkdocs=1.1.2=pypi_0
mkdocs-material=5.5.12=pypi_0
mkdocs-material-extensions=1.0=pypi_0
mkl=2020.2=256
mkl-service=2.3.0=py38he904b0f_0
mkl_fft=1.1.0=py38h23d657b_0
mkl_random=1.1.1=py38h0573a6f_0
mknotebooks=0.4.1=pypi_0
more-itertools=8.5.0=pypi_0
msrest=0.6.19=pypi_0
msrestazure=0.6.4=pypi_0
munch=2.5.0=pypi_0
murmurhash=1.0.2=pypi_0
mypy-extensions=0.4.3=pypi_0
nbconvert=5.6.1=pypi_0
nbdev=1.1.6=dev_0
nbformat=5.0.7=pypi_0
ncurses=6.2=he6710b0_1
ninja=1.10.0=py38hfd86e86_0
nlp=0.4.0=pypi_0
nltk=3.5=pypi_0
nodeenv=1.5.0=pypi_0
notebook=6.1.3=pypi_0
numba=0.51.2=pypi_0
numexpr=2.7.1=pypi_0
numpy=1.19.1=py38hbc911f0_0
numpy-base=1.19.1=py38hfa32c7d_0
oauthlib=3.1.0=pypi_0
ohmeow-blurr=0.0.18=pypi_0
olefile=0.46=py_0
openssl=1.1.1h=h7b6447c_0
outdated=0.2.0=pypi_0
packaging=20.4=pypi_0
pandas=1.1.1=pypi_0
pandocfilters=1.4.2=pypi_0
parso=0.7.1=pypi_0
pathspec=0.8.0=pypi_0
pexpect=4.8.0=pypi_0
pickleshare=0.7.5=pypi_0
pillow=7.2.0=py38hb39fc2d_0
pip=20.2.2=py38_0
plac=1.1.3=pypi_0
pluggy=0.13.1=pypi_0
pooch=1.1.1=pypi_0
pre-commit=2.7.1=pypi_0
preshed=3.0.2=pypi_0
prometheus-client=0.8.0=pypi_0
promise=2.3=pypi_0
prompt-toolkit=3.0.7=pypi_0
protobuf=3.14.0=pypi_0
psutil=5.7.3=pypi_0
ptyprocess=0.6.0=pypi_0
py=1.9.0=pypi_0
pyarrow=2.0.0=pypi_0
pycparser=2.20=pypi_0
pygments=2.6.1=pypi_0
pyjwt=1.7.1=pypi_0
pylint=2.6.0=py38_0
pymdown-extensions=8.0=pypi_0
pympler=0.9=py_0
pyparsing=2.4.7=pypi_0
pyproj=3.0.0.post1=pypi_0
pyrsistent=0.16.0=pypi_0
pytest=6.0.1=pypi_0
pytest-cov=2.10.1=pypi_0
python=3.8.5=hcff3b4d_0
python-dateutil=2.8.1=pypi_0
python-graphviz=0.14.1=pypi_0
python-slugify=4.0.1=pypi_0
pytorch=1.7.0=py3.8_cuda11.0.221_cudnn8.0.3_0
pytorchvis=0.0.4=pypi_0
pytz=2020.1=pypi_0
pyyaml=5.3.1=pypi_0
pyzmq=19.0.2=pypi_0
qtconsole=4.7.6=pypi_0
qtpy=1.9.0=pypi_0
readline=8.0=h7b6447c_0
recommonmark=0.6.0=pypi_0
regex=2020.7.14=pypi_0
requests=2.24.0=pypi_0
requests-oauthlib=1.3.0=pypi_0
resampy=0.2.2=pypi_0
rouge-score=0.0.4=pypi_0
sacremoses=0.0.43=pypi_0
scikit-learn=0.23.2=pypi_0
scipy=1.5.2=pypi_0
seaborn=0.11.0=pypi_0
send2trash=1.5.0=pypi_0
sentencepiece=0.1.91=pypi_0
sentry-sdk=0.19.5=pypi_0
seqeval=1.2.2=pypi_0
setuptools=49.6.0=py38_0
shapely=1.7.1=pypi_0
shortuuid=1.0.1=pypi_0
six=1.15.0=py_0
slugify=0.0.1=pypi_0
smmap=3.0.4=pypi_0
snoop=0.2.5=pypi_0
snowballstemmer=2.0.0=pypi_0
soundfile=0.10.3.post1=pypi_0
spacy=2.3.2=pypi_0
sphinx=3.2.1=pypi_0
sphinxcontrib-applehelp=1.0.2=pypi_0
sphinxcontrib-devhelp=1.0.2=pypi_0
sphinxcontrib-htmlhelp=1.0.3=pypi_0
sphinxcontrib-jsmath=1.0.1=pypi_0
sphinxcontrib-qthelp=1.0.3=pypi_0
sphinxcontrib-serializinghtml=1.1.4=pypi_0
sqlalchemy=1.3.20=pypi_0
sqlite=3.33.0=h62c20be_0
srsly=1.0.2=pypi_0
subprocess32=3.5.4=pypi_0
tables=3.6.1=pypi_0
tensor-sensor=0.1.1=pypi_0
terminado=0.8.3=pypi_0
testpath=0.4.4=pypi_0
text-unidecode=1.3=pypi_0
thinc=7.4.1=pypi_0
threadpoolctl=2.1.0=pypi_0
tk=8.6.10=hbc83047_0
tokenizers=0.9.3=pypi_0
toml=0.10.1=py_0
torchaudio=0.6.0=pypi_0
torchvision=0.8.1=py38_cu110
torchviz=0.0.1=pypi_0
tornado=6.0.4=pypi_0
tqdm=4.48.2=pypi_0
traitlets=5.0.0=pypi_0
transformers=3.5.1=pypi_0
typed-ast=1.4.1=pypi_0
typing_extensions=3.7.4.3=py_0
urllib3=1.25.10=pypi_0
virtualenv=20.0.31=pypi_0
wandb=0.10.12=pypi_0
wasabi=0.8.0=pypi_0
watchdog=1.0.1=pypi_0
wcwidth=0.2.5=pypi_0
webencodings=0.5.1=pypi_0
werkzeug=1.0.1=pypi_0
wheel=0.35.1=py_0
widgetsnbextension=3.5.1=pypi_0
wrapt=1.11.2=py38h7b6447c_0
xxhash=2.0.0=pypi_0
xz=5.2.5=h7b6447c_0
zlib=1.2.11=h7b6447c_3
zstd=1.4.5=h9ceee32_0

(screenshot attached)

neomatrix369 commented 3 years ago

Thanks @tyoc213 for reporting this. Are you running this on your local machine? Have you had a look at the notebooks mentioned here? Will these help in the meanwhile, while I take a look at this issue?

neomatrix369 commented 3 years ago

Can you try doing this on your CLI:

pip install "en-core-web-sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.3.0/en_core_web_sm-2.3.0.tar.gz"

and then

pip install -U nlp_profiler

My hunch is that you have spacy==2.3.2 while nlp_profiler's requirements.txt file is trying to install en_core_web_sm-2.3.0. I will try to find another way to fix this.
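
For reference, a pip-resolvable dependency on the model would normally be expressed as a PEP 508 direct reference, since en_core_web_sm is hosted on spaCy's GitHub releases rather than on PyPI. A sketch of what such a requirements.txt line looks like (the actual line in nlp_profiler may differ):

en-core-web-sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.3.0/en_core_web_sm-2.3.0.tar.gz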

neomatrix369 commented 3 years ago

The other option would be

python -m spacy download en_core_web_sm

as mentioned here
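
Either way, a quick way to confirm the model is importable afterwards (standard spaCy usage, not specific to nlp_profiler):

import spacy

nlp = spacy.load('en_core_web_sm')          # raises OSError if the model is not installed
print(len(nlp('A short test sentence.')))   # number of tokens, just to prove it loads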

neomatrix369 commented 3 years ago

@tyoc213 I found an option for conda/miniconda users:

conda config --set pip_interop_enabled True    # tested this on a different package and it worked
pip install -U nlp_profiler

Please let me know how this works for you.

tyoc213 commented 3 years ago

Interop doesn't work, same error:

$ pip install -U nlp_profiler
Collecting nlp_profiler
  Using cached nlp_profiler-0.0.2-py2.py3-none-any.whl (39 kB)
Collecting swifter>=1.0.3
  Using cached swifter-1.0.7.tar.gz (633 kB)
Collecting textblob>=0.15.3
  Using cached textblob-0.15.3-py2.py3-none-any.whl (636 kB)
Requirement already satisfied, skipping upgrade: nltk>=3.5 in /home/tyoc213/miniconda3/envs/fastai/lib/python3.8/site-packages (from nlp_profiler) (3.5)
Requirement already satisfied, skipping upgrade: requests>=2.23.0 in /home/tyoc213/miniconda3/envs/fastai/lib/python3.8/site-packages (from nlp_profiler) (2.24.0)
Requirement already satisfied, skipping upgrade: tqdm>=4.46.0 in /home/tyoc213/miniconda3/envs/fastai/lib/python3.8/site-packages (from nlp_profiler) (4.48.2)
Collecting language-tool-python>=2.3.1
  Using cached language_tool_python-2.4.7-py3-none-any.whl (30 kB)
Requirement already satisfied, skipping upgrade: joblib>=0.14.1 in /home/tyoc213/miniconda3/envs/fastai/lib/python3.8/site-packages (from nlp_profiler) (0.16.0)
ERROR: Could not find a version that satisfies the requirement en-core-web-sm (from nlp_profiler) (from versions: none)
ERROR: No matching distribution found for en-core-web-sm (from nlp_profiler)

But python -m spacy download en_core_web_sm finished, so I could install nlp_profiler afterwards:

Successfully built emoji swifter locket gpustat nvidia-ml-py3
Installing collected packages: language-tool-python, emoji, textblob, fsspec, toolz, locket, partd, dask, pyarrow, colorama, opencensus-context, pyasn1, pyasn1-modules, rsa, cachetools, google-auth, googleapis-common-protos, google-api-core, opencensus, nvidia-ml-py3, blessings, gpustat, async-timeout, multidict, yarl, aiohttp, aiohttp-cors, py-spy, hiredis, aioredis, colorful, redis, soupsieve, beautifulsoup4, google, msgpack, grpcio, ray, modin, swifter, nlp-profiler
  Attempting uninstall: pyarrow
    Found existing installation: pyarrow 2.0.0
    Uninstalling pyarrow-2.0.0:
      Successfully uninstalled pyarrow-2.0.0
ERROR: After October 2020 you may experience errors when installing or updating packages. This is because pip will change the way that it resolves dependency conflicts.

We recommend you use --use-feature=2020-resolver to test your packages with the new resolver before it becomes the default.

modin 0.8.2 requires pandas==1.1.4, but you'll have pandas 1.1.1 which is incompatible.
Successfully installed aiohttp-3.7.3 aiohttp-cors-0.7.0 aioredis-1.3.1 async-timeout-3.0.1 beautifulsoup4-4.9.3 blessings-1.7 cachetools-4.2.0 colorama-0.4.4 colorful-0.5.4 dask-2020.12.0 emoji-0.6.0 fsspec-0.8.4 google-3.0.0 google-api-core-1.23.0 google-auth-1.24.0 googleapis-common-protos-1.52.0 gpustat-0.6.0 grpcio-1.34.0 hiredis-1.1.0 language-tool-python-2.4.7 locket-0.2.0 modin-0.8.2 msgpack-1.0.1 multidict-5.1.0 nlp-profiler-0.0.2 nvidia-ml-py3-7.352.0 opencensus-0.7.11 opencensus-context-0.1.2 partd-1.1.0 py-spy-0.3.3 pyarrow-1.0.0 pyasn1-0.4.8 pyasn1-modules-0.2.8 ray-1.0.1.post1 redis-3.4.1 rsa-4.6 soupsieve-2.1 swifter-1.0.7 textblob-0.15.3 toolz-0.11.1 yarl-1.6.3

neomatrix369 commented 3 years ago

Thanks @tyoc213, I will update the README/docs so conda users know what to do.

neomatrix369 commented 3 years ago

You mean:

python -m spacy download en_core_web_sm

I can see now why installing is failing for you (or other conda users). Although this has been taken care of in general for conda users, I might have to cover this in the docs.
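
For the docs, the sequence that worked for the reporter in this thread (conda environment: install the spaCy model first, then the library) was roughly:

python -m spacy download en_core_web_sm
pip install -U nlp_profiler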