neomatrix369 / nlp_profiler

A simple NLP library that allows profiling datasets with one or more text columns. When given a dataset and the name of a column containing text data, NLP Profiler will return either high-level insights or low-level/granular statistical information about the text in that column.

Package is not installing on Python 3.6 #1

Closed strivedi02 closed 4 years ago

strivedi02 commented 4 years ago
Collecting git+https://github.com/neomatrix369/nlp_profiler.git@master
  Cloning https://github.com/neomatrix369/nlp_profiler.git (to revision master) to /tmp/pip-req-build-mggu48uy
Once successfully installed, please restart your Jupyter kernels for the changes to take effect
  Running command git clone -q https://github.com/neomatrix369/nlp_profiler.git /tmp/pip-req-build-mggu48uy
ERROR: Package 'nlp-profiler' requires a different Python: 3.6.9 not in '>=3.7.0'

The package requires Python 3.7 or later. My local Python version is 3.6.8 and the Colab Python version is 3.6.9, which is why I am getting this error.
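The comparison behind this error can be approximated in a few lines. This is a simplified sketch that only handles a plain `>=` on dotted numeric versions; it is not pip's actual implementation, which evaluates full version specifiers. The helper names `parse_version` and `meets_minimum` are hypothetical.

```python
import sys


def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as '3.6.9' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(current: str, minimum: str) -> bool:
    """Return True when `current` is at least `minimum` (a '>=' check only)."""
    return parse_version(current) >= parse_version(minimum)


# The two interpreters from this report, plus 3.7.0, checked against '>=3.7.0'.
for version in ("3.6.8", "3.6.9", "3.7.0"):
    print(version, meets_minimum(version, "3.7.0"))

# The interpreter running this snippet.
running = ".".join(str(number) for number in sys.version_info[:3])
print(running, meets_minimum(running, "3.7.0"))
```

Both 3.6.8 and 3.6.9 fail the check, which matches the error on Colab and on the local machine.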

neomatrix369 commented 4 years ago

I put that guard in for exactly that reason: the package has not been tested on Python versions below 3.7.

Any chance you could test the notebook examples on those versions of Python and share screenshots of them working here on the issue?

strivedi02 commented 4 years ago

[Screenshot from 2020-09-12 19-15-37]

Actually, I got that error both in Google Colab and on my local machine, which are running Python 3.6.9 and 3.6.8 respectively. And yes, I tried installing the package after changing the Python version in setup.py, and it installed. But when I used the package in my work, it kept executing the line below for an hour without producing any output:

text_nlp = pd.DataFrame(df, columns=['text'])

My data contained approximately 10k rows. I am not sure whether this is the usual behaviour; in future, a progress bar could help in understanding such behaviour.

neomatrix369 commented 4 years ago

Actually, I got that error both in Google Colab and on my local machine, which are running Python 3.6.9 and 3.6.8 respectively.

I meant testing it by cloning the repo, setting the guard to 3.6, and then running the notebooks under the examples section. I hope you understand why I set the limit and why I'm requesting the above: just as it will help you to have it run on your version of Python, it will help me to know that it works on that version.

@strivedi02 about 10k records and progress bar - this is good feedback but as mentioned on the README:

Note: this is a new endeavour and it's probably NOT capable of doing many things yet, including running at scale. Many of these gaps are opportunities we can work on and plug, as we go along using it.

As you can see from the examples and screenshots on the repo, it is gathering a lot of information about the text data so going through large sets of data means a lot more waiting.

If you can, please open a separate issue for each such thing you find; that would be great.
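Until a progress bar exists in the library, the progress feedback requested above can be approximated by hand. This is a minimal sketch, assuming the profiling work can be expressed as a function applied to one text at a time. `profile_with_progress` and `count_words` are hypothetical names, and `count_words` is only a placeholder for the real, much heavier profiling.

```python
import time
from typing import Callable, List


def profile_with_progress(
    texts: List[str],
    profile_fn: Callable[[str], dict],
    report_every: int = 1000,
) -> List[dict]:
    """Apply `profile_fn` to every text, printing progress at regular intervals."""
    results = []
    total = len(texts)
    started = time.perf_counter()
    for index, text in enumerate(texts, start=1):
        results.append(profile_fn(text))
        # Report every `report_every` rows and once more at the very end.
        if index % report_every == 0 or index == total:
            elapsed = time.perf_counter() - started
            print(f"{index}/{total} rows done in {elapsed:.1f}s")
    return results


def count_words(text: str) -> dict:
    """Placeholder profiling step (hypothetical): counts whitespace-separated words."""
    return {"words": len(text.split())}


rows = ["a short example sentence"] * 2500
profiles = profile_with_progress(rows, count_words)
```

With a report like this, a run over roughly 10k rows shows whether it is still advancing or has genuinely stalled.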

neomatrix369 commented 4 years ago

I have opened two issues, #2 and #3, covering the new requests. Feel free to work on them if you like, in case you beat me to it. ;)

neomatrix369 commented 4 years ago

Thanks for your feedback, I have changed the title of this issue wrt Python 3.6 support. Glad you could get it to work on your version of Python.

As for anything else, please open new issues; screenshots and examples will help.

I will try to see what I can do to resolve these issues.

neomatrix369 commented 4 years ago

@strivedi02 did you manage to install and successfully run it on Python 3.6 - any screenshots to share?

strivedi02 commented 4 years ago

Yes, it is running, but the spelling-related lines were taking a long time, so I am not running the whole code, only a few lines at first, and so far it's working. To be completely sure, I will have to check the whole thing.

neomatrix369 commented 4 years ago

Yes, it is running, but the spelling-related lines were taking a long time, so I am not running the whole code, only a few lines at first, and so far it's working. To be completely sure, I will have to check the whole thing.

The performance bottleneck is not related to Python version compatibility, but I'll wait until you have done a full run. You can also run the notebook in the repo (see under notebooks), and that should give you the picture.

I should have tests in place, which would resolve these checking issues.
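To see which step dominates the run time, as the spelling-related lines appeared to in the report above, each step can be timed separately on a small sample. This is a sketch under stated assumptions: `time_steps`, `word_count`, and `slow_check` are hypothetical names, and `slow_check` merely simulates an expensive step with a short sleep rather than calling any real spell checker.

```python
import time
from typing import Callable, Dict, List


def time_steps(
    texts: List[str],
    steps: Dict[str, Callable[[str], object]],
) -> Dict[str, float]:
    """Run each named step over all texts and return the seconds each one took."""
    timings = {}
    for name, step in steps.items():
        started = time.perf_counter()
        for text in texts:
            step(text)
        timings[name] = time.perf_counter() - started
    return timings


def word_count(text: str) -> int:
    """A cheap step: count whitespace-separated words."""
    return len(text.split())


def slow_check(text: str) -> str:
    """Stand-in for an expensive step (hypothetical); the sleep simulates the cost."""
    time.sleep(0.001)
    return text.lower()


sample = ["Some example text"] * 20
timings = time_steps(sample, {"word_count": word_count, "slow_check": slow_check})

# Print the slowest step first.
for name, seconds in sorted(timings.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {seconds:.4f}s")
```

A breakdown like this separates a slow profiling step from a Python version problem, since the same slow step would show up on any interpreter.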

neomatrix369 commented 4 years ago

Manually tested on Python 3.6 in a Docker container via a notebook, and also on Google Colab using a notebook:

Python 3.6.9
Collecting git+https://github.com/neomatrix369/nlp_profiler.git@master
  Cloning https://github.com/neomatrix369/nlp_profiler.git (to revision master) to /tmp/pip-req-build-46c7lk8_
Requirement already satisfied (use --upgrade to upgrade): nlp-profiler==0.0.1 from git+https://github.com/neomatrix369/nlp_profiler.git@master in /usr/local/lib/python3.6/dist-packages/nlp_profiler-0.0.1-py3.6.egg
Requirement already satisfied: textblob>=0.15.3 in /usr/local/lib/python3.6/dist-packages (from nlp-profiler==0.0.1) (0.15.3)
Requirement already satisfied: nltk>=3.5 in /usr/local/lib/python3.6/dist-packages (from nlp-profiler==0.0.1) (3.5)
Requirement already satisfied: language_tool_python>=2.3.1 in /usr/local/lib/python3.6/dist-packages/language_tool_python-2.4.2-py3.6.egg (from nlp-profiler==0.0.1) (2.4.2)
Requirement already satisfied: requests>=2.23.0 in /usr/local/lib/python3.6/dist-packages (from nlp-profiler==0.0.1) (2.23.0)
Requirement already satisfied: emoji>=0.5.4 in /usr/local/lib/python3.6/dist-packages/emoji-0.6.0-py3.6.egg (from nlp-profiler==0.0.1) (0.6.0)
Requirement already satisfied: tqdm in /usr/local/lib/python3.6/dist-packages (from nltk>=3.5->nlp-profiler==0.0.1) (4.41.1)
Requirement already satisfied: click in /usr/local/lib/python3.6/dist-packages (from nltk>=3.5->nlp-profiler==0.0.1) (7.1.2)
Requirement already satisfied: joblib in /usr/local/lib/python3.6/dist-packages (from nltk>=3.5->nlp-profiler==0.0.1) (0.16.0)
Requirement already satisfied: regex in /usr/local/lib/python3.6/dist-packages (from nltk>=3.5->nlp-profiler==0.0.1) (2019.12.20)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.6/dist-packages (from requests>=2.23.0->nlp-profiler==0.0.1) (2020.6.20)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.6/dist-packages (from requests>=2.23.0->nlp-profiler==0.0.1) (1.24.3)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.6/dist-packages (from requests>=2.23.0->nlp-profiler==0.0.1) (2.10)
Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.6/dist-packages (from requests>=2.23.0->nlp-profiler==0.0.1) (3.0.4)
Building wheels for collected packages: nlp-profiler
  Building wheel for nlp-profiler (setup.py): started
  Building wheel for nlp-profiler (setup.py): finished with status 'done'
  Created wheel for nlp-profiler: filename=nlp_profiler-0.0.1-cp36-none-any.whl size=7235 sha256=dfd56ff18f729140ee4c10a46bea96597dd31e362cbcb9c1ad0a66a9e0ce7716
  Stored in directory: /tmp/pip-ephem-wheel-cache-aqb6ase8/wheels/d8/ae/ef/08bcfe7be09b90889b6f53bcef0a2f40eb63d2e954dcb20966
Successfully built nlp-profiler
Once successfully installed, please RESTART your Jupyter kernels or Colab runtimes for the changes to take effect
  Running command git clone -q https://github.com/neomatrix369/nlp_profiler.git /tmp/pip-req-build-46c7lk8_

/cc @strivedi02 please check on your end and confirm whether this works. You don't need to run it against large datasets to confirm; any small dataset should be good enough, or a small subset of a large dataset.
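Taking a small, reproducible subset of a large dataset for such a check could look like the sketch below. It uses plain lists and the standard library so that it runs anywhere. `take_subset` is a hypothetical helper, and the generated rows are made-up placeholder data.

```python
import random
from typing import List


def take_subset(rows: List[str], size: int, seed: int = 42) -> List[str]:
    """Return up to `size` rows picked at random, reproducibly for a given seed."""
    if size >= len(rows):
        return list(rows)
    return random.Random(seed).sample(rows, size)


# Placeholder data standing in for a large text column.
large_dataset = [f"review number {i}" for i in range(10_000)]
subset = take_subset(large_dataset, 50)
print(len(subset))
```

When the data is already in a pandas DataFrame, `DataFrame.sample(n=..., random_state=...)` plays the same role.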

strivedi02 commented 4 years ago

@neomatrix369 OK, I'll test it and let you know by tomorrow. Sorry, I have been a little busy.