deanmalmgren / textract

extract text from any document. no muss. no fuss.
http://textract.readthedocs.io
MIT License

Update to six 1.16.0 #414

Open andyhasit opened 2 years ago

andyhasit commented 2 years ago

Several Google API libraries depend on https://github.com/googleapis/python-api-core, which requires six >=1.13.0, so textract cannot be installed alongside them.

andyhasit commented 2 years ago

I see this will be resolved by https://github.com/deanmalmgren/textract/pull/415 so will close once that's in, assuming that goes ahead :-)

KamarajuKusumanchi commented 1 year ago

@andyhasit , I am getting the following error trying to install textract from your repo. It used to work before. Could you please take a look and fix accordingly?

Pip subprocess output:
Collecting git+https://github.com/andyhasit/textract (from -r C:\Users\raju\work\github\rutils\python3\envs\condaenv.r5ag5jl_.requirements.txt (line 2))
  Cloning https://github.com/andyhasit/textract to c:\users\raju\appdata\local\temp\pip-req-build-f11_hgjf
  Resolved https://github.com/andyhasit/textract to commit 102a58418283fbc833ae1d6dad84e741e09eff66
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'error'

Pip subprocess error:
  Running command git clone --filter=blob:none --quiet https://github.com/andyhasit/textract 'C:\Users\raju\AppData\Local\Temp\pip-req-build-f11_hgjf'
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [3 lines of output]
      error in textract setup command: 'install_requires' must be a string or list of strings containing valid project/version requirement specifiers; .* suffix can only be used with `==` or `!=` operators
          extract-msg<=0.29.*
                     ~~~~~~~^
      [end of output]

KyleKing commented 1 year ago

@KamarajuKusumanchi, the problem is the wildcard in the version specifier: a ".*" suffix can only be used with the == or != operators, not with <=. This is a problem in the main branch that needs to be resolved separately:

https://github.com/deanmalmgren/textract/blob/102a58418283fbc833ae1d6dad84e741e09eff66/requirements/python#L8

FWIW, I have a fork that I've published to PyPI to side-step the dependency problems (https://github.com/KyleKing/textract-py3). From #470, it sounds like there might be a resurgence of interest, though.
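
To illustrate the specifier rule quoted from the pip error, here is a minimal sketch that flags a `.*` suffix used with any operator other than `==` or `!=`. The function name `wildcard_problem` and the regular expression are hypothetical and written only for this example; this is not the check that setuptools actually runs, and it handles only simple single-clause requirement lines.

```python
import re

# Simplified illustration only; NOT the real setuptools/packaging parser.
# Matches lines of the form "<name><operator><version>".
_SPEC = re.compile(
    r"^\s*(?P<name>[A-Za-z0-9._-]+?)\s*"
    r"(?P<op>===|==|!=|~=|<=|>=|<|>)\s*"
    r"(?P<version>\S+)\s*$"
)


def wildcard_problem(requirement):
    """Return a message if '.*' is used with an operator other than == or !=.

    Returns None when the line looks fine or is outside this sketch's scope.
    """
    line = requirement.split("#", 1)[0]  # drop a trailing comment, if any
    match = _SPEC.match(line)
    if match is None:
        return None  # not a simple single-clause requirement
    if match["version"].endswith(".*") and match["op"] not in ("==", "!="):
        return f"{match['name']}: '.*' is not allowed with '{match['op']}'"
    return None


if __name__ == "__main__":
    for req in (
        "extract-msg<=0.29.* #Last with python2 support",
        "extract-msg<=0.29.6 #Last with python2 support",
        "six==1.12.*",
    ):
        print(req, "->", wildcard_problem(req))
```

Only the first line is flagged; pinning an explicit upper bound, or using `==` with the wildcard, avoids the error.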

KamarajuKusumanchi commented 1 year ago

Thanks @KyleKing. I was able to get it working by changing

extract-msg<=0.29.* #Last with python2 support

to

extract-msg<=0.29.6 #Last with python2 support

and six~=1.12.0 to six~=1.16.0.

My changes are in https://github.com/deanmalmgren/textract/compare/master...KamarajuKusumanchi:textract:master .
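
For anyone wondering why the six pin matters: `~=1.12.0` is a "compatible release" specifier, roughly equivalent to `>=1.12.0, ==1.12.*`, so it excludes the 1.13.0 that python-api-core needs. A rough sketch of that rule follows, assuming plain dotted numeric versions only. The helper names are made up for this example, and real tools use the `packaging` library instead.

```python
def _parse(version):
    """Turn '1.12.0' into (1, 12, 0). Plain numeric versions only."""
    return tuple(int(part) for part in version.split("."))


def satisfies_compatible_release(candidate, base):
    """Approximate check of `candidate` against the specifier `~=base`.

    Hypothetical helper for illustration; ignores pre-releases, epochs, etc.
    """
    base_parts = _parse(base)
    if len(base_parts) < 2:
        raise ValueError("~= needs at least two version components")
    cand_parts = _parse(candidate)
    # Pad with zeros so that '1.12' compares equal to '1.12.0'.
    width = max(len(base_parts), len(cand_parts))
    base_padded = base_parts + (0,) * (width - len(base_parts))
    cand_padded = cand_parts + (0,) * (width - len(cand_parts))
    prefix = base_parts[:-1]  # everything except the last component
    return cand_padded >= base_padded and cand_padded[: len(prefix)] == prefix


if __name__ == "__main__":
    print(satisfies_compatible_release("1.13.0", "1.12.0"))
    print(satisfies_compatible_release("1.16.0", "1.16.0"))
```

Under the old pin, six 1.13.0 is rejected, while the new pin accepts 1.16.0, which also satisfies six >=1.13.0.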

Thanks.