piskvorky / smart_open

Utils for streaming large files (S3, HDFS, gzip, bz2...)
MIT License

cannot import name 'open' from 'smart_open' #489

Closed littleyee closed 4 years ago

littleyee commented 4 years ago

I am receiving this error:

File "C:\ProgramData\Anaconda2\lib\site-packages\gensim\utils.py", line 45, in <module>
    from smart_open import open
ImportError: cannot import name open

I am using Python 2.7.16, gensim 3.8.2 and smart-open 1.10.1. Any ideas what is going on?

mpenkov commented 4 years ago

What is the output of:

import smart_open
dir(smart_open)

Here is the output I see on my system (different to yours) as an example:

$ python
Python 3.6.5 (default, Apr  1 2018, 05:46:30) 
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import smart_open
>>> dir(smart_open)
['__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', '__version__', 'bytebuffer', 'compression', 'concurrency', 'constants', 'doctools', 'gcs', 'hdfs', 'http', 'local_file', 'logger', 'logging', 'open', 'parse_uri', 'register_compressor', 's3', 's3_iter_bucket', 'smart_open', 'smart_open_lib', 'ssh', 'transport', 'utils', 'version', 'webhdfs']
>>> 

littleyee commented 4 years ago

The output is

Python 2.7.16 (v2.7.16:413a49145e, Mar  4 2019, 01:37:19) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import smart_open
>>> dir(smart_open)
['__all__', '__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__', '__version__', 'bytebuffer', 'doctools', 'hdfs', 'http', 'logger', 'logging', 'open', 'register_compressor', 's3', 's3_iter_bucket', 'smart_open', 'smart_open_lib', 'ssh', 'version', 'webhdfs']
>>>

mpenkov commented 4 years ago

Interesting. The open function is in the list of functions offered by the smart_open package. I'm not sure why it's not importing.

I originally suspected this is some kind of Python 2 issue, but it works in a Py2 environment here without any problems:

$ python
Python 2.7.17 (default, Apr 15 2020, 17:20:14) 
[GCC 7.5.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from smart_open import open
>>> import smart_open
>>> smart_open.__version__
'1.10.1'
>>>

Abhishek-Prajapat commented 4 years ago

I do not see the open function in the output of dir(smart_open):

import smart_open
dir(smart_open)

['BZ2File', 'BytesIO', 'DEFAULT_ERRORS', 'IS_PY2', 'P', 'PATHLIB_SUPPORT', 'SSLError', 'SYSTEM_ENCODING', 'Uri', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'boto', 'codecs', 'collections', 'gzip', 'hdfs', 'http', 'importlib', 'io', 'logger', 'logging', 'os', 'pathlib', 'pathlib_module', 'requests', 's3', 's3_iter_bucket', 'six', 'smart_open', 'smart_open_hdfs', 'smart_open_http', 'smart_open_lib', 'smart_open_s3', 'smart_open_webhdfs', 'sys', 'urlparse', 'urlsplit', 'warnings', 'webhdfs']

mpenkov commented 4 years ago

@Abhishek-Prajapat What smart_open version are you using?

anantguptadbl commented 4 years ago

@Abhishek-Prajapat, I also do not see any open in smart_open. I am using the following versions: smart_open-1.8.0, gensim-3.8.3.

The issue is that in versions < 1.8.2, the function now defined as def open was defined as def smart_open. Installing the latest version of smart_open will solve the problem. If you are constrained from upgrading and want to solve this specific problem quickly, you can replicate the affected functions on your side.
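
If upgrading is not an option, one way to tolerate both names is a small lookup helper. This is a minimal sketch for Python 3, assuming (as the dir() listings in this thread suggest) that older releases expose only smart_open.smart_open while newer ones also expose smart_open.open. The name resolve_open is hypothetical and is not part of smart_open or gensim.

```python
import importlib


def resolve_open(module_name="smart_open"):
    """Return the module's `open` callable, or its older `smart_open` one.

    `resolve_open` is a hypothetical helper for illustration only.
    """
    module = importlib.import_module(module_name)
    # Prefer the newer name; fall back to the older one if it is missing.
    for attr in ("open", "smart_open"):
        func = getattr(module, attr, None)
        if callable(func):
            return func
    raise ImportError(
        "neither 'open' nor 'smart_open' found in %r" % module_name
    )


# Usage (requires smart_open to be installed):
#     so_open = resolve_open()
#     with so_open("glove.6B.50d.txt", "rb") as fin:
#         ...
```

The helper looks names up at call time, so it raises a clear ImportError only when neither name exists instead of failing at import.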

I solved it by writing my own version of glove2word2vec which is almost copy-paste

glove_input_file = 'glove.6B.50d.txt'
word2vec_output_file = 'glove.6B.50d.txt.word2vec'

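def gloveToword2vec(glove_input_file, word2vec_output_file):
    # Count the vectors: the GloVe text format stores one vector per line.
    with open(glove_input_file, 'rb') as f:
        num_lines = sum(1 for _ in f)
    # The first line holds a token followed by its vector components.
    with open(glove_input_file, 'rb') as f:
        num_dims = len(f.readline().split()) - 1
    # Write the word2vec header ("<count> <dims>"), then copy the vectors unchanged.
    with open(word2vec_output_file, 'wb') as fout:
        fout.write("{0} {1}\n".format(num_lines, num_dims).encode('utf-8'))
        with open(glove_input_file, 'rb') as fin:
            for line in fin:
                fout.write(line)
    print("Completed")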

gloveToword2vec(glove_input_file, word2vec_output_file)

Auburngrads commented 4 years ago

This issue doesn't appear to have been fixed in the latest versions of smart_open.

In both 2.0.0 and 2.1.0, smart_open/smart_open_lib.py states that smart_open.smart_open() is deprecated in favor of smart_open.open(), but that is not what I observe.

Also, the list of names I see exported from smart_open does not include open:

dir(smart_open)

['BZ2File', 'BytesIO', 'DEFAULT_ERRORS', 'IS_PY2', 'P', 'PATHLIB_SUPPORT', 'SSLError', 'SYSTEM_ENCODING', 'Uri', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'boto', 'codecs', 'collections', 'gzip', 'hdfs', 'http', 'importlib', 'io', 'logger', 'logging', 'os', 'pathlib', 'pathlib_module', 'requests', 's3', 's3_iter_bucket', 'six', 'smart_open', 'smart_open_hdfs', 'smart_open_http', 'smart_open_lib', 'smart_open_s3', 'smart_open_webhdfs', 'sys', 'urlparse', 'urlsplit', 'warnings', 'webhdfs']

My immediate issues with gensim appear to have been fixed by updating gensim/utils.py to read

from smart_open import smart_open

piskvorky commented 4 years ago

@Auburngrads are you sure you're using the latest smart_open?

My guess would be you have two (or more) instances of smart_open installed, perhaps in different virtual environments. And one of them is old (smart_open.smart_open) which is what the gensim environment is picking up.

Otherwise it's really weird. How did you install smart_open? Can you please uninstall it, re-install the latest version again and post the full installation log here (your installation command + all output until the installation completes)?
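
The "two or more installations" guess can be checked from inside the interpreter that shows the error. The following is a minimal Python 3 sketch, assuming the copies would show up as a directory or .py file on sys.path. The helper name find_package_copies is hypothetical.

```python
import os
import sys


def find_package_copies(package_name, search_paths=None):
    """List every location on the search path that holds `package_name`.

    `find_package_copies` is a hypothetical helper for illustration only.
    """
    if search_paths is None:
        search_paths = sys.path
    found = []
    for entry in search_paths:
        base = entry or os.getcwd()  # an empty entry means the current directory
        candidate = os.path.join(base, package_name)
        # A package is a directory; a plain module is a single .py file.
        for path in (candidate, candidate + ".py"):
            if os.path.exists(path):
                real = os.path.realpath(path)
                if real not in found:
                    found.append(real)
    return found


if __name__ == "__main__":
    for location in find_package_copies("smart_open"):
        print(location)
```

It only inspects the sys.path of the interpreter that runs it, so it needs to be run separately in each environment (for example, once from the Anaconda Prompt and once from any embedded REPL). The first location listed is the one a normal import would pick up.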

Auburngrads commented 4 years ago

@piskvorky it's possible that I have multiple versions installed, but it seems unlikely.

I'm not using any virtual environments that I'm aware of, and smart_open is installed only in my one custom environment (it is not found in base). I installed it using pip in the Anaconda Prompt.

I removed it, reinstalled with pip install smart_open, and verified that version 2.1.0 is installed, but I still have the same issue.

Installation log file is attached. so_log.txt

piskvorky commented 4 years ago

Thanks. When you open the file c:\users\aubur\appdata\local\r-mini~1\envs\r-reticulate\lib\site-packages\smart_open\smart_open_lib.py, do you see the line def open( there, on line 109?

My MD5 hash of that file in release 2.1.0 is MD5 (smart_open_lib.py) = 5fcfed617811f6b63cd8ff44df118e83.

Alternatively, to check whether the same installed file is being imported, what does import smart_open; print(smart_open.__file__) say?
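
Both checks (which file is imported, and what its hash is) can be run in one step. This is a minimal Python 3 sketch; the helper name module_file_md5 is hypothetical, and it assumes the module was loaded from a regular file on disk.

```python
import hashlib
import importlib


def module_file_md5(module_name):
    """Return (path, MD5 hex digest) of the file a module is loaded from.

    `module_file_md5` is a hypothetical helper for illustration only.
    """
    module = importlib.import_module(module_name)
    path = module.__file__
    # Hash the raw bytes so the result is comparable with an `md5` CLI tool.
    with open(path, "rb") as handle:
        digest = hashlib.md5(handle.read()).hexdigest()
    return path, digest


# Usage (requires smart_open to be installed):
#     path, digest = module_file_md5("smart_open.smart_open_lib")
#     print(path, digest)
```

A matching digest indicates the same file contents as the reference. A mismatch is weaker evidence, since differences such as line endings could change the hash without changing the code.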

Auburngrads commented 4 years ago

I do, yes.

I haven’t found anything in this file that I can attribute to why open would not be available.

piskvorky commented 4 years ago

What does import smart_open; print(smart_open.__file__) say?

Auburngrads commented 4 years ago

It returns C:\Users\Aubur\AppData\Local\r-miniconda\envs\r-reticulate\lib\site-packages\smart_open\__init__.py

piskvorky commented 4 years ago

What's up with the r-mini~1 vs r-miniconda? Do both paths exist, or why are the log paths truncated like that?

Auburngrads commented 4 years ago

The paths are the same, although the Anaconda Prompt truncates the printed path. This did highlight what may be my issue, though.

I interact with Python via RStudio through an R package called reticulate. This is essentially a Python REPL that uses the r-reticulate environment by default (the non-truncated path is the one returned by the REPL). Up until now I had never seen a difference between the behavior of the reticulate REPL and the behavior when I interact with Python directly.

I ran from smart_open import open in Python from the Anaconda Prompt without error. When I attempted to verify this in Spyder, I needed to fix an issue with qtpy. After uninstalling and re-installing qtpy (and its dependencies), everything works, both in Python and in the REPL.

I can only assume that a library required to establish the REPL became corrupted and caused the error.

piskvorky commented 4 years ago

Great, thanks for following up. I'm closing this ticket – the issue is with conflicting Python versions as expected, not smart_open as such.

For others stumbling on this ticket in the future: please check your environment Python paths as per above. Make sure the problem is not in your local setup.
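
As a starting point for that environment check, the snippet below reports which interpreter is running and where a given top-level module would be imported from. It is a minimal Python 3 sketch; the helper name describe_environment is hypothetical.

```python
import importlib.util
import sys


def describe_environment(module_name):
    """Summarize the running interpreter and where `module_name` resolves.

    `describe_environment` is a hypothetical helper for illustration only.
    It expects a top-level module name such as "smart_open".
    """
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "python_version": sys.version.split()[0],
        # None means the module cannot be found from this interpreter.
        "module_origin": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment("smart_open").items():
        print("%s: %s" % (key, value))
```

Running it from each entry point in use (terminal, IDE, embedded REPL) and comparing the output shows whether they share one environment.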