codelucas / newspaper

newspaper3k is a news, full-text, and article metadata extraction library in Python 3. Advanced docs:
https://goo.gl/VX41yK
MIT License

Cannot import name 'Article' #535

Open RingWong opened 6 years ago

RingWong commented 6 years ago

Environment: Ubuntu 16.04, Anaconda 4.4.0, Python 3.6.1, newspaper3k 0.2.6

Description: Hi, when I run the code "from newspaper import Article", I get this error:

ImportError: cannot import name 'Article'

How can I fix it? Thanks!

codelucas commented 6 years ago

I think the problem may be that the Python environment you are running is not the one where you installed newspaper.

Please post details of how you installed newspaper and also how you initiated your python environment (did you use a virtualenv?)

Please also paste the entire installation output

priyaananthasankar commented 6 years ago

Yes, this happens for me too:

$ virtualenv venv
$ source venv/bin/activate
$ pip install newspaper3k

Then, at the Python prompt, this fails:

from newspaper import Article
ImportError: cannot import name 'Article'

benbenbuhben commented 6 years ago

I'm having this same issue. Any update?

codelucas commented 6 years ago

Thank you for filing and reporting these issues @RingWong @priyaananthasankar @benbenbuhben

I am having trouble reproducing this; however, we've seen import errors in the past when users mistakenly install newspaper3k (Python 3) with pip2, or vice versa.

Can you confirm that pip points at pip3 in your venv? See this issue: https://github.com/codelucas/newspaper/issues/182
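
One way to check this from inside Python is sketched below. It reports which interpreter is running and where a module would be loaded from, without importing it. The helper name `describe_environment` is hypothetical and not part of newspaper3k; `sys.executable` and `importlib.util.find_spec` are standard library.

```python
import importlib.util
import sys


def describe_environment(module_name="newspaper"):
    """Report the running interpreter and where module_name would load from.

    Hypothetical helper for illustration only; not part of newspaper3k.
    """
    # find_spec locates a top-level module without executing it.
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "python_version": "%d.%d.%d" % sys.version_info[:3],
        "module_found": spec is not None,
        "module_origin": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `executable` is not inside the virtualenv, or `module_origin` is not under that environment's site-packages, the install and the interpreter probably do not match. Running `python -m pip install newspaper3k` instead of a bare `pip` ties the install to the interpreter being invoked.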

My repro attempt did not give the same results:

$ virtualenv test-newspaper-env 
$ cd test-newspaper-env    
$ source bin/activate  

$ which pip3
  /Users/lucasou-yang/workspace/test-newspaper-env/bin/pip3

$ pip3 install newspaper3k

$ python  # make sure it's python3
>>> from newspaper import Article

sitting-duck commented 4 years ago

Hi @codelucas, I can reproduce this issue.

Ubuntu version info:

Distributor ID: Ubuntu
Description: Ubuntu 18.04.3 LTS
Release: 18.04
Codename: bionic

I noticed you asked earlier for users to paste their installation output but got none. Here is mine:

$ pip3 install newspaper3k
Collecting newspaper3k
  Using cached https://files.pythonhosted.org/packages/d7/b9/51afecb35bb61b188a4b44868001de348a0e8134b4dfa00ffc191567c4b9/newspaper3k-0.2.8-py3-none-any.whl
Collecting jieba3k>=0.35.1 (from newspaper3k)
Collecting Pillow>=3.3.0 (from newspaper3k)
  Using cached https://files.pythonhosted.org/packages/19/5e/23dcc0ce3cc2abe92efd3cd61d764bee6ccdf1b667a1fb566f45dc249953/Pillow-7.0.0-cp36-cp36m-manylinux1_x86_64.whl
Collecting cssselect>=0.9.2 (from newspaper3k)
  Using cached https://files.pythonhosted.org/packages/3b/d4/3b5c17f00cce85b9a1e6f91096e1cc8e8ede2e1be8e96b87ce1ed09e92c5/cssselect-1.1.0-py2.py3-none-any.whl
Collecting nltk>=3.2.1 (from newspaper3k)
Collecting requests>=2.10.0 (from newspaper3k)
  Using cached https://files.pythonhosted.org/packages/1a/70/1935c770cb3be6e3a8b78ced23d7e0f3b187f5cbfab4749523ed65d7c9b1/requests-2.23.0-py2.py3-none-any.whl
Collecting feedfinder2>=0.0.4 (from newspaper3k)
Collecting PyYAML>=3.11 (from newspaper3k)
Collecting beautifulsoup4>=4.4.1 (from newspaper3k)
  Using cached https://files.pythonhosted.org/packages/cb/a1/c698cf319e9cfed6b17376281bd0efc6bfc8465698f54170ef60a485ab5d/beautifulsoup4-4.8.2-py3-none-any.whl
Collecting tinysegmenter==0.3 (from newspaper3k)
Collecting python-dateutil>=2.5.3 (from newspaper3k)
  Using cached https://files.pythonhosted.org/packages/d4/70/d60450c3dd48ef87586924207ae8907090de0b306af2bce5d134d78615cb/python_dateutil-2.8.1-py2.py3-none-any.whl
Collecting lxml>=3.6.0 (from newspaper3k)
  Using cached https://files.pythonhosted.org/packages/dd/ba/a0e6866057fc0bbd17192925c1d63a3b85cf522965de9bc02364d08e5b84/lxml-4.5.0-cp36-cp36m-manylinux1_x86_64.whl
Collecting tldextract>=2.0.1 (from newspaper3k)
  Using cached https://files.pythonhosted.org/packages/fd/0e/9ab599d6e78f0340bb1d1e28ddeacb38c8bb7f91a1b0eae9a24e9603782f/tldextract-2.2.2-py2.py3-none-any.whl
Collecting feedparser>=5.2.1 (from newspaper3k)
Collecting six (from nltk>=3.2.1->newspaper3k)
  Using cached https://files.pythonhosted.org/packages/65/eb/1f97cb97bfc2390a276969c6fae16075da282f5058082d4cb10c6c5c1dba/six-1.14.0-py2.py3-none-any.whl
Collecting certifi>=2017.4.17 (from requests>=2.10.0->newspaper3k)
  Using cached https://files.pythonhosted.org/packages/b9/63/df50cac98ea0d5b006c55a399c3bf1db9da7b5a24de7890bc9cfd5dd9e99/certifi-2019.11.28-py2.py3-none-any.whl
Collecting idna<3,>=2.5 (from requests>=2.10.0->newspaper3k)
  Using cached https://files.pythonhosted.org/packages/89/e3/afebe61c546d18fb1709a61bee788254b40e736cff7271c7de5de2dc4128/idna-2.9-py2.py3-none-any.whl
Collecting chardet<4,>=3.0.2 (from requests>=2.10.0->newspaper3k)
  Using cached https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl
Collecting urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 (from requests>=2.10.0->newspaper3k)
  Using cached https://files.pythonhosted.org/packages/e8/74/6e4f91745020f967d09332bb2b8b9b10090957334692eb88ea4afe91b77f/urllib3-1.25.8-py2.py3-none-any.whl
Collecting soupsieve>=1.2 (from beautifulsoup4>=4.4.1->newspaper3k)
  Using cached https://files.pythonhosted.org/packages/05/cf/ea245e52f55823f19992447b008bcbb7f78efc5960d77f6c34b5b45b36dd/soupsieve-2.0-py2.py3-none-any.whl
Collecting requests-file>=1.4 (from tldextract>=2.0.1->newspaper3k)
  Using cached https://files.pythonhosted.org/packages/23/9c/6e63c23c39e53d3df41c77a3d05a49a42c4e1383a6d2a5e3233161b89dbf/requests_file-1.4.3-py2.py3-none-any.whl
Collecting setuptools (from tldextract>=2.0.1->newspaper3k)
  Using cached https://files.pythonhosted.org/packages/1e/32/69e6ce8502a12af0cdcbb8de91f1cbafd4533f84d8fe17bf712f6200f21b/setuptools-46.1.1-py3-none-any.whl
Installing collected packages: jieba3k, Pillow, cssselect, six, nltk, certifi, idna, chardet, urllib3, requests, soupsieve, beautifulsoup4, feedfinder2, PyYAML, tinysegmenter, python-dateutil, lxml, requests-file, setuptools, tldextract, feedparser, newspaper3k
Successfully installed Pillow-7.0.0 PyYAML-5.3.1 beautifulsoup4-4.8.2 certifi-2019.11.28 chardet-3.0.4 cssselect-1.1.0 feedfinder2-0.0.4 feedparser-5.2.1 idna-2.9 jieba3k-0.35.1 lxml-4.5.0 newspaper3k-0.2.8 nltk-3.4.5 python-dateutil-2.8.1 requests-2.23.0 requests-file-1.4.3 setuptools-46.1.1 six-1.14.0 soupsieve-2.0 tinysegmenter-0.3 tldextract-2.2.2 urllib3-1.25.8

You can see at the beginning of the output that I explicitly call pip3 (to avoid the pip/pip3 error).

I can reproduce the error using this small python script:

from newspaper import Article
url = 'https://www.huffpost.com/entry/weird-fathers-day-gifts-2018_n_5b05bf18e4b05f0fc84438f6'
article = Article(url)

I tried changing the URL, in case some other exception was producing a misleading error message, but to no avail.

Below, I run which python3 to show that I have Python 3, then python3 simple.py to reproduce the error.

$ which python3
/usr/bin/python3

$ python3 simple.py 
Traceback (most recent call last):
  File "simple.py", line 1, in <module>
    from newspaper import Article
  File "/home/sitting_duck/projects/lovey_scrapey3/scripts/newspaper.py", line 2, in <module>
    from newspaper import Article
ImportError: cannot import name 'Article'
sitting_duck@sitting-duck-desktop:~/projects/lovey_scrapey3/scripts$ 
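
One possible reading of the traceback above, not confirmed by anyone in this thread: the second frame points at a local file, scripts/newspaper.py, rather than at the installed package. When a script is run as `python3 simple.py`, Python puts the script's directory first on `sys.path`, so a file named newspaper.py next to the script is imported instead of the newspaper3k package. A minimal sketch for spotting such a name clash follows; `find_shadowing_files` is a hypothetical helper, not part of newspaper3k.

```python
from pathlib import Path


def find_shadowing_files(module_name, script_dir):
    """Return paths in script_dir that would shadow an installed module.

    Hypothetical helper for illustration only; not part of newspaper3k.
    """
    script_dir = Path(script_dir)
    candidates = [
        script_dir / f"{module_name}.py",  # a local module, e.g. newspaper.py
        script_dir / module_name,          # a local directory of the same name
    ]
    return [str(path) for path in candidates if path.exists()]


if __name__ == "__main__":
    # Check the current working directory for anything named like the package.
    for path in find_shadowing_files("newspaper", "."):
        print(f"possible shadowing file: {path}")
```

If this turns something up, renaming the local file (and deleting any stale `__pycache__` entry for it) would be the first thing to try.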

I have an active TeamViewer account, so perhaps we can collaborate via screen share. If you are interested, you can test and reproduce this bug on my machine, and hopefully we can come up with a fix. Something about my configuration must differ from yours, so I think a TeamViewer session would help you get to the bottom of it. You can email me at ashley.tharp@gmail.com

I work 9am to 5pm Central US time, but I will check my email and try to get back to you as soon as possible.

For now I will attempt to complete my article scraping project some other way, but I hope you do have some time to work together and some interest in collaborating, and we can figure out what is causing this error/bug.

Thanks.

eluisluzquadros commented 4 years ago

Hi, I had this problem. I installed newspaper3k from conda-forge (https://anaconda.org/conda-forge/newspaper3k), which updated some packages, and that solved it for me.