codelucas / newspaper

newspaper3k is a news, full-text, and article metadata extraction in Python 3. Advanced docs:
https://goo.gl/VX41yK
MIT License

pyinstaller executable not downloading the articles, only title #528

Closed Yuntaz closed 6 years ago

Yuntaz commented 6 years ago

I am building an executable on Linux with PyInstaller. The build reports no errors, but at runtime the executable does not download the articles: no error is raised, and each article comes back empty. Here is the setup:

sudo yum -y install epel-release
sudo yum -y install gcc libffi-devel python34-devel openssl-devel
sudo yum -y install python34-setuptools
sudo easy_install pip
pip3 install newspaper3k
pip3 install pyinstaller
pip3 install cx_freeze

Here is test.py:

import newspaper
import sys
import io

sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf8')
m1 = newspaper.build('https://www.infobae.com', memoize_articles=False, verbose=True)
print(m1.articles)
for article in m1.articles:
    article.download()
    article.parse()
    print('====================================================================================================================================')
    print(article.url)
    print(article.title)
    print(article.text)
    print('====================================================================================================================================')

If I run python3 test.py, everything works. However, if I build with pyinstaller test.py and run the resulting executable, we get this strange behavior.

Any ideas?

Thanks!

Yuntaz commented 6 years ago

Resolved! You need to bundle the whole newspaper folder, including all of its resources.
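To confirm that missing resources are the cause, it can help to list what actually ended up inside the bundle. The following is only a minimal sketch: `bundle_base` and `list_bundled_files` are hypothetical helper names, and it relies on PyInstaller's bootloader setting `sys.frozen` and `sys._MEIPASS` at runtime.

```python
import os
import sys


def bundle_base():
    """Return the directory the bundle is unpacked to, or the current
    working directory when running as a plain script."""
    # PyInstaller's bootloader sets sys.frozen and sys._MEIPASS at runtime.
    if getattr(sys, "frozen", False) and hasattr(sys, "_MEIPASS"):
        return sys._MEIPASS
    return os.getcwd()


def list_bundled_files(subdir, base=None):
    """List every file found under base/subdir, as paths relative to base."""
    base = bundle_base() if base is None else base
    found = []
    # os.walk yields nothing if the directory does not exist.
    for root, _dirs, names in os.walk(os.path.join(base, subdir)):
        for name in names:
            found.append(os.path.relpath(os.path.join(root, name), base))
    return sorted(found)
```

Calling `print(list_bundled_files('newspaper'))` at the top of test.py and comparing the output between `python3 test.py` and the frozen executable should show whether the resource files made it into the build. An empty list in the executable would point to the data files not being bundled.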

If you write a spec file for PyInstaller, be sure to add:

# include mydir in distribution
import glob
import os

def extra_datas(mydir):
    # Recursively collect every file under the glob pattern p.
    def rec_glob(p, files):
        for d in glob.glob(p):
            if os.path.isfile(d):
                files.append(d)
            rec_glob("%s/*" % d, files)
    files = []
    rec_glob("%s/*" % mydir, files)
    # Build (source file, destination folder) pairs for PyInstaller.
    extra_datas = []
    for f in files:
        extra_datas.append((f, os.path.dirname(f)))

    return extra_datas

###########################################

datas = extra_datas('newspaper')
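The same idea can be sketched with `os.walk`, which avoids the recursive glob and also works when the newspaper folder is not in the current directory (for example, when it lives in site-packages). This is an illustrative variant, not the code from the fix above; `collect_datas` and its `dest_prefix` parameter are hypothetical names.

```python
import os


def collect_datas(src_dir, dest_prefix=None):
    """Return (source_file, destination_dir) pairs for every file under src_dir.

    Destination folders are relative to the bundle root and start with
    dest_prefix (by default, the last component of src_dir).
    """
    src_dir = os.path.normpath(src_dir)
    if dest_prefix is None:
        dest_prefix = os.path.basename(src_dir)
    pairs = []
    for root, _dirs, names in os.walk(src_dir):
        rel = os.path.relpath(root, src_dir)
        # Files directly in src_dir go to dest_prefix itself.
        dest = dest_prefix if rel == os.curdir else os.path.join(dest_prefix, rel)
        for name in sorted(names):
            pairs.append((os.path.join(root, name), dest))
    return pairs
```

In a spec file this could be used as `datas = collect_datas(os.path.dirname(newspaper.__file__))` after `import newspaper`, with the result passed to `Analysis(..., datas=datas)`. That assumes newspaper is importable in the environment where PyInstaller runs.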