Open EllieLockhart opened 4 years ago
Update on this: I tried two example scripts by other people that do the same thing with different news sources, and they also hang at news_pool.join(). I'm trying to rebuild my entire local Python environment in case newspaper3k needs an older version of Python, but I'm not hopeful since cloud compilers failed too. The traceback was essentially the same with other people's scripts.
I'm working on a project to extract articles from gaming media sites. In a basic test run on two sites, VSCode's debugger shows the script consistently hanging right after I set up the multi-threaded extraction (changing the number of threads does not help). I'm honestly not sure what I'm doing wrong here; I followed the examples that have been laid out. One of the sites, Gamespot, is even used in someone's tutorial, and removing the other (Polygon) doesn't seem to help. I've created a virtual environment and tried this with both Python 3.8 and 3.7. All dependencies appear to be satisfied; I also tested it on repl.it and got the same hang.
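For anyone trying to reproduce this, here is a minimal sketch of one way to confirm that a blocking call such as news_pool.join() really never returns, rather than just being slow. It uses only the standard library. The helper name `join_with_timeout` and the sleep-based stand-ins are hypothetical and not part of newspaper3k. It only detects the hang; it does not explain or fix it.

```python
import threading
import time


def join_with_timeout(blocking_call, timeout_s):
    """Run blocking_call in a daemon thread and wait at most timeout_s seconds.

    Returns True if the call finished in time, False if it is still running.
    (Hypothetical helper for diagnosis only; not part of newspaper3k.)
    """
    worker = threading.Thread(target=blocking_call, daemon=True)
    worker.start()
    worker.join(timeout_s)
    return not worker.is_alive()


if __name__ == "__main__":
    # Stand-ins for the real blocking call: one returns quickly, one does not.
    quick = join_with_timeout(lambda: time.sleep(0.05), timeout_s=1.0)
    stuck = join_with_timeout(lambda: time.sleep(5), timeout_s=0.2)
    print("quick call finished:", quick)
    print("slow call finished:", stuck)
```

In a real script the lambda would be replaced by the call that hangs. Because the worker is a daemon thread, the interpreter can still exit even if the call never returns.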
I would love to hear I'm just doing something wrong so I can fix it; I really want to do some data science on these specific websites and their articles! But it seems as if, at least for an OS X user, there's some sort of bug with multithreading. Here's my code:
and here's what I get back when I finally give up and hit the interrupt at console: