PaulMcInnis / JobFunnel

Scrape job websites into a single spreadsheet with no duplicates.
MIT License

Starts and then fails? #144

Closed by Pittsie72 2 years ago

Pittsie72 commented 3 years ago

I'm trying to run the two attached files in an Anaconda prompt on my Windows 10 PC. When I run it, I get this error:

funnel load -s Settings.yaml
[2021-05-12 14:24:47,379] [INFO] JobFunnel: Scraping local providers with: ['IndeedScraperUSAEng', 'MonsterScraperUSAEng']
[2021-05-12 14:24:48,647] [INFO] IndeedScraperUSAEng: Found 13 pages of search results for query=HR+Human Resource+Human Resources
[2021-05-12 14:24:51,164] [INFO] IndeedScraperUSAEng: Scraped 0 job listings from search results pages
[2021-05-12 14:24:51,165] [ERROR] JobFunnel: Failed to scrape jobs for IndeedScraperUSAEng
[2021-05-12 14:24:51,174] [INFO] MonsterScraperUSAEng: No get() or set() will be done for Job attrs: ['REMOTENESS']
[2021-05-12 14:24:51,730] [ERROR] JobFunnel: Failed to scrape jobs for MonsterScraperUSAEng
[2021-05-12 14:24:51,731] [INFO] JobFunnel: Completed all scraping, found 0 new jobs.
[2021-05-12 14:24:51,737] [WARNING] JobFunnel: No new jobs were added to CSV.

Notice that it starts and then stops. I'm on the basic branch and am not sure whether I should try a different one.

Attachments: Steel.txt, Bill.txt

thebigG commented 3 years ago

Yes, unfortunately. I'm currently working on a fix in commit b804ff5620c6b2ed08ffd52bac5b240427716a89, but I'm very busy at the moment, so no promises. Hopefully I'll get this fixed soon :).

thebigG commented 3 years ago

What's happening is that all of the job sites now load their content dynamically, so we'll have to switch to using Selenium at some point. It turns out Selenium does have a headless option, which is awesome because it means we'll be able to test it on CI. If you look at my code, you'll notice I'm passing options to the Firefox driver, and one of those options is a headless flag :).
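To illustrate the point about dynamic loading, here is a rough sketch (not the code from the commit mentioned above). It assumes a hypothetical `job-card` CSS class and made-up HTML snippets; none of this reflects the real markup of Indeed or Monster. The first helper shows why a parser sees zero listings in a page whose results are injected by JavaScript, and the second shows one way a headless Firefox session could be opened with Selenium to get the rendered page instead.

```python
from html.parser import HTMLParser


class CardCounter(HTMLParser):
    """Counts opening tags that carry a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # The class attribute may hold several space-separated names.
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class in classes:
            self.count += 1


def count_job_cards(html, css_class="job-card"):
    # "job-card" is a hypothetical class name used only for this sketch.
    counter = CardCounter(css_class)
    counter.feed(html)
    return counter.count


def fetch_rendered_html(url):
    """Load a page in headless Firefox and return the DOM as HTML."""
    # Imported lazily so the parsing helper above works without Selenium.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.add_argument("-headless")  # run Firefox without a visible window
    driver = webdriver.Firefox(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always release the browser process


# Made-up example of what a plain HTTP fetch might return:
# an empty container that a script fills in later.
STATIC_SHELL = (
    '<html><body><div id="results"></div>'
    '<script src="app.js"></script></body></html>'
)

# Made-up example of the same page after its scripts have run.
RENDERED = (
    '<html><body><div id="results">'
    '<div class="job-card">A</div>'
    '<div class="job-card featured">B</div>'
    "</div></body></html>"
)

if __name__ == "__main__":
    print(count_job_cards(STATIC_SHELL))  # no cards in the static shell
    print(count_job_cards(RENDERED))      # cards appear once rendered
```

`fetch_rendered_html` needs Firefox and a matching geckodriver available on the machine, and a real scraper would probably also need an explicit wait (for example Selenium's `WebDriverWait`) before reading `page_source`, since results may arrive after the initial page load.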

PabloJT commented 3 years ago

I have the same problem. I hope it is solved soon! :)

[2021-05-25 17:59:01,067] [INFO] JobFunnel: Scraping local providers with: ['IndeedScraperFRFre']
[2021-05-25 17:59:02,176] [ERROR] JobFunnel: Failed to scrape jobs for IndeedScraperFRFre
[2021-05-25 17:59:02,176] [INFO] JobFunnel: Completed all scraping, found 0 new jobs.
[2021-05-25 17:59:02,231] [WARNING] JobFunnel: No new jobs were added to CSV.

PaulMcInnis commented 2 years ago

Echoing @thebigG, I would also like to see this improved. I think it will take some work to get this project up and running again, but a number of people still star it every week, so the effort feels warranted.

Ideally we avoid rewriting too much, as there is already a fair bit of code.

PaulMcInnis commented 2 years ago

Added a notice to the project README directing people to the discussion.

PaulMcInnis commented 2 years ago

Noting here that the discussion is in #148.