
sortedcord / udemyscraper

A Udemy Course Scraper built with bs4 and selenium, that fetches udemy course information. Get udemy course information and convert it to json, csv or xml file, without authentication!

https://pypi.org/project/udemyscraper/

GNU General Public License v3.0

22 stars 11 forks

udemyscraper timesout #55

Closed nuggetsnetwork closed 2 years ago

nuggetsnetwork commented 2 years ago

Describe the bug
When I run the sample code, all I get is a timeout.

To Reproduce
Steps to reproduce the behavior: run the sample code

    from udemyscraper import UdemyCourse

    course = UdemyCourse()
    course.fetch_course('learn javascript')
    print(course.title)

Current behavior
Timed out waiting for page to load or could not find a matching course

OS: macOS

sortedcord commented 2 years ago

Duplicate issue. Please check out the pinned issue, #46.

nuggetsnetwork commented 2 years ago

I think the problem is that the page's "view source" doesn't show the classes that the code looks for, so the page source and the DOM are completely different. Only the DOM shows those classes and the search-results HTML; the actual page source of the search results doesn't contain any of the HTML that the page renders. Is that what's called Scrape Shield? If so, does that mean we won't be able to use this package?
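The mismatch described above can be checked offline with a small sketch. It uses only Python's standard `html.parser`. The two HTML strings and the class names `course-card` and `course-title` are made up for illustration; they are not Udemy's real markup or the selectors udemyscraper actually uses.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collects every CSS class name that appears in an HTML document."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def classes_in(html):
    """Return the set of class names found in the given HTML string."""
    collector = ClassCollector()
    collector.feed(html)
    return collector.classes


# Hypothetical raw page source: an empty shell that JavaScript fills in later.
raw_source = (
    '<html><body><div id="root"></div>'
    '<script src="app.js"></script></body></html>'
)

# Hypothetical rendered DOM, as the browser's inspector would show it.
rendered_dom = (
    '<html><body><div id="root">'
    '<div class="course-card"><h3 class="course-title">Learn JavaScript</h3></div>'
    '</div></body></html>'
)

wanted = "course-title"  # hypothetical class a scraper might look for
print(wanted in classes_in(raw_source))    # False
print(wanted in classes_in(rendered_dom))  # True
```

If a class is present only in the rendered DOM, a scraper that waits for it in the served page will keep waiting until it times out, which would be consistent with the error message reported above.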

sortedcord commented 2 years ago

More or less, yes. I have it in the README as well that I am not working on the project anymore. If you want to fetch course info, you can use Udemy's API. It's way more straightforward, and you can get your API credentials relatively easily. I started implementing support for Udemy's API but abandoned it midway, mainly because of a lack of motivation. It's still there, so if you want to pick it up, you're always welcome.
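For anyone taking the API route, here is a minimal sketch of what a course search request might look like. It only builds the request and does not send it. The endpoint URL, the `search` and `page_size` parameters, and the use of HTTP Basic auth with a client ID and secret are assumptions based on Udemy's Affiliate API and should be checked against the current documentation; `build_search_request` is a hypothetical helper and not part of udemyscraper.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint of Udemy's Affiliate API; verify against the current docs.
API_URL = "https://www.udemy.com/api-2.0/courses/"


def build_search_request(query, client_id, client_secret):
    """Build (but do not send) a course search request with Basic auth."""
    # HTTP Basic auth: base64 of "client_id:client_secret".
    token = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    # 'search' and 'page_size' are assumed parameter names.
    url = f"{API_URL}?{urlencode({'search': query, 'page_size': 1})}"
    headers = {
        "Authorization": f"Basic {token}",
        "Accept": "application/json",
    }
    return Request(url, headers=headers)


# Placeholder credentials; replace with your own.
request = build_search_request(
    "learn javascript", "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET"
)
print(request.full_url)
```

To actually send it, pass the request to `urllib.request.urlopen` and parse the body with `json.load`, then inspect the returned JSON for the course fields you need.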