CredentialEngine / ai-course-crawler

Apache License 2.0

Successful extraction, no courses (CourseDog) #50

Closed rvilsack closed 2 months ago

rvilsack commented 2 months ago

AI Course Crawler Extract link: https://master.ai-course-crawler.development.c66.me/extractions/33

The URL used for this configuration appeared to be valid: https://catalog.stcloudstate.edu/courses

And the configuration correctly identified the number of pages I expected:

COURSE_LINKS_PAGE (has pagination) > COURSE_DETAIL_PAGE (no pagination)

{
  "pageType": "COURSE_LINKS_PAGE",
  "linkRegexp": "\/courses\/\d+",
  "pagination": {
    "urlPatternType": "page_num",
    "urlPattern": "https://catalog.stcloudstate.edu/courses?page={page_num}",
    "totalPages": 269
  },
  "links": {
    "pageType": "COURSE_DETAIL_PAGE"
  }
}
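For readers unfamiliar with this configuration, here is a minimal sketch of how it could be interpreted. It is not the crawler's actual code: the helper names `page_urls` and `extract_links` are hypothetical, the sample HTML is invented, and starting the page numbering at 1 is an assumption.

```python
import re

# The configuration from this issue, written as a Python dict.
config = {
    "pageType": "COURSE_LINKS_PAGE",
    "linkRegexp": r"\/courses\/\d+",
    "pagination": {
        "urlPatternType": "page_num",
        "urlPattern": "https://catalog.stcloudstate.edu/courses?page={page_num}",
        "totalPages": 269,
    },
    "links": {"pageType": "COURSE_DETAIL_PAGE"},
}


def page_urls(pagination, first_page=1):
    """Expand the {page_num} placeholder into one URL per page.

    Assumption: pages are numbered from `first_page` (1 by default).
    """
    pattern = pagination["urlPattern"]
    last_page = first_page + pagination["totalPages"]
    return [pattern.replace("{page_num}", str(n)) for n in range(first_page, last_page)]


def extract_links(html, link_regexp):
    """Return the unique matches of link_regexp in html, in first-seen order."""
    unique = []
    for match in re.findall(link_regexp, html):
        if match not in unique:
            unique.append(match)
    return unique


urls = page_urls(config["pagination"])

# Invented markup, used only to show what the regular expression matches.
sample_html = (
    '<a href="/courses/101">A</a>'
    '<a href="/courses/2047">B</a>'
    '<a href="/courses/101">duplicate</a>'
    '<a href="/programs/5">P</a>'
)
links = extract_links(sample_html, config["linkRegexp"])
```

Under these assumptions the configuration yields 269 listing URLs, and only paths of the form `/courses/<digits>` are picked up as course detail links.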

rsaksida commented 2 months ago

It's the same problem as #46. CourseDog is currently not supported because its course data is loaded via API requests. We can add special support for CourseDog if that's important for the project. I created #51 to centralize the CourseDog discussion.
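To illustrate why an extraction can succeed yet return no courses, here is a small sketch. It assumes the link pattern is matched against the HTML as fetched, before any API-loaded data is rendered. Both HTML strings are invented for illustration and were not captured from the real site, and `count_course_links` is a hypothetical helper.

```python
import re

LINK_REGEXP = r"\/courses\/\d+"

# Invented example of a page whose content is filled in later by API requests:
# the initial HTML contains only an empty mount point and a script tag.
STATIC_SHELL = (
    '<html><body><div id="app"></div>'
    '<script src="/bundle.js"></script></body></html>'
)

# Invented example of the same page after the course data has been rendered.
RENDERED = (
    '<html><body><div id="app">'
    '<a href="/courses/101">Course A</a>'
    '<a href="/courses/102">Course B</a>'
    '<a href="/courses/101">Course A again</a>'
    "</div></body></html>"
)


def count_course_links(html):
    """Count the distinct course links that the pattern finds in html."""
    return len(set(re.findall(LINK_REGEXP, html)))


shell_count = count_course_links(STATIC_SHELL)
rendered_count = count_course_links(RENDERED)
```

In this sketch the static shell yields no matches even though every page request succeeds, which would be consistent with a run that completes without finding any courses.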