NightMachinery / coursera-dl

Download courses from Coursera
GNU Lesser General Public License v3.0

[SOLVED] 403 Error persists #2

Closed zenny closed 3 years ago

zenny commented 3 years ago

Subject of the issue

Getting 403 issues after the upstream change as discussed in https://github.com/coursera-dl/coursera-dl/issues/800

Your environment

Steps to reproduce

  1. Pull this repo.
  2. Create a virtualenv.
  3. Install the requirements.
  4. Run the script: ./coursera-dl --resume --path=../courses research-inquiry-discovery

Expected behaviour

Should download the course

Actual behaviour

$ ./coursera-dl --resume --path=../courses research-inquiry-discovery
/home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/coursera/api.py:948: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if extension is '':
/home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/coursera/api.py:1593: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if extension is '':
coursera_dl version 0.11.5
Downloading class: research-inquiry-discovery (1 / 1)
Parsing syllabus of on-demand course (id=IAGMLXkjEequHgrfnjtqcQ). This may take some time, please be patient ...
Error 403 Client Error: Forbidden for url: https://api.coursera.org/api/memberships.v1?incudes=courseId,courses.v1&q=me&showHidden=true&filter=current,preEnrolled getting page https://api.coursera.org/api/memberships.v1?includes=courseId,courses.v1&q=me&showHidden=true&filter=current,preEnrolled
The server replied: {"errorCode":"Not Authorized","message":null,"details":null}
zenny commented 3 years ago

UPDATE: I logged out, logged in again, and put the new cookies in coursera-dl.conf. Now I get a different error:

$ ./coursera-dl --resume --path=../courses research-inquiry-discovery
coursera_dl version 0.11.5
Downloading class: research-inquiry-discovery (1 / 1)
Parsing syllabus of on-demand course (id=IAGMLXkjEequHgrfnjtqcQ). This may take some time, please be patient ...
Processing module  the-process-of-inquiry
Processing section     welcome-and-additional-resources
Processing lecture         welcome-to-the-course (lecture)
Processing lecture         course-syllabus (supplement)
Traceback (most recent call last):
  File "home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/./coursera-dl", line 6, in <module>
    coursera_dl.main()
  File "home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/coursera/coursera_dl.py", line 249, in main
    error_occurred, completed = download_class(
  File "home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/coursera/coursera_dl.py", line 214, in download_class
    return download_on_demand_class(session, args, class_name)
  File "home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/coursera/coursera_dl.py", line 134, in download_on_demand_class
    error_occurred, modules = extractor.get_modules(
  File "home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/coursera/extractors.py", line 53, in get_modules
    error_occurred, modules = self._parse_on_demand_syllabus(
  File "home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/coursera/extractors.py", line 161, in _parse_on_demand_syllabus
    links = course.extract_links_from_supplement(
  File "home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/coursera/api.py", line 1268, in extract_links_from_supplement
    supplement_content, self._extract_links_from_text(value))
  File "home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/coursera/api.py", line 1522, in _extract_links_from_text
    self._extract_links_from_asset_tags_in_text(text))
  File "home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/coursera/api.py", line 1550, in _extract_links_from_asset_tags_in_text
    title = clean_filename(
  File "home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/coursera/utils.py", line 118, in clean_filename
    s = h.unescape(s)
AttributeError: 'HTMLParser' object has no attribute 'unescape'

Thanks.

zenny commented 3 years ago

'HTMLParser' object has no attribute 'unescape'

Applying this patch (https://github.com/coursera-dl/edx-dl/commit/5490a99a98b56f544661c131229ef640ace2b064) to utils.py worked. Thanks!
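
For anyone hitting the same AttributeError: HTMLParser.unescape was removed in Python 3.9, and html.unescape is the standard-library replacement. Below is a minimal sketch of that kind of fix. It is only assumed to be similar in spirit to the linked patch; the helper name unescape_html is hypothetical and is not the actual code in utils.py.

```python
try:
    # Python 3.4+: the standard-library replacement for HTMLParser.unescape
    from html import unescape
except ImportError:
    # Python 2 fallback, where HTMLParser().unescape still exists
    from HTMLParser import HTMLParser
    unescape = HTMLParser().unescape


def unescape_html(s):
    """Hypothetical helper: decode HTML entities such as &amp; in a title."""
    return unescape(s)


print(unescape_html("Research &amp; Inquiry"))
```

With an import like this, a call such as h.unescape(s) in clean_filename would become unescape(s).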

zenny commented 3 years ago

But it only works for a single course download; when downloading in batch, the 403 error persists:

Error 403 Client Error: Forbidden for url: https://api.coursera.org/api/onDemandCourseMaterials.v2/?q=slug&slug=scholoarly-communication&includes=modules%2Clessons%2CpassableItemGroups%2CpassableItemGroupChoices%2CpassableLessonElements%2Citems%2Ctracks%2CgradePolicy&&fields=moduleIds%2ConDemandCourseMaterialModules.v1(name%2Cslug%2Cdescription%2CtimeCommitment%2ClessonIds%2Coptional%2ClearningObjectives)%2ConDemandCourseMaterialLessons.v1(name%2Cslug%2CtimeCommitment%2CelementIds%2Coptional%2CtrackId)%2ConDemandCourseMaterialPassableItemGroups.v1(requiredPassedCount%2CpassableItemGroupChoiceIds%2CtrackId)%2ConDemandCourseMaterialPassableItemGroupChoices.v1(name%2Cdescription%2CitemIds)%2ConDemandCourseMaterialPassableLessonElements.v1(gradingWeight%2CisRequiredForPassing)%2ConDemandCourseMaterialItems.v2(name%2Cslug%2CtimeCommitment%2CcontentSummary%2CisLocked%2ClockableByItem%2CitemLockedReasonCode%2CtrackId%2ClockedStatus%2CitemLockSummary)%2ConDemandCourseMaterialTracks.v1(passablesCount)&showLockedItems=true getting page 
https://api.coursera.org/api/onDemandCourseMaterials.v2/?q=slug&slug=scholoarly-communication&includes=modules%2Clessons%2CpassableItemGroups%2CpassableItemGroupChoices%2CpassableLessonElements%2Citems%2Ctracks%2CgradePolicy&&fields=moduleIds%2ConDemandCourseMaterialModules.v1(name%2Cslug%2Cdescription%2CtimeCommitment%2ClessonIds%2Coptional%2ClearningObjectives)%2ConDemandCourseMaterialLessons.v1(name%2Cslug%2CtimeCommitment%2CelementIds%2Coptional%2CtrackId)%2ConDemandCourseMaterialPassableItemGroups.v1(requiredPassedCount%2CpassableItemGroupChoiceIds%2CtrackId)%2ConDemandCourseMaterialPassableItemGroupChoices.v1(name%2Cdescription%2CitemIds)%2ConDemandCourseMaterialPassableLessonElements.v1(gradingWeight%2CisRequiredForPassing)%2ConDemandCourseMaterialItems.v2(name%2Cslug%2CtimeCommitment%2CcontentSummary%2CisLocked%2ClockableByItem%2CitemLockedReasonCode%2CtrackId%2ClockedStatus%2CitemLockSummary)%2ConDemandCourseMaterialTracks.v1(passablesCount)&showLockedItems=true
The server replied: {"errorCode":"Not Authorized","message":null,"details":null}
HTTPError 403 Client Error: Forbidden for url: https://api.coursera.org/api/onDemandCourseMaterials.v2/?q=slug&slug=scholoarly-communication&includes=modules%2Clessons%2CpassableItemGroups%2CpassableItemGroupChoices%2CpassableLessonElements%2Citems%2Ctracks%2CgradePolicy&&fields=moduleIds%2ConDemandCourseMaterialModules.v1(name%2Cslug%2Cdescription%2CtimeCommitment%2ClessonIds%2Coptional%2ClearningObjectives)%2ConDemandCourseMaterialLessons.v1(name%2Cslug%2CtimeCommitment%2CelementIds%2Coptional%2CtrackId)%2ConDemandCourseMaterialPassableItemGroups.v1(requiredPassedCount%2CpassableItemGroupChoiceIds%2CtrackId)%2ConDemandCourseMaterialPassableItemGroupChoices.v1(name%2Cdescription%2CitemIds)%2ConDemandCourseMaterialPassableLessonElements.v1(gradingWeight%2CisRequiredForPassing)%2ConDemandCourseMaterialItems.v2(name%2Cslug%2CtimeCommitment%2CcontentSummary%2CisLocked%2ClockableByItem%2CitemLockedReasonCode%2CtrackId%2ClockedStatus%2CitemLockSummary)%2ConDemandCourseMaterialTracks.v1(passablesCount)&showLockedItems=true
Sleeping for 60 seconds before downloading next course. You can change this with --download-delay option.

Solved after increasing --download-delay to 120 seconds.
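
A rough sketch of how such a batch run could be assembled, using only the flags that appear in this thread. The function name build_command, the `--download-delay=120` spelling with an equals sign, and passing several course slugs in one call are assumptions for illustration, not something confirmed by the tool's documentation here.

```python
import subprocess


def build_command(course_slugs, delay_seconds=120, path="../courses"):
    """Build the argv list for one batch run (hypothetical helper)."""
    return [
        "./coursera-dl",
        "--resume",
        "--path={}".format(path),
        # Delay between courses; 120 is the value that worked in this thread
        "--download-delay={}".format(delay_seconds),
    ] + list(course_slugs)


if __name__ == "__main__":
    cmd = build_command(["research-inquiry-discovery", "scholoarly-communication"])
    print(" ".join(cmd))
    # Uncomment to actually run it from the repository root:
    # subprocess.run(cmd, check=True)
```

If 120 seconds is still not enough, the same helper can be called with a larger delay_seconds value.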