Closed zenny closed 3 years ago
UPDATE: I logged out and logged in again, with new cookies in coursera-dl.conf, and now get a different error.
$ ./coursera-dl --resume --path=../courses research-inquiry-discovery
coursera_dl version 0.11.5
Downloading class: research-inquiry-discovery (1 / 1)
Parsing syllabus of on-demand course (id=IAGMLXkjEequHgrfnjtqcQ). This may take some time, please be patient ...
Processing module the-process-of-inquiry
Processing section welcome-and-additional-resources
Processing lecture welcome-to-the-course (lecture)
Processing lecture course-syllabus (supplement)
Traceback (most recent call last):
File "home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/./coursera-dl", line 6, in <module>
coursera_dl.main()
File "home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/coursera/coursera_dl.py", line 249, in main
error_occurred, completed = download_class(
File "home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/coursera/coursera_dl.py", line 214, in download_class
return download_on_demand_class(session, args, class_name)
File "home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/coursera/coursera_dl.py", line 134, in download_on_demand_class
error_occurred, modules = extractor.get_modules(
File "home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/coursera/extractors.py", line 53, in get_modules
error_occurred, modules = self._parse_on_demand_syllabus(
File "home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/coursera/extractors.py", line 161, in _parse_on_demand_syllabus
links = course.extract_links_from_supplement(
File "home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/coursera/api.py", line 1268, in extract_links_from_supplement
supplement_content, self._extract_links_from_text(value))
File "home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/coursera/api.py", line 1522, in _extract_links_from_text
self._extract_links_from_asset_tags_in_text(text))
File "home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/coursera/api.py", line 1550, in _extract_links_from_asset_tags_in_text
title = clean_filename(
File "home/zenny/Downloads/Education/coursera/my-coursera/NightMachinary-coursera-dl/coursera/utils.py", line 118, in clean_filename
s = h.unescape(s)
AttributeError: 'HTMLParser' object has no attribute 'unescape'
Thanks.
'HTMLParser' object has no attribute 'unescape'
Applying this patch (https://github.com/coursera-dl/edx-dl/commit/5490a99a98b56f544661c131229ef640ace2b064) to utils.py worked. Thanks!
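For anyone hitting the same AttributeError: `HTMLParser.unescape()` was removed in Python 3.9, and `html.unescape()` from the standard library is its replacement. Below is a minimal sketch of that kind of change. The helper name `unescape_entities` is hypothetical, and this is not necessarily the exact content of the linked commit.

```python
import html


def unescape_entities(s: str) -> str:
    """Decode HTML entities (e.g. &amp;) without HTMLParser.unescape()."""
    # html.unescape() has been in the standard library since Python 3.4,
    # so it works on interpreters where HTMLParser.unescape() is gone.
    return html.unescape(s)


print(unescape_entities("Research &amp; Inquiry"))
```

In utils.py this would correspond to replacing the `h.unescape(s)` call shown in the traceback with `html.unescape(s)`.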
But it only works for a single course download; in batch mode the 403 error persists:
Error 403 Client Error: Forbidden for url: https://api.coursera.org/api/onDemandCourseMaterials.v2/?q=slug&slug=scholoarly-communication&includes=modules%2Clessons%2CpassableItemGroups%2CpassableItemGroupChoices%2CpassableLessonElements%2Citems%2Ctracks%2CgradePolicy&&fields=moduleIds%2ConDemandCourseMaterialModules.v1(name%2Cslug%2Cdescription%2CtimeCommitment%2ClessonIds%2Coptional%2ClearningObjectives)%2ConDemandCourseMaterialLessons.v1(name%2Cslug%2CtimeCommitment%2CelementIds%2Coptional%2CtrackId)%2ConDemandCourseMaterialPassableItemGroups.v1(requiredPassedCount%2CpassableItemGroupChoiceIds%2CtrackId)%2ConDemandCourseMaterialPassableItemGroupChoices.v1(name%2Cdescription%2CitemIds)%2ConDemandCourseMaterialPassableLessonElements.v1(gradingWeight%2CisRequiredForPassing)%2ConDemandCourseMaterialItems.v2(name%2Cslug%2CtimeCommitment%2CcontentSummary%2CisLocked%2ClockableByItem%2CitemLockedReasonCode%2CtrackId%2ClockedStatus%2CitemLockSummary)%2ConDemandCourseMaterialTracks.v1(passablesCount)&showLockedItems=true getting page 
https://api.coursera.org/api/onDemandCourseMaterials.v2/?q=slug&slug=scholoarly-communication&includes=modules%2Clessons%2CpassableItemGroups%2CpassableItemGroupChoices%2CpassableLessonElements%2Citems%2Ctracks%2CgradePolicy&&fields=moduleIds%2ConDemandCourseMaterialModules.v1(name%2Cslug%2Cdescription%2CtimeCommitment%2ClessonIds%2Coptional%2ClearningObjectives)%2ConDemandCourseMaterialLessons.v1(name%2Cslug%2CtimeCommitment%2CelementIds%2Coptional%2CtrackId)%2ConDemandCourseMaterialPassableItemGroups.v1(requiredPassedCount%2CpassableItemGroupChoiceIds%2CtrackId)%2ConDemandCourseMaterialPassableItemGroupChoices.v1(name%2Cdescription%2CitemIds)%2ConDemandCourseMaterialPassableLessonElements.v1(gradingWeight%2CisRequiredForPassing)%2ConDemandCourseMaterialItems.v2(name%2Cslug%2CtimeCommitment%2CcontentSummary%2CisLocked%2ClockableByItem%2CitemLockedReasonCode%2CtrackId%2ClockedStatus%2CitemLockSummary)%2ConDemandCourseMaterialTracks.v1(passablesCount)&showLockedItems=true
The server replied: {"errorCode":"Not Authorized","message":null,"details":null}
HTTPError 403 Client Error: Forbidden for url: https://api.coursera.org/api/onDemandCourseMaterials.v2/?q=slug&slug=scholoarly-communication&includes=modules%2Clessons%2CpassableItemGroups%2CpassableItemGroupChoices%2CpassableLessonElements%2Citems%2Ctracks%2CgradePolicy&&fields=moduleIds%2ConDemandCourseMaterialModules.v1(name%2Cslug%2Cdescription%2CtimeCommitment%2ClessonIds%2Coptional%2ClearningObjectives)%2ConDemandCourseMaterialLessons.v1(name%2Cslug%2CtimeCommitment%2CelementIds%2Coptional%2CtrackId)%2ConDemandCourseMaterialPassableItemGroups.v1(requiredPassedCount%2CpassableItemGroupChoiceIds%2CtrackId)%2ConDemandCourseMaterialPassableItemGroupChoices.v1(name%2Cdescription%2CitemIds)%2ConDemandCourseMaterialPassableLessonElements.v1(gradingWeight%2CisRequiredForPassing)%2ConDemandCourseMaterialItems.v2(name%2Cslug%2CtimeCommitment%2CcontentSummary%2CisLocked%2ClockableByItem%2CitemLockedReasonCode%2CtrackId%2ClockedStatus%2CitemLockSummary)%2ConDemandCourseMaterialTracks.v1(passablesCount)&showLockedItems=true
Sleeping for 60 seconds before downloading next course. You can change this with --download-delay option.
Solved after increasing --download-delay to 120s.
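As a rough illustration of the batch invocation with the longer delay, here is a small sketch that only assembles and prints the command line. The helper `build_command` is hypothetical; the flags and course slugs are the ones that appear in this thread, and passing the delay as `--download-delay=120` is an assumption about the option syntax.

```python
import shlex


def build_command(courses, delay_seconds=120, path="../courses"):
    """Assemble a coursera-dl batch command with a delay between courses."""
    return [
        "./coursera-dl",
        "--resume",
        f"--path={path}",
        # Assumed syntax: the log above only names the --download-delay option.
        f"--download-delay={delay_seconds}",
        *courses,
    ]


cmd = build_command(["research-inquiry-discovery", "scholoarly-communication"])
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run` from the directory containing `coursera-dl`.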
Subject of the issue
Getting 403 issues after the upstream change as discussed in https://github.com/coursera-dl/coursera-dl/issues/800
Your environment
Steps to reproduce
./coursera-dl --resume --path=../courses research-inquiry-discovery
Expected behaviour
Should download the course
Actual behaviour