Failure to download html + problem xblocks

openzim / openedx

Open edX (to zim) scraper

GNU General Public License v3.0


Failure to download html + problem xblocks #175

Open benoit74 opened 1 year ago

benoit74 commented 1 year ago

While testing the creation of a ZIM with https://openlearninglibrary.mit.edu/courses/course-v1:OCW+6.042J+2T2019/course/, I encountered many download errors.

It looks like no html and no problem xblocks are downloaded (might need to implement #160 for more details :)).

From what I see, in each case the scraper tries to retrieve the student_view_url, but the openlearninglibrary.mit.edu server always returns a 500 on this URL.
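
To check which xblock types are affected, something like the following could be used. This is only a minimal sketch, not the scraper's code: `http_status` and `count_failures_by_type` are hypothetical helpers, and the block dictionaries are assumed to carry `type` and `student_view_url` keys, as a course block listing would.

```python
import urllib.error
import urllib.request
from collections import Counter


def http_status(url, headers=None):
    """Return the HTTP status code for a GET on url (hypothetical helper)."""
    request = urllib.request.Request(url, headers=headers or {})
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # 4xx / 5xx responses are raised as HTTPError; keep only the code
        return exc.code


def count_failures_by_type(blocks, fetch_status=http_status):
    """Count, per xblock type, the student_view_url requests not answered with 200.

    Assumption: each block is a dict with "type" and "student_view_url" keys.
    """
    failures = Counter()
    for block in blocks:
        if fetch_status(block["student_view_url"]) != 200:
            failures[block["type"]] += 1
    return failures
```

Passing `fetch_status` as a parameter keeps the counting logic testable without hitting the server; a real run would also need the session cookies of an enrolled user in `headers`.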

benoit74 commented 1 year ago

Same issue with another platform (FUN) and another course: https://lms.fun-mooc.fr/courses/course-v1:MinesTelecom+04012+session08/courseware/ba47011e0c9841da83b7fe4836cc6f40/

Looks like the scraper is broken.

benoit74 commented 1 year ago

For FUN, it looks like removing the update_csrf_token_in_headers here solves the issue, but this does not work for MIT. So this is definitely linked to the way we try to mimic the UI behavior via headers / cookies to retrieve content, which has changed from one Open edX release to the next.
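
To make the experiment above concrete, here is a minimal sketch of toggling the CSRF header. It is not the scraper's actual `update_csrf_token_in_headers`: `build_headers` is a hypothetical helper, and the `csrftoken` cookie and `X-CSRFToken` header names are Django's defaults, which a given instance may override.

```python
def build_headers(cookies, referer, send_csrf=True):
    """Build request headers from session cookies (hypothetical helper).

    Assumptions: cookies is a plain dict of cookie name -> value, and the
    instance uses Django's default CSRF cookie and header names.
    """
    headers = {"Referer": referer}
    token = cookies.get("csrftoken")
    if send_csrf and token:
        # Mirror the CSRF cookie into the header, as the web UI does
        headers["X-CSRFToken"] = token
    return headers
```

Calling it with `send_csrf=False` reproduces the "header removed" case, so the same request can be tried both ways against each platform.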

rgaudin commented 1 year ago

I don't get whether this is a general problem (and we can't run any openedx recipe) or a discrete problem that happens with both FUN and MIT (and maybe others).

benoit74 commented 1 year ago

From my perspective, it mostly looks like a general problem. Most probably linked to some change in openedx, so maybe some old instances might still work.

I can't tell for sure how wide the impact is for us, since there are no more openedx recipes active on the zimfarm, now that we have deactivated the PHZH recipes.

Do you have some openedx course you would like to test?

joe-rabbit commented 10 months ago

Hello, does the user need to be logged in and enrolled for the content to be downloaded?

benoit74 commented 10 months ago

AFAIK (I'm not an expert on this scraper), the user should be enrolled in the course you want to download, yes.