An offliner to create ZIM :package: files from openedx powered courses
:zap: The scraper is known to have a very significant issue on recent openedx versions. We are looking for resources / support to work on this issue (https://github.com/openzim/openedx/issues/175).
Openedx is one of the most popular open source MOOC platforms and is built around the idea of xblocks. It makes e-learning more accessible by giving teachers, universities and others an easy way to create courses. Many e-learning services, such as edX, use it to create, organize and manage MOOCs.
This project aims to make openedx based MOOCs more accessible by creating ZIM files that provide the same course materials and resources offline.
Make sure that you have python3, unzip, ffmpeg, wget, jpegoptim, gifsicle, pngquant, advdef, and curl installed on your system before running the scraper (otherwise you'll get a warning to install them).
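If you want to check these prerequisites yourself before a long run, a minimal sketch along the following lines may help. It is not part of openedx2zim; the function name `find_missing` and the list `REQUIRED_BINARIES` are made up for this illustration, and it only checks that each tool is on your PATH, not which version is installed.

```python
import shutil

# Tools named in the prerequisites above.
# This list is an illustration, not something read from the scraper itself.
REQUIRED_BINARIES = [
    "python3", "unzip", "ffmpeg", "wget", "jpegoptim",
    "gifsicle", "pngquant", "advdef", "curl",
]


def find_missing(binaries):
    """Return the names in `binaries` that cannot be found on PATH."""
    return [name for name in binaries if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_BINARIES)
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

Running it prints either the tools that still need to be installed or a confirmation that everything was found.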
You must be enrolled in the MOOC you want to take offline. Do not open the openedx instance in a browser with the same account while the scraper runs. Also, this scraper must only be used with MOOCs that have a free license.
One can easily install the PyPI version, but let's set up the source version. First, clone this repository and install the package as given below.
pip3 install -r requirements.txt
python3 setup.py install
That's it. You can now run openedx2zim from your terminal:
openedx2zim --course-url [URL] --email [EMAIL] --name [NAME]
For the full list of arguments, see this file or run the following:
openedx2zim --help
Example usage
openedx2zim --course-url="https://openlearninglibrary.mit.edu/courses/course-v1:OCW+6.042J+2T2019/course/" --publisher="Massachusetts Institute of Technology" --email="example@example.com" --name="sample" --tmp-dir="output" --output="output" --debug --keep --format="mp4"
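Course URLs often contain characters such as `:` and `+`, so when launching the scraper from a script it can be convenient to pass the arguments as a list instead of assembling a shell string. The sketch below is only an illustration: `build_command` is a hypothetical helper, it uses only the `--course-url`, `--email` and `--name` flags shown above, and it assumes `openedx2zim` is installed and on your PATH.

```python
import shlex
import subprocess


def build_command(course_url, email, name):
    """Build the argument list for a basic openedx2zim run.

    Hypothetical helper; only flags that appear in this README are used.
    """
    return [
        "openedx2zim",
        "--course-url", course_url,
        "--email", email,
        "--name", name,
    ]


if __name__ == "__main__":
    cmd = build_command(
        "https://openlearninglibrary.mit.edu/courses/course-v1:OCW+6.042J+2T2019/course/",
        "example@example.com",
        "sample",
    )
    # Show the equivalent shell command for inspection.
    print(shlex.join(cmd))
    # Uncomment to actually start the scraper (requires openedx2zim on PATH):
    # subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting issues, and `shlex.join` (Python 3.8+) lets you inspect the command before running it.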
This project can also be run with docker, using the provided Dockerfile. See steps here.
You can create ZIMs for MOOCs powered by the openedx platform (find a list of openedx powered instances here), choose between different video formats (webm/mp4), different compression rates, and even use an S3 based cache.
Answers can only be extracted for "multiple choice question" type problems, either with a single correct answer or with multiple correct answers (the latter only if the problem has at most 5 options). This is because extracting answers for other types of problems would require a large number of requests. For more information, refer here.
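To get a feel for why the number of options matters, the sketch below counts the possible non-empty answer sets for a multiple-answer problem. It is an assumption made purely for illustration that each candidate set would cost one request; this text does not describe how the scraper actually extracts answers, and `candidate_answers` and `count_candidates` are hypothetical names.

```python
from itertools import combinations


def candidate_answers(options):
    """Yield every non-empty subset of `options` as a tuple."""
    for size in range(1, len(options) + 1):
        yield from combinations(options, size)


def count_candidates(n_options):
    """Number of non-empty subsets of n_options items: 2**n - 1."""
    return 2 ** n_options - 1


if __name__ == "__main__":
    for n in (2, 5, 8):
        print(n, "options:", count_candidates(n), "possible answer sets")
```

The count doubles (plus one) with every extra option, so a cap on the number of options keeps any such search small.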