Kolibri is an open source educational platform for distributing content to areas with little or no internet connectivity. Educational content is created and edited on Kolibri Studio, a platform for organizing content so that it can be imported by the Kolibri applications. The purpose of this project is to create a chef: a program that scrapes a content source and puts it into a format that can be imported into Kolibri Studio.
The Universal Library Project, sometimes called the Million Books Project, was pioneered by Jaime Carbonell, Raj Reddy, Michael Shamos, Gloriana St Clair, and Robert Thibadeau of Carnegie Mellon University. The governments of India, China, and Egypt are helping fund this effort through scanning facilities and personnel. The Internet Archive has contributed 100k books from the Kansas City Public Library, along with servers, to India. The Indian government scanned the appropriate books, and the Internet Archive has performed automated conversion of these scans into this collection.
This project was initialized from a template: https://github.com/learningequality/cookiecutter-chef/
Install Python 3 if you don't have it already.
Install pip if you don't have it already.
Create and activate a pipenv Python virtual environment for this project (https://docs.pipenv.org).
Run pip install -r requirements.txt to install the required Python libraries.
TODO: Explain how to run the 'Internet Archive - Universal Library' chef
export SOMEVAR=someval
./script.py -v --option2 --kwoard="val"
A sushi chef script is responsible for importing content into Kolibri Studio from the Internet Archive Universal Library. The Rice Cooker library provides all the necessary methods for uploading the channel content to Kolibri Studio, as well as helper functions and utilities.
A sushi chef script has been started for you in sushichef.py.
Sushi chef docs can be found here.
_For more sushi chef examples, see examples/openstax_sushichef.py (JSON) and examples/wikipedia_sushichef.py (HTML), and also the examples/ dir inside the ricecooker repo._
Please make sure your final chef matches the following standards.
Are the source_ids determined consistently (based on foreign database identifiers or permanent URL paths)?
[...] if or for loops?
[...] (path vs p)?
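For the source_id item in the checklist above, one possible approach is to derive each source_id from the permanent path of the item's URL, so that re-running the chef always yields the same identifier for the same item. The following is a minimal sketch, not part of the template or of ricecooker: the function name source_id_from_url is hypothetical, the identifier exampleitem00 is made up, and the sketch assumes item pages follow the https://archive.org/details/<identifier> pattern.

```python
from urllib.parse import urlparse


def source_id_from_url(url: str) -> str:
    """Derive a stable source_id from an item page URL.

    Hypothetical helper: assumes URLs shaped like
    https://archive.org/details/<identifier>. The scheme, query string,
    fragment, and trailing slashes are ignored, so different links to
    the same item map to the same source_id.
    """
    # Keep only the non-empty path segments, e.g. ["details", "<identifier>"].
    segments = [s for s in urlparse(url).path.split("/") if s]
    if len(segments) < 2 or segments[0] != "details":
        raise ValueError(f"not an item URL: {url!r}")
    return segments[1]


# Two different links to the same (made-up) item give the same source_id.
a = source_id_from_url("https://archive.org/details/exampleitem00")
b = source_id_from_url("http://archive.org/details/exampleitem00/?view=theater")
print(a == b)
```

Using the path segment rather than a list position or a page title keeps the identifier unchanged when the scrape order or the displayed title changes.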