learningequality / ka-lite

KA Lite: lightweight web server for serving core Khan Academy content (videos and exercises) without needing internet connectivity
https://learningequality.org/ka-lite/

New content-syncing approach needed for Perseus exercises #2257

Closed jamalex closed 9 years ago

jamalex commented 10 years ago

With khan-exercises, we just included everything directly in the repo/installer. With Perseus, there are currently 17K+ questions, many with pre-rendered images (700+MB), which becomes unwieldy.

We might consider breaking up the Perseus content into subtopic blocks, which can be downloaded into an installation much as we do for videos. These content packs could include all the question data and images for a portion of the topic tree. They could also be downloaded separately on a per-language basis, as with language packs.
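A minimal sketch of how items might be grouped into per-language, per-subtopic packs. The `id`, `lang`, and `subtopic` keys are assumptions made for illustration; the real assessment item schema may look different.

```python
from collections import defaultdict


def group_into_packs(items):
    """Group item ids into packs keyed by (language, subtopic).

    Each item is assumed to be a dict with hypothetical "id", "lang"
    and "subtopic" keys.
    """
    packs = defaultdict(list)
    for item in items:
        key = (item["lang"], item["subtopic"])
        packs[key].append(item["id"])
    return dict(packs)


# Made-up sample data, only to show the shape of the result.
items = [
    {"id": "q1", "lang": "en", "subtopic": "fractions"},
    {"id": "q2", "lang": "en", "subtopic": "fractions"},
    {"id": "q3", "lang": "es", "subtopic": "fractions"},
    {"id": "q4", "lang": "en", "subtopic": "decimals"},
]
packs = group_into_packs(items)
```

Each key of `packs` would then correspond to one downloadable content pack.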

This doesn't need to be a blocker for merging the hackathon code into the develop-all-the-things branch, but as we've excluded the full assessmentitems.json file from the repo for now, not many exercises will work until this is fixed.

mjptak commented 10 years ago

I've been following your Sisyphean task. So does Perseus content "kinda" render now on KA Lite?

jamalex commented 10 years ago

It does! On the hackathon branch. But you'll need to download this file on top of the assessmentitems.json file in the repo, to get all the questions. And khan-exercises, conversely, doesn't work anymore. :D

Since you invoked Sisyphus (which is very apt, when discussing trying to keep up with the KA API, right @rtibbles?), here's a poem I wrote in Grade 11:

The Wise One

Is Sisyphus bound by sense of duty?
What else could keep him on his track,
Of fruitless striving, eternal path?
He knows (he must) that all's for naught.
Yet as a spinnstress ever-weaving,
Losing thread with each new stitch,
He knows (he must) that nothing's brought
To plain existence, but start again,
Not only sees his foregone failure,
But knows that this one's sure to rest
As all the others, in the valley,
Of his river ever-weeping,
He knows he can't succeed, -- nor halt.

mjptak commented 10 years ago

So my daughter is corresponding with a friend in Rio who was an exchange student on her soccer team and I was explaining to them where you all are heading at least as reported in EdSurge. So if they are going to develop THEIR 10th grade centric local knowledge course to be included in the system when it diverges from the Palo Alto path, is Perseus the engine they should figure out? (and when will Perseus allow sound?). Seems pretty nifty to me.

rtibbles commented 10 years ago

We don't have an authoring solution yet for the Perseus content, and I don't think KA have open sourced theirs yet, but I imagine they would be open to it.

mjptak commented 10 years ago

Maybe not open sourced, but definitely open for the 30% with the internet...

http://khan.github.io/perseus/

On Fri, Aug 29, 2014 at 11:15 AM, Richard Tibbles notifications@github.com wrote:

We don't have an authoring solution yet for the Perseus content, and I don't think KA have open sourced theirs yet, but I imagine they would be open to it.


rtibbles commented 10 years ago

Oh, I think it is - but it might need a bit more work to act as an editor for anyone.

mjptak commented 10 years ago

Agreed, though I have found this one useful when "crafting" exercises, or just as a general resource:

http://graphie-to-png.kasandbox.org/

rtibbles commented 9 years ago

We will also need this for including Computer Science content. The Scratchpad data is fairly extensive.

aronasorman commented 9 years ago

Working on this, with help from @jamalex.

aronasorman commented 9 years ago

Based on my talk with @jamalex, here's the plan:

It turns out that the assessment item resources might be bigger than 200 MB, as some of the images are stored in the item_data.items attribute, which doesn't seem to be read and downloaded. Our estimate is that the total is ~700 MB.

The approach we talked about is to break out the KA assessment item resources into per-subject zip files. The script that generates those files will live in the KA Lite distributed repo (probably as a generate_assessment_zips management command) and will be called by the build server to produce the zip files.
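As a rough sketch of the per-subject zipping step (not the actual generate_assessment_zips command): the function below, the `items_by_subject` layout, the `images` key, and the archive layout are all assumptions for illustration, and it is written for Python 3.

```python
import json
import os
import tempfile
import zipfile


def generate_assessment_zips(items_by_subject, resource_dir, out_dir):
    """Write one zip per subject holding its item data and image files.

    items_by_subject is a hypothetical mapping:
        {subject: {item_id: {"item_data": ..., "images": [filenames]}}}
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for subject, items in sorted(items_by_subject.items()):
        zip_path = os.path.join(out_dir, subject + ".zip")
        with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
            # Question data for this subject only.
            zf.writestr("assessmentitems.json", json.dumps(items, sort_keys=True))
            # Every image referenced by those questions, stored once.
            images = sorted({n for item in items.values() for n in item["images"]})
            for name in images:
                zf.write(os.path.join(resource_dir, name), arcname="images/" + name)
        written.append(zip_path)
    return written


# Tiny demo with made-up data in a temporary directory.
workdir = tempfile.mkdtemp()
resource_dir = os.path.join(workdir, "resources")
os.makedirs(resource_dir)
for name in ("a.png", "b.png"):
    with open(os.path.join(resource_dir, name), "wb") as f:
        f.write(b"fake image bytes")

items_by_subject = {
    "math": {"q1": {"item_data": "...", "images": ["a.png"]}},
    "science": {"q2": {"item_data": "...", "images": ["b.png"]}},
}
zip_paths = generate_assessment_zips(
    items_by_subject, resource_dir, os.path.join(workdir, "zips")
)
```

The build server would then only need to publish the files listed in `zip_paths`.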

After that, there'll be another management command, downloadassessmentassets, that downloads a subject zip file and places its contents into the content/ folder. I'll also extend contentload to rewrite the urls to point to the local versions.
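The URL rewriting could look roughly like the sketch below. It assumes remote images appear as plain http(s) URLs ending in an image extension, and that local copies are served from a hypothetical `/content/khan/` prefix under their original file names; neither detail is confirmed here.

```python
import posixpath
import re
from urllib.parse import urlparse

# Assumption: image references are http(s) URLs ending in a known extension.
IMAGE_URL_RE = re.compile(
    r"https?://[^\s\"')]+\.(?:png|jpg|jpeg|gif|svg)", re.IGNORECASE
)


def localize_image_urls(item_data, local_prefix="/content/khan/"):
    """Replace remote image URLs in serialized item data with local paths.

    local_prefix is a hypothetical location for the unpacked images.
    """
    def to_local(match):
        filename = posixpath.basename(urlparse(match.group(0)).path)
        return local_prefix + filename

    return IMAGE_URL_RE.sub(to_local, item_data)


# Made-up item content, only to show the substitution.
remote = '{"content": "![](https://example.org/images/abc123.png)"}'
local = localize_image_urls(remote)
```

Keeping only the file name means two remote images with the same base name would collide, so a real version would need a naming scheme that avoids that.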

rtibbles commented 9 years ago

Damn, missed the items attribute, good catch. Have pushed up my most recent code to my branch now.

aronasorman commented 9 years ago

Zipping up the assessment items is almost done (with some bugs found by @jamalex). Now working on downloading the zip. After that, I'll work on splitting up the assessment item resources into their own zips by topic.

aronasorman commented 9 years ago

The assessment item resources are not that big (200 MB), so we might just bundle them as part of the installation, or allow users to download them through the UI that @MCGallaspy is building.

aronasorman commented 9 years ago

We decided that we're going to bundle the assessment resources into the installers. The zipped-up assessment resources are around 360 MB, which would be the lower bound for the size of our installers.

For git users, the migration path would be to run a command called unpack_assessment_zip, which should be given to them through an upgrade note as part of the documentation (cc @MCGallaspy).
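For illustration only, the unpacking step might boil down to something like this. The function is a stand-in and not the real unpack_assessment_zip command; the archive layout and the `content` directory name are assumptions, and it is written for Python 3.

```python
import os
import tempfile
import zipfile


def unpack_assessment_zip(zip_path, content_dir):
    """Extract every entry of the zip into content_dir and list what was unpacked."""
    os.makedirs(content_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(content_dir)
        return zf.namelist()


# Demo: build a tiny made-up archive, then unpack it.
workdir = tempfile.mkdtemp()
zip_path = os.path.join(workdir, "assessment.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("assessmentitems.json", "{}")
    zf.writestr("images/a.png", b"fake image bytes")

content_dir = os.path.join(workdir, "content")
unpacked = unpack_assessment_zip(zip_path, content_dir)
```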

aronasorman commented 9 years ago

Fixed in #2896.