Closed ormsbee closed 2 years ago
@pdpinch: I would love to get your thoughts to kick us off on this one, since you folks work so closely with this.
@ormsbee I'm a bit confused about the relationship of bundles, collections and courses. Is there a sketch of this somewhere?
To kick off the discussion, I was thinking that exporting a course would yield a group of bundles -- each a directory containing an OLX file and a sub-directory of static assets. However, with ownership metadata in the collection, I'm not sure if this is what you had in mind?
@pdpinch: Collection:Bundle is 1:M, and let's assume for now that there will be one Collection that holds all the Bundles in a typical course, but that borrowed content might come from Bundles in other Collections that represent Content Libraries.
There's a lot of conversation in #16 in terms of what that breakdown of Bundles would look like. But for the purposes of this thread, I'd rather not start with the Bundle mapping. Let's assume that there's going to be some amount of data transformation between the import/export format and the representation of the pieces in Blockstore.
I think this represents a bit of a shift in mindset. Previously, Blockstore data at-rest was envisioned to be the author-intent editing bundles of files. But the direction of #16 looks like we're talking more about Blockstore data at-rest being in a format that is more about facilitating re-use (so more granular, more regular). Which pushes more burden to a conversion/mapping import-export layer between the two. But I think that once we're committing to such a layer existing, we get a lot more freedom in defining exactly what the export format looks like irrespective of the storage details.
Does that make sense?
So I think the semantics we have to worry about are:
Keep in mind that it doesn't have to be strictly OLX as we know it today.
There are some things that have been good about OLX -- HTML concepts are familiar to course authors; XML isn't a big leap; for the most part the block names are self-describing. The problems with OLX have mostly been structural, and with naming. The course is tree-like, but the export is not. It's hard to find things. It's nearly impossible to move or copy anything but the leaf nodes. Copying static assets is similarly difficult.
• How do we associate static assets with individual leaf XBlocks (e.g. capa problem, HTML block) -- probably a similar mechanism for precursor files as well?
I'd suggest a static directory in association with leaf xblock. It would be nice, for example, if opening an HTML block in a browser would just render, although that may be too much to ask. Precursor files need some association with their representation. For LaTeX, there’s a convention for referencing the original file as an XML attribute on the node. (need example)
• How do we call out things that need to be shared within the course itself (e.g. Python code libraries).
This is an interesting question. The fact that python libraries are shared across the course is a convenience, but they are referenced more like static files, at the leaf node. So one response is that they should be handled just like other static assets — although the current references (from mitxgraders import FormulaGrader) are going to be hard to parse.
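As a rough illustration of what "parsing the references" could involve, here is a minimal sketch that lists the top-level modules a grader script imports, using Python's standard `ast` module. The function name `imported_modules` and the sample script are hypothetical, and the sketch assumes the Python code has already been extracted from the surrounding problem XML, which is likely the harder part in practice.

```python
import ast


def imported_modules(source):
    """Return the set of top-level module names imported by a Python snippet."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b" -> "a"
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # "from a.b import c" -> "a" (relative imports are skipped)
            modules.add(node.module.split(".")[0])
    return modules


# Hypothetical grader script, as it might appear inside a problem.
script = (
    "from mitxgraders import FormulaGrader\n"
    "import math\n"
    "grader = FormulaGrader(answers='x')\n"
)
found = imported_modules(script)
```

An exporter could compare `found` against the names of the shared libraries in the course to decide which ones a given leaf depends on.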
I think an even more interesting question is: how does the export reference reused assets (static, or otherwise). I’d like something that gives me all the data in an export, but otherwise avoids duplication. This should be the goal in the export, as well as in the blockstore after an import.
• How do we distinguish in the export the content that you can edit vs. the content that you're borrowing from elsewhere and may have a read-only view of.
metadata of some kind. Are there any existing conventions?
• What does the roundtrip look like (e.g. do we have a canonical export format and more freeform import format)?
What’s feasible here? The canonical export, freeform import of the current OLX is useful, but I don’t know how much of that was deliberate and it strikes me as difficult to maintain.
• How do we cleanly separate policy-related items from content in a way that's simple, extensible, and doesn't look terrible?
We probably need to start by identifying what are policy related items. I presume though that you are talking about dates, grading policy, etc. Much as I’d like to institute a strict separation of these, there are use cases that we wouldn’t want to break — like exporting the course, increasing the dates by a year, and then importing it.
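To make the "increase the dates by a year" use case concrete, here is a minimal sketch that shifts date fields in an exported policy structure. The layout of the `policy` dictionary and the key names `start` and `due` are assumptions for illustration, not a description of any existing export format.

```python
from datetime import datetime


def shift_year(iso_text, years=1):
    """Shift an ISO 8601 timestamp by whole years, clamping Feb 29 to Feb 28."""
    moment = datetime.fromisoformat(iso_text)
    try:
        shifted = moment.replace(year=moment.year + years)
    except ValueError:
        # Feb 29 does not exist in the target year.
        shifted = moment.replace(year=moment.year + years, day=28)
    return shifted.isoformat()


def shift_dates(policy, keys=("start", "due"), years=1):
    """Return a copy of a nested policy structure with the given date keys shifted."""
    if isinstance(policy, dict):
        return {
            key: shift_year(value, years)
            if key in keys and isinstance(value, str)
            else shift_dates(value, keys, years)
            for key, value in policy.items()
        }
    if isinstance(policy, list):
        return [shift_dates(item, keys, years) for item in policy]
    return policy


# Hypothetical policy data; the structure is an assumption.
policy = {
    "course/2018": {"start": "2018-09-01T00:00:00"},
    "sequential/hw1": {"due": "2020-02-29T12:00:00", "display_name": "HW 1"},
}
shifted = shift_dates(policy)
```

If dates live in one clearly separated place in the export, a script like this stays short; if they are scattered through the content files, it does not.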
This is a rambling response, but you asked a lot of questions. Let me know if this is helpful, or if you'd prefer something more carefully thought out.
This is an interesting question. The fact that python libraries are shared across the course is a convenience, but they are referenced more like static files, at the leaf node. So one response is that they should be handled just like other static assets — although the current references (from mitxgraders import FormulaGrader) are going to be hard to parse.
If that's going to be the case, then we'll need something else to distinguish between static assets that users can download and ones that are for internal usage. Though maybe that doesn't have to happen at this layer...?
How do we distinguish in the export the content that you can edit vs. the content that you're borrowing from elsewhere and may have a read-only view of.
metadata of some kind. Are there any existing conventions?
We're making this up as we go along, but I think it might be better to have a place where they're cleanly separated in the file system, especially if there's hand-editing going on. It's one thing to check a metadata JSON file, it's another to be editing a path called /readonly_borrowed_content/{usage_key}.xml
(I realize that is an absolutely horrible name -- I merely include it here for the obviousness).
I think an even more interesting question is: how does the export reference reused assets (static, or otherwise). I’d like something that gives me all the data in an export, but otherwise avoids duplication. This should be the goal in the export, as well as in the blockstore after an import.
Yeah, I think we're going to have to give reused content its own explicit, top level space. But not sure beyond that.
What does the roundtrip look like (e.g. do we have a canonical export format and more freeform import format)?
What’s feasible here? The canonical export, freeform import of the current OLX is useful, but I don’t know how much of that was deliberate and it strikes me as difficult to maintain.
I think some level of freeform import is feasible as long as the path to a canonical format is simple. For instance, say the import mechanism always expected a giant XML course file as the contents of the entire course. For hand authoring, that could look like this:
<course xmlns:xi="http://www.w3.org/2001/XInclude">
  <chapter id="blockstore"> <!-- id means "usage_key" here -->
    <!--
      The following could be inline or XIncludes of sequence files.
      Since usage_key is derived from the XML within the file, there's
      no restriction that the filename has to be the usage key, though
      it might be a nice convention.
    -->
    <xi:include href="blockstore/early_modulestore_history.xml" />
    <xi:include href="blockstore/requirements.xml" />
    <xi:include href="blockstore/storage_granularity.xml" />
    <xi:include href="blockstore/data_transforms.xml" />
    <xi:include href="blockstore/import_export.xml" />
  </chapter>
  <!-- etc... -->
</course>
The xi:include elements are an XML convenience and not part of OLX parsing. So the import code would only see the course.xml after all the includes have been processed, and then it can do its validation and separation into discrete components based on that. That would give hand authors a fair amount of structural flexibility without significantly complicating the import code.
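A minimal sketch of that pre-processing step, using `xml.etree.ElementInclude` from the Python standard library. The in-memory `FILES` dictionary stands in for sequence files on disk, and the `sequential` tag and file contents are assumptions for illustration; a real importer would presumably read from the export directory and might use a different XML library.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

COURSE_XML = """\
<course xmlns:xi="http://www.w3.org/2001/XInclude">
  <chapter id="blockstore">
    <xi:include href="blockstore/requirements.xml" />
    <xi:include href="blockstore/import_export.xml" />
  </chapter>
</course>
"""

# Hypothetical sequence files keyed by href, standing in for files on disk.
FILES = {
    "blockstore/requirements.xml": '<sequential id="requirements" />',
    "blockstore/import_export.xml": '<sequential id="import_export" />',
}


def load_from_dict(href, parse, encoding=None):
    """Loader for ElementInclude: return a parsed element for an XML include."""
    if parse != "xml":
        raise ValueError("only XML includes are supported in this sketch")
    return ET.fromstring(FILES[href])


def expand_course(xml_text):
    """Parse the course XML and replace every xi:include with its target."""
    root = ET.fromstring(xml_text)
    ElementInclude.include(root, loader=load_from_dict)
    return root


course = expand_course(COURSE_XML)
sequence_ids = [seq.get("id") for seq in course.iter("sequential")]
```

After `expand_course` runs, validation code only ever sees one fully expanded tree, regardless of how the author split the files.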
I'm still really fuzzy on how to do static assets in a good way though.
I'd suggest a static directory in association with leaf xblock. It would be nice, for example, if opening an HTML block in a browser would just render, although that may be too much to ask.
That would be pretty cool, though I'm not sure how that interacts with the desire to allow flexibility of placement on the authoring side. Most blocks aren't going to have any associated static assets at all. Though maybe we can keep a bit of both by allowing XML placement to be freeform but have conventions that all static assets are at the root of the export and sub-divided by something derived from the usage key:
course.xml
static/
    html/
        compositor_overview/
            diagram.png
Since every leaf has to have some kind of usage key identifier (in some way or another), it's straightforward to know where the assets are going to be, regardless of how the XML is structured. Opening an HTML file wouldn't render the images correctly without a little post-processing, but I think that's still an acceptable tradeoff.
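As a small sketch of that convention, the asset location can be computed from the block type and block identifier alone. The function name and the idea of splitting the usage key into exactly these two parts are assumptions for illustration.

```python
from pathlib import PurePosixPath


def static_asset_path(block_type, block_id, filename):
    """Return the export-relative path of an asset owned by a leaf block."""
    return PurePosixPath("static") / block_type / block_id / filename


diagram = static_asset_path("html", "compositor_overview", "diagram.png")
```

Because the path depends only on the usage key, the importer never needs to know where the block's XML lives in the export.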
We probably need to start by identifying what are policy related items. I presume though that you are talking about dates, grading policy, etc. Much as I’d like to institute a strict separation of these, there are use cases that we wouldn’t want to break — like exporting the course, increasing the dates by a year, and then importing it.
Yup, those are exactly what I had in mind. Braden created a Compositor Architecture Proposal a while back, to map out where those values get applied. I think that supporting them as part of the import/export flow is feasible, so long as they're:
The second of those is a long held rant of mine, but I basically think that there needs to be a separate system for Scheduling that has the course staff set dates as inputs, but also has other inputs like individual due date extensions, and an ability to query in a cross-course manner. But that's a bit of a tangent. I think it's enough to say that it's different enough with our current known use cases (re-runs, CCX, content libraries) that it deserves clearer separation in the data model than exists today.
This is a rambling response, but you asked a lot of questions. Let me know if this is helpful, or if you'd prefer something more carefully thought out.
I think at this point in our discussion, low latency rambling is more valuable than high latency proposals. :)
I was thinking about these use cases and can see them falling into 3 categories:
Which pushes more burden to a conversion/mapping import-export layer between the two. But I think that once we're committing to such a layer existing, we get a lot more freedom in defining exactly what the export format looks like irrespective of the storage details.
The way I have been thinking about 3 is how programming languages work. For example, there is a Python runtime which expects the data to be in a certain format in memory. And then there is a parser/compiler layer which gives humans a lot of freedom to organize the code the way it makes sense for them. So one question I have is (especially for @pdpinch), what if Blockstore had APIs and there was a command line tool that interacted with them? So it could, let's say, pull out the OLX and static files it needed and depending on the internal structure, write them out into the local file system the way it made sense for that type of content. For example, the filenames could become <unit_display_name>_<block_display_name>.id.olx. Or the olx files could be organized into a separate directory for each chapter (this could even be configurable). Similarly, the tool could re-read this data, run validations and push the changed parts to Blockstore. Heck, it could even have a watch option, which would on every edit push the content to the devstack Blockstore and let you preview things in Studio/LMS automatically. In other words, everything that is possible with Webpack.
(Of course API calls would be rate-limited by user)
Blockstore data at-rest was envisioned to be the author-intent editing bundles of files.
I do think this is still mostly true for the idea above. The main difference is that instead of the author running tar -zxvf export_file to see the directory of content, they run blockstore pull collection_id --export-format-configuration=<config>. And since the code for the transformer is going to be outside Blockstore it can evolve much faster.
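For discussion purposes, here is a minimal sketch of how such a command line could be declared with Python's `argparse`. No such tool exists in this thread; the `pull` subcommand and its option simply mirror the command mentioned above, and everything else (function name, help text, sample values) is hypothetical.

```python
import argparse


def build_parser():
    """Build an argument parser for a hypothetical `blockstore` client."""
    parser = argparse.ArgumentParser(prog="blockstore")
    subcommands = parser.add_subparsers(dest="command", required=True)

    pull = subcommands.add_parser(
        "pull", help="download a collection into a local directory"
    )
    pull.add_argument("collection_id", help="identifier of the collection to pull")
    pull.add_argument(
        "--export-format-configuration",
        metavar="CONFIG",
        default=None,
        help="file describing how to lay out the human-friendly export",
    )
    return parser


# Parse a sample invocation instead of sys.argv so the sketch is self-contained.
args = build_parser().parse_args(
    ["pull", "my-collection", "--export-format-configuration=layout.json"]
)
```

A `push` or `watch` subcommand could be added the same way; the mapping between local files and Blockstore operations would live behind these entry points.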
Is this too radical an idea? What are the downsides? We will of course still need the MF format for category 2 of the use cases.
Here is which categories I think these use cases fall in:
Rerunning (v1) A course team created an empty Studio ‘shell’ for their rerun months ago, and they’re ready to finally load the content of the current run into that shell today (they’d copy over their current course automatically if they could).
MF format should be sufficient though they should just be able to point to a course.xml version of a fork in Blockstore.
Rerunning (v2) Course team may have originally created a rerun via auto-copying their current run, but some other version of their course turned out to be better.
MF format should be sufficient though they should just be able to point to a course.xml version from a fork.
Backups I’m going to do something risky to my course and I need a backup copy - the import needs to be exactly the same to preserve location-ids / access to student data
No need since can always go back to an older version of the course directly.
Course division / course chimera (rare) I’ve run one giant 18-week course, and I’d like to split it into 3 small courses without copy/pasting everything. Or, I’ve run 3 small courses and I want to repackage them as 2 new courses.
No need since can just combine the chapters and sequences into a new course.
Moving content between multiple instances of the platform Most commonly, this involves moving courses from test runs on Edge to MOOC runs on edX.org. Some course teams also move content from their own instances to edX controlled ones and vice versa.
MF format. Though I think the ability to link an edX app instance to other Blockstores may be a better solution for this.
Libraries Teams want to use their MOOC problem banks on campus or vice-versa
Same as previous point.
XML editing outside of Studio Conditional modules, changing a course’s wiki slug, adding user-readable unit URLs, etc.
Would be simpler to have a file editor in Studio.
Retrieving files (rare) I’ve uploaded a bunch of assets to Studio months ago and now I want copies of them - it is easier to get them via export than by clicking on each.
MF format should be sufficient?
Retrieving data (rare) I want a bunch of info from my course that’s hard to find click-wise, so I’ll export it and use the xml: e.g. ‘what are all the youtube ids? Where did I say ‘week’ instead of ‘lesson’ as I update to convert to self-paced’?
The HF format may be useful here but this is a rare use case.
Seeding a new course (rare) I have an introductory sequence that I’d like to appear in a lot of my courses - I’ll import this content and then build the rest of my course.
No need since can just link to those sequences in multiple courses.
@symbolist distinguishing between human-friendly and machine-friendly formats is useful and, I think, consistent with what @ormsbee was suggesting. I am also in favor of having APIs for importing and exporting elements -- we've wanted that for some time and I know of 3 (no, 4!) different ways folks have hacked that together.
I think you're misunderstanding one of the use cases, and the list is missing another.
"XML editing outside of Studio" isn't typically about editing a single XML file. It's about doing some kind of manipulation that isn't possible in Studio. I suppose if you added a file editor that would probably cover some of the use cases, but certainly not all. Editing outside of studio could use a MF format, but I think a HF format would be better.
The use case that is missing from this list is converting content from another format (LaTeX, Markdown) into a format edX can consume (could use MF or HF format). edX lives in an ecosystem and it's not unusual for folks to want to convert to and from its format into others.
Sorry, I'm replying out of order.
@ormsbee:
we'll need something else to distinguish between static assets that users can download and ones that are for internal usage.
I don't understand this. I think the python grading library should be downloaded just like other static assets. I wouldn't expect to be able to run it, except after uploading it to an instance of edx-platform, but otherwise I see it as just another static asset.
it might be better to have a place where they're cleanly separated in the file system
I think we could live with that. I'd still like HTML to render locally if possible, but there are ways to make that work even with "/readonly_borrowed_content/{usage_key}.xml"
maybe we can keep a bit of both by allowing XML placement to be freeform but have conventions that all static assets are at the root of the export and sub-divided by something derived from the usage key:
That's certainly clear for finding static assets. How would (manual) reuse work though? I copy the OLX I want out and then grab a copy of the corresponding static folder?
I'll go read up on the Compositor Architecture but I think separating policy from content will be fine. Folks are already accustomed to the separation between the OLX and the policy.json.
@symbolist:
I do think this is still mostly true for the idea above. The main difference is that instead of the author running tar -zxvf export_file to see the directory of content, they run blockstore pull collection_id --export-format-configuration=<config>. And since the code for the transformer is going to be outside Blockstore it can evolve much faster. Is this too radical an idea? What are the downsides? We will of course still need MF format for the category 2 of use cases.
That's really interesting. I had a tiny blurb in the original design doc that having a CLI Blockstore utility to manage downloads might be necessary in order to tie all the links together and present it in a usable way. But I envisioned that as a generic Blockstore utility without deep awareness of what course content is. If I'm understanding correctly, this goes a step beyond that and proposes that the mapping logic of how to translate local files to Blockstore moves to the client, translating user intent into relatively low level Blockstore operations. I think this has some strong arguments in both directions, and I'd really like to pursue this line of thinking.
Things I love about it:
Things that concern me:
Smaller details:
@pdpinch:
I don't understand this. I think the python grading library should be downloaded just like other static assets. I wouldn't expect to be able to run it, except after uploading it to an instance of edx-platform, but otherwise I see it as just another static asset.
Yeah, I'm conflating shared Python libs with custom response Python code that might leak answers. But the latter doesn't need to be treated as a static asset, so please disregard.
That's certainly clear for finding static assets. How would (manual) reuse work though? I copy the OLX I want out and then grab a copy of the corresponding static folder?
Can you describe in more detail the specific re-use scenario here? I just want to make sure I understand the question.
@ormsbee I opened a new issue for "re-use of exported content"
I'm tempted to do the same for "blockstore import/export client" but maybe you want to let that take over this thread.
@pdpinch
I am also in favor of having APIs for importing and exporting elements -- we've wanted that for some time and I know of 3 (no, 4!) different ways folks have hacked that together.
Ah, interesting!
I think you're misunderstanding one of the use cases, and the list is missing another.
"XML editing outside of Studio" isn't typically about editing a single XML file. It's about doing some kind of manipulation that isn't possible in Studio. I suppose if you added a file editor that would probably cover some of the use cases, but certainly not all. Editing outside of studio could use a MF format, but I think a HF format would be better.
The use case that is missing from this list is converting content from another format (LaTeX, Markdown) into a format edX can consume (could use MF or HF format). edX lives in an ecosystem and it's not unusual for folks to want to convert to and from its format into others.
Oh, I am aware of these. The list above is from another document which was compiled some time ago and I thought the point in the list was only about a more restricted version of the idea. But I may have read it incorrectly so thanks for detailing these out!
@ormsbee
If I'm understanding correctly, this goes a step beyond that and proposes that the mapping logic of how to translate local files to Blockstore moves to the client, translating user intent into relatively low level Blockstore operations.
Yup! As you said it would be kind of like a command line version of Studio!
I do want to add one point to your very nice list of positives and that is evolvability. One of the things about putting the transform logic for HF format in Blockstore is that it would also have to double as the MF format that would be needed for transport and archival purposes and therefore must be stable and that would mean the HF format would be for a long time (close to) what we decide in the next few months. Like Studio, if the HF format is allowed to evolve freely as feedback comes in with usage, without any concern for backwards compatibility ("just update to new version of client and pull the collection again") and as new types of content get developed, it would end up being a lot more author-friendly.
Things that concern me:
The mapping of a Course to Blockstore data constructs needs to be elevated to an API contract, making it more difficult to change our minds about data modeling at a later point. Maybe that was inevitable anyway, if we wanted to treat Blockstore as a first class interface for manipulating this data rather than an implementation detail for Studio to do so.
I worry that the client code won't be maintained, or that we'll see drift between it and what's run by Studio. I imagine we would want Studio to use this library to some extent to help prevent that, but I still see drift as a concern.
Hmm. These two will definitely need thinking.
- Maintaining a program that has to work locally on everyone's machine is going to be a maintenance issue. There's always someone running it on random distro with weird restrictions, and of course it should work on my Mac but which of the eight Pythons is it really hooked into, and "hey, lxml isn't compiling because it can't find my libxml2 lib", etc.
There are tools for converting packages to single file executables with the interpreter + dependencies included. Would something like that help?
@pdpinch:
I'm tempted to do the same for "blockstore import/export client" but maybe you want to let that take over this thread.
Yeah, I think it's okay to let it take over the thread. If we land on consensus that we want a rich client like that, then a new issue makes sense to hammer out some more of the specifics.
@symbolist:
There are tools for converting packages to single file executables with the interpreter + dependencies included. Would something like that help?
Probably? I don't know the state of the art on that these days for the Python world. The one place I worked at that needed to tackle these issues eventually gave up on Python and rewrote their agent in Go so that they could compile a static binary and be done with it. Which is not something I'm advocating for, fwiw.
Okay, I think that after having the Thanksgiving weekend to stew on this, I'm +1 on the dedicated CLI client, and pushing the mapping logic to Bundle primitives out of Blockstore.
@bradenmacdonald In case you haven't been following this, can we get your opinion too?
@ormsbee I see you had productive holidays! 😄
@pdpinch @ormsbee @symbolist
I really like the idea of having a CLI tool for working with a local HF format and syncing it with Blockstore, keeping the Blockstore format relatively efficient for developers+reuse+writing (and eventually yet a third read-optimized format when pushing from Studio/Blockstore to the LMS...?)
I was going to suggest that the CLI tool be written in Go or Rust so that it's (a) even more fun to write, and (b) much more portable, but that would make sharing code with Studio much more difficult. I do personally like the approach of writing a simple library in C or Rust with nice bindings for Python (à la libgit2), to keep a consistent approach. However, I see you've mentioned those ideas already. It probably makes sense to stick to Python here throughout the stack, but it's not what I personally would gravitate toward.
Another approach that I was thinking of which can help less technical users is to use an online IDE for viewing+editing+syncing the HF format. If we found some existing one like GitLab's which could be adapted really easily (so you just use it as-is with a little layer to sync to/from Blockstore), that would be a huge win. Then you get the same effect, but people don't need to install any software on their computer. I don't know if that's a medium-sized project or a mammoth one though.
The course is tree-like, but the export is not. It's hard to find things.
Yeah I definitely want to see the HF format having a hierarchy that matches the course. My suggestion would be something like:
/course.xml
/policies.json
/chapter1.xml
/chapter1/unit1-1.xml
/chapter1/unit1-1/intro-to-the-course.xml
/chapter1/unit1-1/first-problem.xml
/chapter1/unit1-1/first-problem/some-image.svg
/chapter1/unit1-1/first-problem/some-other-image.svg
/static/some-image-used-throughout-the-course.svg
/static/python-shared.zip
This structure avoids the things I hate about ansible: it doesn't require creating subdirectories unless they're needed (i.e. blocks without static assets don't need their own directory), and it avoids lots of similarly named files throughout the tree (an editor with a dozen "unit.xml" or "index.olx" tabs open is annoying).
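To check that the layout rule is mechanical, here is a minimal sketch that derives those paths from a nested block description. The dictionary shape (`name`, `assets`, `children`) is an assumption for illustration, not an existing data model.

```python
from pathlib import PurePosixPath


def export_paths(block, parent=PurePosixPath("/")):
    """Yield the export paths for a block, its static assets, and its descendants.

    Each block gets <name>.xml in its parent's directory. A <name>/ directory
    only shows up if the block has assets or children to put in it.
    """
    yield parent / f"{block['name']}.xml"
    subdir = parent / block["name"]
    for asset in block.get("assets", ()):
        yield subdir / asset
    for child in block.get("children", ()):
        yield from export_paths(child, subdir)


# Hypothetical chapter mirroring the example listing.
chapter = {
    "name": "chapter1",
    "children": [
        {
            "name": "unit1-1",
            "children": [
                {"name": "intro-to-the-course"},
                {
                    "name": "first-problem",
                    "assets": ["some-image.svg", "some-other-image.svg"],
                },
            ],
        }
    ],
}
paths = [str(path) for path in export_paths(chapter)]
```

The block without assets (`intro-to-the-course`) produces a single file and no directory, which is the property the layout is aiming for.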
For the shared static assets, it might be reasonable to actually symlink them into the folder where they are used, in order to track that better. After all, all operating systems today support symlinks, including Windows. For people who don't know how to create symlinks manually, they can just copy the files and the import/sync process will dedup them.
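One way the import/sync step could detect copied files is by grouping them on a content hash. This is a minimal sketch under that assumption; the function name is hypothetical, and files are represented as an in-memory mapping of path to bytes rather than read from disk.

```python
import hashlib
from collections import defaultdict


def group_duplicates(files):
    """Map each SHA-256 content digest to the sorted list of paths with that content."""
    by_digest = defaultdict(list)
    for path, data in files.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(path)
    return {digest: sorted(paths) for digest, paths in by_digest.items()}


# Hypothetical export contents: the logo was copied next to the problem that uses it.
files = {
    "static/logo.svg": b"<svg/>",
    "chapter1/unit1-1/first-problem/logo.svg": b"<svg/>",
    "chapter1/unit1-1/first-problem/some-image.svg": b"<svg id='a'/>",
}
groups = group_duplicates(files)
duplicates = [paths for paths in groups.values() if len(paths) > 1]
```

Each group with more than one path could then be stored once and referenced from every place it is used.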
Any item that includes a read-only link to a child from another bundle should probably not pull that child down in most cases, so e.g. if chapter1.xml contains a link to a unit2 from another bundle, then chapter1.xml could just contain a reference to that external bundle, but /chapter1/unit2/ would not even appear as a directory. Alternately, since all OSs support creating read-only files, the CLI tool could pull them down like this:
/chapter1.xml
/chapter1/unit2.xml -> /external/{bundle_uuid or alias}/unit.xml
/external/{bundle_uuid or alias}/unit.xml (read-only)
/external/{bundle_uuid or alias}/unit/image.svg (read-only)
I was going to suggest that the CLI tool be written in Go or Rust so that it's (a) even more fun to write, and (b) much more portable, but that would make sharing code with Studio much more difficult. I do personally like the approach of writing a simple library in C or Rust with nice bindings for Python (à la libgit2), to keep a consistent approach. However, I see you've mentioned those ideas already. It probably makes sense to stick to Python here throughout the stack, but it's not what I personally would gravitate toward.
Yeah, don't get me wrong, I really want to write it in Rust because Rust is fun and it's what my hobby command line utilities are written in. But I'm much better at Python than Rust, and it doesn't give such a huge advantage over Python that it justifies the added burden of a new language that will be unfamiliar to most Open edX developers.
Another approach that I was thinking of which can help less technical users is to use an online IDE for viewing+editing+syncing the HF format. If we found some existing one like GitLab's which could be adapted really easily (so you just use it as-is with a little layer to sync to/from Blockstore), that would be a huge win. Then you get the same effect, but people don't need to install any software on their computer. I don't know if that's a medium-sized project or a mammoth one though.
That sounds cool, but yeah, potentially a lot of work. Sounds like a great hackathon project though.
/chapter1.xml
/chapter1/unit1-1.xml
/chapter1/unit1-1/intro-to-the-course.xml
/chapter1/unit1-1/first-problem.xml
/chapter1/unit1-1/first-problem/some-image.svg
/chapter1/unit1-1/first-problem/some-other-image.svg
I'm not sure if I completely understand... In this scenario, can you please give an example of what the contents of chapter1.xml and unit1-1.xml might look like? And in particular, what they do with small blocks that don't necessarily have static assets, like html, discussion, or conditional?
For the shared static assets, it might be reasonable to actually symlink them into the folder where they are used, in order to track that better. After all, all operating systems today support symlinks, including Windows. For people who don't know how to create symlinks manually, they can just copy the files and the import/sync process will dedup them.
I think I made some blurb in the original design docs about potentially using symlinks to stitch together linked Bundles on the client side, so I get where you're coming from. But at the same time, I'm skittish on symlinks in general. Even if they're supported on Windows 10, there are a lot of places where people can get tripped up with compatibility issues. For instance, I spent a while spinning my wheels on the path to getting edx-platform to work on Windows because I didn't realize that the git client for Windows by default doesn't create symlinks (it's disabled in configuration). File watching utilities sometimes do or don't follow symlinks. Folks who use Cygwin on Windows are either creating real, native Windows symlinks or its own specially formatted files that emulate symlink behavior (created in the dark days before Windows really supported it). Run the wrong Python and you have a really annoying-to-debug issue.
Also, a lot of folks really don't get what it means and are confused that changes in one directory affect the other. Or some people get really symlink happy and you end up with symlink spaghetti going back and forth from your static dir to the leaves and back again. So if we do use them in the human-friendly format, I'd like to be fairly strict and limited in how they're allowed to be used.
This is old and most likely no longer relevant to the future platform direction, so I'm going to close it. The discussion will be preserved on GitHub for future reference if useful.
@pdpinch, @JAAkana, @bradenmacdonald, @symbolist, @pomegranited:
Extracted out of issue #16: What is the ideal course import/export format, keeping in mind that we will have the ability to associate assets with individual leaf XBlocks.
We know that a number of technically advanced course teams generate OLX from existing banks of problems or do all their authoring in XML and use import/export as their primary publishing workflow. However, there are some other use cases we also need to keep in mind...
Other Known Use Cases
A list of known import/export use cases compiled by @JAAkana:
Rerunning (v1)
A course team created an empty Studio ‘shell’ for their rerun months ago, and they’re ready to finally load the content of the current run into that shell today (they’d copy over their current course automatically if they could).
Rerunning (v2)
Course team may have originally created a rerun via auto-copying their current run, but some other version of their course turned out to be better.
Backups
I’m going to do something risky to my course and I need a backup copy - the import needs to be exactly the same to preserve location-ids / access to student data
Course division / course chimera (rare)
I’ve run one giant 18-week course, and I’d like to split it into 3 small courses without copy/pasting everything. Or, I’ve run 3 small courses and I want to repackage them as 2 new courses.
Moving content between multiple instances of the platform
Most commonly, this involves moving courses from test runs on Edge to MOOC runs on edX.org. Some course teams also move content from their own instances to edX controlled ones and vice versa.
Libraries
Teams want to use their MOOC problem banks on campus or vice-versa
XML editing outside of Studio
Conditional modules, changing a course’s wiki slug, adding user-readable unit URLs, etc. These are sometimes edits that either cannot be supported by the Studio UX at all (because no appropriate handler was written for them), or else are really cumbersome to do with existing editors (e.g. course-wide search and replace).
Retrieving files (rare)
I’ve uploaded a bunch of assets to Studio months ago and now I want copies of them - it is easier to get them via export than by clicking on each.
Retrieving data (rare)
I want a bunch of info from my course that’s hard to find click-wise, so I’ll export it and use the xml: e.g. ‘what are all the youtube ids? Where did I say ‘week’ instead of ‘lesson’ as I update to convert to self-paced’?
Seeding a new course (rare)
I have an introductory sequence that I’d like to appear in a lot of my courses - I’ll import this content and then build the rest of my course.
Translating from Another File Format
Converting content from another format (LaTeX, Markdown) into a format edX can consume (could use machine or human-readable format). Open edX lives in an ecosystem and it's not unusual for folks to want to convert to and from its format into others.