adlnet / xAPI-Spec

The xAPI Specification describes communication about learner activity and experiences between technologies.
https://adlnet.gov/projects/xapi/

Definition of Documents is unclear, concerns about abuse #1102

Open rickbatka opened 2 years ago

rickbatka commented 2 years ago

I'm implementing an LMS + LRS that supports xAPI. In reading the spec, I'm finding a few things unclear about "Documents" and "State". I'll post a separate issue about "State" and just focus on "Document" here.

I'm having trouble understanding what kinds of data I need to support in my "Document" resource.

Part 2 Section 1.0 states:

The Experience API provides a facility for Learning Record Providers to save arbitrary data in the form of documents. This data is largely unstructured, which allows for flexibility. Specifics on document behaviors can be found in Part 3

And then, in Part 3, there is a circular link back to Part 2 Section 1.0 (emphasis mine):

The Experience API provides a facility for Learning Record Providers to save arbitrary data in the form of documents, perhaps related to an Activity, Agent, or combination of both. ... The three Document Resources provide document storage. The details of each resource are found in the following sections, and the information in this section applies to all three resources.

The spec then goes on to describe in great detail how to merge JSON documents when a document with MIME type 'application/json' is POSTed to an existing document that also has MIME type 'application/json'.
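For readers following along, here is a minimal sketch of that merge as I read the spec: each property defined directly on the posted object overwrites the same property on the stored object, with no recursion into nested objects. The function name and the use of `ValueError` are illustrative assumptions, not anything the spec defines; a real LRS would turn the error case into an HTTP error response.

```python
import json


def merge_json_documents(stored_body: bytes, posted_body: bytes) -> bytes:
    """Shallow-merge a posted JSON document into a stored one.

    Hypothetical helper: only top-level properties are merged, so a
    nested object in the posted document replaces the stored one whole.
    """
    stored = json.loads(stored_body)
    posted = json.loads(posted_body)

    # Merging only makes sense when both documents are JSON objects.
    if not isinstance(stored, dict) or not isinstance(posted, dict):
        raise ValueError("both documents must be JSON objects to be merged")

    # Posted properties win; stored properties not mentioned are kept.
    merged = {**stored, **posted}
    return json.dumps(merged).encode("utf-8")
```

Note that posting `{"b": {"y": 2}}` over `{"a": 1, "b": {"x": 1}}` keeps `a` but replaces `b` entirely, because the merge is shallow.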

This leaves me with several questions. Are they answered elsewhere in the spec? If not, should some language be added to clarify?

My questions:

  1. Is this endpoint meant to only store JSON documents, or is it truly a place to upload any kind of file?
  2. If it's meant to store any arbitrary document format, are there whitelisted or blacklisted MIME types? It seems crazy that I'd provide a general-purpose file upload bucket on my server with a client-facing API, for security and abuse reasons.
  3. Should a compliant LRS limit the file size of uploaded documents?
  4. Should a compliant LRS implement some sort of rate limiting to prevent abuse?
  5. Why does the Document resource have special handling for merging JSON, when the State resource also exists? Isn't State a better place for storing arbitrary JSON? As far as I can tell, the Document resource doesn't have any special behavior for any other file types - is my reading correct?

Thanks for helping me understand! The Document resource seems like a bit of a security nightmare, to be honest - I hope I can come to understand this a bit better.

thomasturrell commented 2 years ago

Is this endpoint meant to only store JSON documents, or is it truly a place to upload any kind of file?

Any

If it's meant to store any arbitrary document format, are there whitelisted or blacklisted MIME types? It seems crazy that I'd provide a general-purpose file upload bucket on my server with a client-facing API, for security and abuse reasons.

No, there are no whitelisted or blacklisted content types.

Should a compliant LRS limit the file size of uploaded documents?

No, there is no requirement to do that. See issue #1088, where Brian Miller makes some very good observations that might be relevant to you.
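Since the spec does not require a limit, any cap is an implementer's own policy decision. As a rough sketch of what such a self-imposed guard could look like, assuming a hypothetical `max_bytes` setting and helper name (neither comes from the spec), and using the standard HTTP 413 status for oversized bodies:

```python
from typing import Optional

# Hypothetical, implementer-chosen limit; the spec does not define one.
DEFAULT_MAX_BYTES = 1_000_000


def check_document_size(body: bytes, max_bytes: int = DEFAULT_MAX_BYTES) -> Optional[int]:
    """Return an HTTP error status for an oversized document, else None."""
    if len(body) > max_bytes:
        return 413  # standard HTTP "payload too large" status
    return None
```

A request handler would call this before storing the document and respond with the returned status if it is not `None`.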

Why does the Document resource have special handling for merging JSON, when the State resource also exists?

The State Resource is itself a Document Resource. The spec defines three Document Resources: State, Agent Profile, and Activity Profile.
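To make the relationship concrete, the sketch below lists the three Document Resources with their paths relative to the LRS endpoint and builds a request URL for one of them. The base URL, the helper name, and the dictionary keys are assumptions for illustration; the relative paths and parameter names are the ones I understand the spec to use.

```python
import json
from urllib.parse import urlencode

# Relative paths of the three Document Resources under the LRS endpoint.
DOCUMENT_RESOURCES = {
    "state": "activities/state",
    "activity_profile": "activities/profile",
    "agent_profile": "agents/profile",
}


def document_url(base: str, resource: str, **params) -> str:
    """Build a document URL; dict parameters (e.g. an agent) are sent as JSON."""
    query = urlencode(
        {k: json.dumps(v) if isinstance(v, dict) else v for k, v in params.items()}
    )
    return f"{base.rstrip('/')}/{DOCUMENT_RESOURCES[resource]}?{query}"
```

For example, `document_url("https://lrs.example.com/xapi", "activity_profile", activityId="http://example.com/a", profileId="p1")` addresses a single Activity Profile document on a hypothetical LRS.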