adlnet / xAPI-Spec

The xAPI Specification describes communication about learner activity and experiences between technologies.
https://adlnet.gov/projects/xapi/

Any security / abuse guidance for implementers of "State" and "Documents" resources? #1103

Open rickbatka opened 2 years ago

rickbatka commented 2 years ago

Hi,

As referenced in my other ticket, I'm implementing some LRS capability and finding it strange that the spec invites end users (by way of JS in their browser) to freely store and retrieve arbitrary state and documents on the server.

It would be trivial to author a JS "bookmarklet" that, when clicked from any page with a cmi5 / xAPI-enabled course on screen, would flood the server with malicious or large files.

As someone new to the space, it seems there should be an "offline" storage option, perhaps using browser local storage or the like. As the spec is written, it seems to me that if I want to support cmi5 / xAPI content, I have to run a public-facing, general-purpose Document and State store.

Could the spec address this at all? An appendix that talks about hardening these endpoints might be useful. Some things to address (these are legitimate questions I have after reading the spec):

Thanks so much!

thomasturrell commented 1 year ago

Should the State resource validate JSON? Should it sanitize or reject invalid JSON?

This is a good question. I think we can infer from the spec that the document resources should reject invalid JSON if the content type is application/json (note that the content type might be something else).

I believe this because section 2.2 Document Resources states:

If the document being posted or any existing document does not have a Content-Type of application/json, or if either document cannot be parsed as a JSON Object, the LRS MUST respond with HTTP status code 400 Bad Request, and MUST NOT update the target document as a result of the request.

Read literally, you could argue that the first document with a content type of application/json does not need to be valid JSON while all subsequent documents do, but that creates a very strange scenario.
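To make the two readings concrete, here is a minimal Python sketch of how an LRS might handle a POST to a document resource. It is not taken from the spec or from any existing LRS: the names `is_json_type`, `parse_json_object`, and `post_document` are hypothetical, documents are modelled as `(content_type, body)` tuples, and the shallow top-level merge reflects my understanding of how the spec describes POST. The branch for a first document applies the stricter interpretation suggested above, which is an inference rather than a quoted requirement.

```python
import json
from typing import Optional, Tuple

JSON_TYPE = "application/json"
Document = Tuple[str, bytes]  # (content_type, body); a hypothetical model


def is_json_type(content_type: Optional[str]) -> bool:
    """True when the media type, ignoring parameters like charset, is application/json."""
    if not content_type:
        return False
    return content_type.split(";", 1)[0].strip().lower() == JSON_TYPE


def parse_json_object(body: bytes) -> Optional[dict]:
    """Return the parsed body if it is a JSON Object, otherwise None."""
    try:
        value = json.loads(body)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return None
    return value if isinstance(value, dict) else None


def post_document(existing: Optional[Document],
                  incoming: Document) -> Tuple[int, Optional[Document]]:
    """Return (HTTP status, document to keep in the store)."""
    in_type, in_body = incoming
    if existing is None:
        # Assumption (the stricter reading): a first document that claims
        # to be JSON must also parse as a JSON Object.
        if is_json_type(in_type) and parse_json_object(in_body) is None:
            return 400, None
        return 204, incoming

    old_type, old_body = existing
    # Quoted rule: both documents must be application/json ...
    if not (is_json_type(old_type) and is_json_type(in_type)):
        return 400, existing
    # ... and both must parse as JSON Objects; otherwise leave the target untouched.
    old_obj, new_obj = parse_json_object(old_body), parse_json_object(in_body)
    if old_obj is None or new_obj is None:
        return 400, existing

    merged = {**old_obj, **new_obj}  # shallow merge, posted properties win
    return 204, (JSON_TYPE, json.dumps(merged).encode("utf-8"))
```

With this sketch, posting `{"b": 3}` over a stored `{"a": 1, "b": 2}` keeps `a` and overwrites `b`, while a malformed body declared as application/json is rejected with 400 whether or not a document already exists.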

@brianjmiller what is your take on this?

Should the State and Document resources implement file size restrictions? Rate limits for upload / retrieval?

Yes, they probably should, but that is outside the scope of the specification. Note that SHOULD and MUST are defined in the specification.
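Since the spec leaves this to the implementer, the following is only a sketch of what a size cap plus a per-client rate limit could look like in Python, using a token bucket. Everything here is an assumption for illustration: the `MAX_DOCUMENT_BYTES` value, the `TokenBucket` and `check_upload` names, and the choice of 411, 413, and 429 as responses (standard HTTP status codes, but not ones the xAPI spec prescribes for these resources).

```python
import time
from dataclasses import dataclass, field
from typing import Optional

MAX_DOCUMENT_BYTES = 1_000_000  # hypothetical cap; tune for your deployment


@dataclass
class TokenBucket:
    """Allows short bursts up to `capacity`, refilled at a steady rate."""
    capacity: float
    refill_per_second: float
    tokens: float = field(init=False)
    updated: float = field(init=False)

    def __post_init__(self) -> None:
        self.tokens = self.capacity
        self.updated = time.monotonic()

    def allow(self, now: Optional[float] = None) -> bool:
        """Consume one token if available; `now` is injectable for testing."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def check_upload(bucket: TokenBucket,
                 content_length: Optional[int],
                 now: Optional[float] = None) -> Optional[int]:
    """Return an HTTP error status to send, or None if the upload may proceed."""
    if not bucket.allow(now):
        return 429  # Too Many Requests
    if content_length is None:
        return 411  # Length Required: refuse bodies of unknown size
    if content_length > MAX_DOCUMENT_BYTES:
        return 413  # Content Too Large
    return None
```

In practice one bucket would be kept per credential or per registration, so that a single misbehaving client cannot exhaust the allowance of others. Checking the declared length alone is not sufficient, so a real server would also stop reading once the body exceeds the cap.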

How is the Document store used in practice? Are there a handful of common file types that cover 90% of use cases? (I really hate the idea of letting untrusted client code upload arbitrary files to my server)

Most e-learning authoring tools use the State resource to store the user's state. A user's state could include progress, score, preferences, personal data, etc. In the wild I have seen courses store a mix of content types, including binary data.

Will my LRS still function as expected with the majority of AUs created if I disable the Document store entirely? What about the State store? Is there any way to "hint" to the AU that these features are disabled?

I'm assuming you mean cmi5 AUs. No, the cmi5 specification requires the LMS to use the document store to communicate with the launched AU. The market-leading authoring tool uses the document store for courses published as xAPI courses.