adlnet / xAPI-Spec

The xAPI Specification describes communication about learner activity and experiences between technologies.
https://adlnet.gov/projects/xapi/

Binary (potentially large) data in statements #23

Closed bscSCORM closed 11 years ago

bscSCORM commented 11 years ago

CMI 5 requires the ability to include binary data in a learner's transcript.

This could be a PDF completion certificate, an image of the student's work, an audio clip (simulated communication with ATC, for example), or even a (hopefully short) video of their work.

In order to avoid cluttering up results when a list of statements is pulled, we might consider an "attachment" approach similar to email.

Each attachment would be added to a statement with the title (possibly a brief description), the content-type (mime type) of the attachment, and the size of the attachment.

Clients querying an LRS for a statement list would be able to specify a parameter to exclude attachments and receive only the above metadata.
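A minimal sketch of what the "exclude attachments" mode could do on the LRS side, assuming each statement carries a list of attachment records. The field names `attachments` and `data` and the function name `strip_attachment_data` are hypothetical, chosen only for illustration; nothing here is settled in the proposal.

```python
import copy


def strip_attachment_data(statement: dict) -> dict:
    """Return a copy of the statement with attachment payloads removed.

    Metadata (title, content type, size) is kept; only the hypothetical
    "data" field holding the payload is dropped.
    """
    slim = copy.deepcopy(statement)  # never mutate the stored statement
    for attachment in slim.get("attachments", []):
        attachment.pop("data", None)
    return slim


stored = {
    "verb": "completed",
    "attachments": [
        {
            "title": "Completion certificate",
            "contentType": "application/pdf",
            "size": 8,
            "data": "JVBERi0xLjQ=",
        }
    ],
}
listed = strip_attachment_data(stored)
```

Working on a copy means the stored statement keeps its payload, so a later request for the full statement still returns it.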

The "exclude attachments" mode SHOULD NOT be attempted for transfer from one LRS to another, and an LRS MUST NOT accept statements with attachment headers but no attachments.

The attachment data itself would be stored as BASE64 in the statement, or as a link to where the attachment data can be found along with a hash of the data.
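To make the two storage options concrete, here is a rough sketch that builds an attachment either inline (BASE64) or as a link plus hash, along with a check an LRS might run to reject attachment headers that carry no attachment. All field names (`data`, `url`, `sha256`) and function names are assumptions for illustration, as is the choice of SHA-256 for the hash.

```python
import base64
import hashlib


def inline_attachment(title: str, content_type: str, payload: bytes) -> dict:
    """Attachment with the payload embedded as a BASE64 string."""
    return {
        "title": title,
        "contentType": content_type,
        "size": len(payload),  # size of the raw bytes, not the BASE64 text
        "data": base64.b64encode(payload).decode("ascii"),
    }


def linked_attachment(title: str, content_type: str, payload: bytes, url: str) -> dict:
    """Attachment stored elsewhere, referenced by a link and a hash of the data."""
    return {
        "title": title,
        "contentType": content_type,
        "size": len(payload),
        "url": url,
        "sha256": hashlib.sha256(payload).hexdigest(),
    }


def has_payload(attachment: dict) -> bool:
    """True if the attachment has inline data, or a link together with a hash."""
    return "data" in attachment or ("url" in attachment and "sha256" in attachment)


pdf = b"%PDF-1.4"
inline = inline_attachment("Certificate", "application/pdf", pdf)
linked = linked_attachment("Certificate", "application/pdf", pdf, "https://example.com/cert.pdf")
```

An LRS following the MUST NOT rule above would refuse any statement for which `has_payload` is false on one of its attachments.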

garemoko commented 11 years ago

Agreed, this would be a very helpful use case to support. It makes the LRS an e-portfolio as well.

bscSCORM commented 11 years ago

Question from: @nhruska

Using Binary data in an activity stream is a feature already defined here: http://activitystrea.ms/specs/json/schema/activity-schema.html#binary

I know we discussed putting binary data in a few different places within the statement (and allowing flags to include/exclude binary docs during querying).

QUESTIONS:

  1. Should this be supported as-is in the Experience API? i.e. using the 'binary' objectType and having the binary file be the 'object' of the statement?
  2. They have additional properties defined besides title, contentType, and size - 'compression', 'data', 'fileUrl', and 'md5'. I propose that we use these as well so we are more fully supporting activity streams.

bscSCORM commented 11 years ago

Good find, I wasn't thinking of this when I suggested our attachment type, but they came out quite similar (there aren't too many reasonable ways to approach this problem).

We might as well use their field names where the concepts are identical. Their compression concept makes sense: it limits the size of the base64 string while still letting the consumer know what the uncompressed data type is, so we should pull that in.
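As a rough illustration of the compression idea, the sketch below compresses the payload before base64-encoding it, while `contentType` and `size` keep describing the uncompressed data. The value `"gzip"` for `compression`, the reading of `size` as the uncompressed length, and the function names are assumptions made for this example.

```python
import base64
import gzip


def encode_compressed(payload: bytes, content_type: str) -> dict:
    """Compress, then base64-encode; metadata describes the original bytes."""
    return {
        "contentType": content_type,  # type of the uncompressed data
        "size": len(payload),         # assumed: length before compression
        "compression": "gzip",
        "data": base64.b64encode(gzip.compress(payload)).decode("ascii"),
    }


def decode_compressed(attachment: dict) -> bytes:
    """Reverse encode_compressed; uncompressed attachments pass through."""
    raw = base64.b64decode(attachment["data"])
    if attachment.get("compression") == "gzip":
        raw = gzip.decompress(raw)
    return raw


transcript = b"tower, cessna 123 ready for departure. " * 25
packed = encode_compressed(transcript, "text/plain")
restored = decode_compressed(packed)
```

For repetitive data like this the base64 string ends up much shorter than the original, though already-compressed formats (most video, JPEG) would gain little.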

I figured we'd need size anyway (https://github.com/adlnet/xAPI-Spec/issues/23), but wasn't thinking about it in combination with compression; what they're doing here makes sense.

Since they support MD5, we should as well, so that statements can be converted. However, I think we should call it out as included only for activity streams compatibility, and say that activity providers SHOULD use SHA (1 or 2; we can tell them apart by hash length). MD5 is OK for detecting accidental file corruption or duplication, but it is thoroughly broken from a security point of view.
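Telling the hash algorithms apart by length could look roughly like this, assuming digests are stored as hex strings and only MD5, SHA-1, and the standard SHA-2 variants are in play. The function names are hypothetical, and truncated variants that share a length with another algorithm are not handled.

```python
import hashlib

# Hex digest length -> hashlib algorithm name.
ALGORITHM_BY_HEX_LENGTH = {
    32: "md5",      # kept only for compatibility
    40: "sha1",
    56: "sha224",
    64: "sha256",
    96: "sha384",
    128: "sha512",
}


def guess_algorithm(hex_digest: str):
    """Return the algorithm name implied by the digest length, or None."""
    return ALGORITHM_BY_HEX_LENGTH.get(len(hex_digest))


def verify(payload: bytes, hex_digest: str) -> bool:
    """Recompute the digest with the guessed algorithm and compare."""
    algorithm = guess_algorithm(hex_digest)
    if algorithm is None:
        raise ValueError("unrecognised digest length: %d" % len(hex_digest))
    return hashlib.new(algorithm, payload).hexdigest() == hex_digest.lower()


data = b"attachment bytes"
sha1_digest = hashlib.sha1(data).hexdigest()
sha256_digest = hashlib.sha256(data).hexdigest()
```

A check like `verify` catches accidental corruption with any of these algorithms; only the SHA-2 digests should be relied on where tampering is a concern.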

I had suggested we store either the link -or- the actual data; I suppose we can store both to match activity streams.

garemoko commented 11 years ago

Can this issue now be closed in favor of PR https://github.com/adlnet/xAPI-Spec/pull/57 or is there more to add on this?