hypothesis / vision

Envisioning the future of the Hypothesis.
https://github.com/hypothesis/vision/issues/

Create a way that annotated documents can be embedded on web pages. #12

Open dwhly opened 10 years ago

dwhly commented 10 years ago

DocumentCloud is a popular open source project and service that allows reference documents to be uploaded and then annotated and embedded for viewing inside web pages.

Here is an example: http://www.techdirt.com/articles/20131031/12394625090/feinstein-releases-fake-nsa-reform-bill-actually-tries-to-legalize-illegal-nsa-bulk-data-collection.shtml

There are some issues with this model. 1- The original source URL is no longer available as the canonical reference once the document is uploaded. 2- DocumentCloud documents can't be annotated by us at present, and they aren't focused on an OA annotation model right now, though we have had discussions with them about possibly helping them integrate in the future.

It might be better if the original reference document, perhaps the PDF of the bill, could be embedded (using PDF.js as the rendering engine?) inside a frame in the web page, just like DocumentCloud does, so that instead of being uploaded to a secondary service, it is streamed into the frame from its original source.

This way, annotations could be laid on top of the embedded document by reference to its proper URL: the same annotations that were made on top of the document where it lives natively.
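A minimal sketch of what generating such an embed might look like, assuming a hosted copy of the stock PDF.js viewer (which reads the document location from its `file` query parameter). The viewer location, the source URL, the `build_embed_html` helper, and the `data-source-url` attribute are all hypothetical names used for illustration; fetching the PDF from its original host would also be subject to that host's cross-origin policy.

```python
from html import escape
from urllib.parse import quote


def build_embed_html(viewer_url: str, source_url: str, height: int = 600) -> str:
    """Return an <iframe> snippet that loads source_url in a PDF.js viewer."""
    # The stock PDF.js viewer takes the document location from the `file`
    # query parameter, so the source URL is percent-encoded in full.
    src = f"{viewer_url}?file={quote(source_url, safe='')}"
    return (
        f'<iframe src="{escape(src, quote=True)}" '
        f'width="100%" height="{height}" '
        # Hypothetical attribute: records the canonical URL that
        # annotations should be anchored to.
        f'data-source-url="{escape(source_url, quote=True)}"></iframe>'
    )


snippet = build_embed_html(
    "https://example.org/pdfjs/web/viewer.html",  # hypothetical viewer location
    "https://example.gov/bills/reform.pdf",       # hypothetical source PDF
)
print(snippet)
```

Keeping the original URL on the frame is what would let the same annotations appear both here and where the document lives natively.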

csillag commented 10 years ago

Yes, we definitely have to implement embedding (pieces of) documents into other documents, together with the relevant annotations.

I would like to divide this task into two sub-tasks:

Task1 - for existing documents with embedded content: specify the relation

We need to be able to specify (either manually or automatically) if a given part is actually coming from a different document, and when we have this information, we should load the annotations from the original document.
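One way to picture the relation described above is a small record per embedded part, plus a function that lists every URL whose annotations should be loaded for a page. This is only a sketch: `EmbeddedPart`, its fields, and `annotation_targets` are hypothetical names, and how the relation gets recorded (manually or automatically) is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddedPart:
    selector: str    # where the part sits in the host page (assumed: a CSS selector)
    source_url: str  # URL of the document the part originally comes from


def annotation_targets(page_url: str, parts: list[EmbeddedPart]) -> list[str]:
    """List every URL whose annotations should be loaded for this page."""
    targets = [page_url]
    for part in parts:
        # Each original document is queried once, however often it is embedded.
        if part.source_url not in targets:
            targets.append(part.source_url)
    return targets


parts = [
    EmbeddedPart("#bill", "https://example.gov/bill.pdf"),
    EmbeddedPart("#bill-excerpt", "https://example.gov/bill.pdf"),
]
print(annotation_targets("https://blog.example/post", parts))
```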

Similar issues in our tracker:

Task2 - for publishing new documents: Provide a way to easily embed content from other documents

This feature is not intended for "mere" readers/annotators; the intended users are bloggers/journalists/other publishers.

Similar issues in our tracker:

Compared to the issue above, what you are proposing is embedding not only individual annotations/highlights, but huge pieces of the original document, or maybe the whole document.

I agree that this should be done by using an iframe (just like embedded YouTube videos).

I am not sure which approach is better:

Approach A) might be easier to implement, because it does not require any server-side services for doing this. However, we might encounter unsolvable problems because of cross-domain restrictions. (Some pages don't like to be framed inside other pages.) Also, this approach does not allow embedding segments of documents; it's all or nothing.
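The framing restriction can at least be detected up front. Pages usually refuse framing through the `X-Frame-Options` response header or a `frame-ancestors` directive in `Content-Security-Policy`. The check below is a rough sketch over an already-fetched header dictionary; `forbids_framing` is a hypothetical helper, and it is deliberately conservative (any `frame-ancestors` value other than a bare `*` is treated as blocking, since the embedding page may not be on the allowed list).

```python
def forbids_framing(headers: dict[str, str]) -> bool:
    """Return True if the response headers would block third-party framing."""
    lowered = {name.lower(): value for name, value in headers.items()}

    # X-Frame-Options: DENY and SAMEORIGIN both stop another site from framing the page.
    xfo = lowered.get("x-frame-options", "").strip().upper()
    if xfo in ("DENY", "SAMEORIGIN"):
        return True

    # Content-Security-Policy: frame-ancestors restricts who may frame the page.
    csp = lowered.get("content-security-policy", "")
    for directive in csp.split(";"):
        tokens = directive.split()
        if tokens and tokens[0].lower() == "frame-ancestors":
            # Conservative assumption: only a bare wildcard counts as "anyone may frame".
            return tokens[1:] != ["*"]
    return False


print(forbids_framing({"X-Frame-Options": "SAMEORIGIN"}))
print(forbids_framing({"Content-Type": "application/pdf"}))
```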

Approach B) is a bit more difficult, because for this to work, we have to either store the required document on our server or provide some kind of proxy service (for acquiring the wanted content and passing it to the requesting page). Furthermore, we need to decide the format to use:

But with approach B), we would not have any cross-domain problems, and we can serve arbitrary ranges of documents, upon request.
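To make "arbitrary ranges" concrete, here is a toy in-memory stand-in for the server-side store or proxy cache. Since the format is undecided in this thread, the sketch simply assumes plain text addressed by character offsets; `DocumentStore`, `put`, and `get_range` are hypothetical names, not an existing service API.

```python
class DocumentStore:
    """In-memory stand-in for server-side storage or a proxy cache."""

    def __init__(self) -> None:
        self._texts: dict[str, str] = {}

    def put(self, source_url: str, text: str) -> None:
        # Documents stay keyed by their original URL, so annotations
        # can still refer to the canonical source.
        self._texts[source_url] = text

    def get_range(self, source_url: str, start: int, end: int) -> str:
        """Return characters [start, end) of the stored document."""
        text = self._texts[source_url]
        if not 0 <= start <= end <= len(text):
            raise ValueError("range outside document")
        return text[start:end]


store = DocumentStore()
store.put("https://example.gov/bill.txt", "Section 1. Short title.")
print(store.get_range("https://example.gov/bill.txt", 0, 10))
```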