sagemathinc / cocalc

CoCalc: Collaborative Calculation in the Cloud
https://CoCalc.com

implement a PDF annotation feature #2187

Open williamstein opened 7 years ago

williamstein commented 7 years ago

Use cases:

slel commented 7 years ago

The "nb" project from MIT might be useful for this:

Quoting their slogan:

NB is an online collaborative PDF/HTML/video annotation webapp

slel commented 7 years ago

The PDF/HTML/video annotation feature offered by NB would be fantastic to have in CoCalc. It's already available outside CoCalc at http://nb.mit.edu; I am using it there and it works well. I edited my earlier comment, adding links to the code repo and the wiki. Please look into it!

slel commented 7 years ago

See nbproject/nbproject#377 on the nbproject side for CoCalc integration.

williamstein commented 4 years ago

Here's a discussion of a CLOSED SOURCE product that builds on top of pdf.js to provide annotation: https://news.ycombinator.com/item?id=22763656#22764352

This is very good to play with as a proof of concept, since it shows what is possible and how it might feel. I will likely have to just write something from scratch that is open source though, and does at least what we need.

mforbes commented 4 years ago

Just had a chat with some people at hypothes.is, a small startup using an open-source platform for annotating anything with a URL, and some PDF annotation features. It seems that this might provide a very nice option for integration within CoCalc.

mforbes commented 4 years ago

I tried using hypothes.is with the CoCalc PDF readers (using their Chrome extension), but something about the PDF reader breaks this completely. The Canvas and SVG options do not allow you to select text, so there is nowhere for hypothes.is to anchor. Switching to the Native viewer allows one to select text, but hypothes.is is inactive until you toggle it off and on, at which point the PDF tries to reload, failing with an "Invalid or corrupted PDF file." error.


It seems like the hypothes.is extension tries to do something with the PDF file first? I know that it somehow looks for some sort of tag in the PDF file (this is apparently how it stores and organizes the annotations, which are usually organized via URL), but something about CoCalc's presentation of the PDF is confusing it, and vice versa.

slel commented 3 years ago

There is now an officially released version of nb, by the nb project from the Haystack team at CSAIL at MIT.

Its rollout at https://nb2.csail.mit.edu was recently announced:

mforbes commented 2 years ago

Just brainstorming:

Annotating PDF files in the LaTeX editor might be a unique "killer" feature over tools like Overleaf. Will also be a very stringent test on any technology to keep such annotations in the right spot as the underlying document changes.

This will also completely fail if the annotations are embedded in the PDF, which gets replaced, so a feature like this would require storing annotations outside the file (as well as embedding them?). The Hypothes.is model stores all annotations externally. I can see advantages to both methods. The Skim app on Mac OS X offers both (it might be worth looking at the format they use).
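To make the "store annotations outside the file" idea concrete, here is a minimal sketch of a sidecar format. It assumes each annotation is anchored by the quoted text plus a little surrounding context, similar in spirit to a text-quote selector. The `Annotation` class, the helper names, and the JSON layout are all hypothetical, not anything CoCalc, Hypothes.is, or Skim actually uses.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Annotation:
    exact: str   # the quoted text the note is attached to
    prefix: str  # a little context before the quote
    suffix: str  # a little context after the quote
    note: str    # the annotation body


def make_annotation(text: str, start: int, end: int, note: str,
                    context: int = 20) -> Annotation:
    """Build an annotation from a selection [start, end) in the extracted text."""
    return Annotation(
        exact=text[start:end],
        prefix=text[max(0, start - context):start],
        suffix=text[end:end + context],
        note=note,
    )


def dump_sidecar(annotations: list) -> str:
    """Serialize annotations to JSON that would live next to the PDF."""
    return json.dumps([asdict(a) for a in annotations], indent=2)


def load_sidecar(payload: str) -> list:
    """Read annotations back from the sidecar JSON."""
    return [Annotation(**d) for d in json.loads(payload)]


# Example: annotate a phrase, write the sidecar, and read it back.
text = "We prove the main theorem in Section 3 using a compactness argument."
start = text.index("main theorem")
ann = make_annotation(text, start, start + len("main theorem"),
                      "Cite the original source here.")
payload = dump_sidecar([ann])
restored = load_sidecar(payload)
```

Because nothing here touches the PDF itself, the sidecar would survive the PDF being regenerated; whether the anchors still match the new text is a separate problem.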

williamstein commented 2 years ago

Will also be a very stringent test on any technology to keep such annotations in the right spot as the underlying document changes.

I continue to like your suggestion to just declare what that technology does, and when something gets lost, to explicitly put it in a "lost notes" location. The fun part is to then use TimeTravel and see how good we can get at not losing things. Storing annotations as metadata in another file is interesting since it is potentially much more general.
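As a rough sketch of the "lost notes" idea: assuming each note remembers the text it was attached to plus a short prefix, re-anchoring after a rebuild could try the quote with its context first, then the bare quote, and move anything it cannot find into a separate lost list. The `Note` class and `reanchor` function are hypothetical names for illustration only, and plain substring search stands in for whatever matching a real implementation would use.

```python
from dataclasses import dataclass


@dataclass
class Note:
    exact: str   # text the note was attached to
    prefix: str  # context immediately before that text
    body: str    # the note itself


def reanchor(notes, new_text):
    """Return (anchored, lost): anchored is a list of (offset, note) pairs."""
    anchored = []
    lost = []
    for note in notes:
        # Prefer a match that includes the context, then fall back to the bare quote.
        pos = new_text.find(note.prefix + note.exact)
        if pos != -1:
            pos += len(note.prefix)
        else:
            pos = new_text.find(note.exact)
        if pos == -1:
            lost.append(note)  # explicitly kept, never silently dropped
        else:
            anchored.append((pos, note))
    return anchored, lost


# Example: one quote survives an edit, the other is deleted.
notes = [
    Note(exact="Lemma 2", prefix="exists by ", body="Which lemma?"),
    Note(exact="proof is routine", prefix="The ", body="Spell this out."),
]
new_text = "First, the limit exists by Lemma 2. Details are in the appendix."
anchored, lost = reanchor(notes, new_text)
```

Running the same function against older versions of the document is one way the TimeTravel experiment could measure how often notes end up lost.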

williamstein commented 1 month ago

Important update: I added support for the text layer to PDF rendering with pdf.js. I did this to enable text selection, but it's also critical for annotation support.