documentcloud / documentcloud

The DocumentCloud platform
https://www.documentcloud.org
MIT License
424 stars 162 forks

Publish note on private doc #136

Open hobbes7878 opened 9 years ago

hobbes7878 commented 9 years ago

In cases where we have a rolling series of stories that draw on parts of a single doc for each installment, I think it would make sense to allow embedding a public note on a private doc but restrict users from clicking back through to the original until the doc is made public in the final installment.

jashkenas commented 9 years ago

That's a really interesting concept, and use case...

But I think that it over-complicates the more-perfect solution: Having a different (perhaps edited and excerpted) doc for each installment of the story.

Having a public note embedded from a private doc is not ideal — because DocumentCloud doesn't have any way of knowing what's public, and what's private within the doc. The images for just the pages that happen to have embedded notes would have to be made public on S3 ... but perhaps the rest of the page still needs to be private — so leaking the surrounding page image would be a mistake.

Ultimately, it would make it too confusing and too easy for folks to accidentally leak portions of private docs — let's not go there.

hobbes7878 commented 9 years ago

The less-than-perfect part of the multi-doc solution: if a reporter is putting copious notes in the doc for each installment, i.e., aggregating up to the last story, those notes would have to be transferred between each iteration of the doc until the last, when we have the complete story.

But I get the fuzziness around dealing with parts of a page image.

reefdog commented 9 years ago

This really is fascinating.

@hobbes7878 Is this about actually securing the source document (i.e., you truly don't want readers seeing the entire source doc, but do want them seeing selections via notes)? Or is it about more advanced workflow and publishing options? Because I can envision the usefulness of something like note collections with their own access controls, as well as more granular access controls like "private to organization" on a public doc.

Wondering if things like that would actually be more useful to you than public notes on private docs, which as @jashkenas says would be really tricky.

knowtheory commented 9 years ago

One possible alternative is actually excerpting the highlighted region for a note and keeping a separate asset for it. Text is a little complicated for the time being, but if we begin to track bounding boxes for text, that also may be possible. Essentially breaking notes out as real objects a little more independent from their parent document/page.
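
To make the excerpting idea concrete, here is a minimal sketch in Python of cropping a note's highlighted region out of a page image so it could be stored as its own asset. Everything in it is a hypothetical illustration, not DocumentCloud's actual code: the `NoteRegion` type, the `excerpt_note` function, and the representation of a page as rows of pixel values are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical stand-in for a page image: a list of rows of pixel values.
PageImage = List[List[int]]


@dataclass(frozen=True)
class NoteRegion:
    """Hypothetical bounding box of a note, in pixels (bottom/right exclusive)."""
    top: int
    left: int
    bottom: int
    right: int


def excerpt_note(page: PageImage, region: NoteRegion) -> PageImage:
    """Return only the pixels inside the note's region.

    The box is clamped to the page bounds, so nothing outside the
    highlighted area (and nothing outside the page) ends up in the excerpt.
    """
    height = len(page)
    width = len(page[0]) if page else 0
    top = max(0, min(region.top, height))
    bottom = max(top, min(region.bottom, height))
    left = max(0, min(region.left, width))
    right = max(left, min(region.right, width))
    return [row[left:right] for row in page[top:bottom]]


# A tiny 3x4 "page" whose pixel values encode their row and column.
page = [[r * 10 + c for c in range(4)] for r in range(3)]
excerpt = excerpt_note(page, NoteRegion(top=1, left=1, bottom=3, right=3))
```

Under this approach only the excerpt would need to be published, so the surrounding page image could stay private.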

hobbes7878 commented 9 years ago

In practice, it becomes more of a workflow issue for reporters, I guess. Problem is replicating the work they do marking up a doc across several copies truncated for each published installment. Also breaks up the utility of DC as a unified platform for investigating a doc as you go.

I like @knowtheory's idea of notes as first class objects, because we do treat them as pretty autonomous content. Note collections sound great.

I suppose there is a security aspect, but it's more like you suggest: releasing notes in tranches rather than absolutely protecting parts of docs from public view (which should be a redaction).

anthonydb commented 9 years ago

Notes independent of documents would be an interesting feature, although we'd need to provide a way for users to edit them in batches (e.g. make all these notes public at once, or add a bit of text to this dozen en masse). I think this request ultimately comes down to storytelling preference. In some newsrooms, the prevailing wisdom is to open the whole document up to the reader at the beginning of the series rather than reveal pieces as you go. I would recommend @jashkenas's solution as a solid workflow in the meantime.
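
As a rough sketch of what batch editing might look like if notes were independent objects, the Python below applies one change to a selected set of notes. The `Note` type, its `access` field, and the `batch_update` function are hypothetical names invented for this illustration; they do not describe an existing DocumentCloud API.

```python
from dataclasses import dataclass, replace
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Note:
    """Hypothetical standalone note with its own access level."""
    id: int
    text: str
    access: str = "private"


def batch_update(
    notes: Iterable[Note],
    ids: Iterable[int],
    access: Optional[str] = None,
    append_text: str = "",
) -> List[Note]:
    """Return a new list in which the selected notes carry the same edit.

    Notes whose id is not in `ids` are returned unchanged.
    """
    selected = set(ids)
    updated = []
    for note in notes:
        if note.id in selected:
            note = replace(
                note,
                access=access if access is not None else note.access,
                text=note.text + append_text,
            )
        updated.append(note)
    return updated


notes = [Note(1, "Budget table"), Note(2, "Signature"), Note(3, "Memo date")]
# Publish the first two notes at once and tag them with the installment.
published = batch_update(notes, ids=[1, 2], access="public", append_text=" (Part 1)")
```

Returning new `Note` values instead of mutating in place keeps the original list available, which would make it easier to preview a batch change before committing it.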