hypothesis / product-backlog

Where new feature ideas and current bugs for the Hypothesis product live

Export annotations #566

Closed klemay closed 4 months ago

klemay commented 6 years ago

Problem:

Users frequently request the ability to export annotations from the UI. @judell's utility at http://jonudell.net/h/facet.html allows for this, but it's not an official part of the product. I'm creating this issue because #394 in the h repository was closed and I can't find an open issue here.

User requests

(note: this contains requests from time of issue creation onwards - still need to go back and get previous requests to reflect how often this comes in)

User Story

As a user, I would like the ability to export my annotations, from whatever view I'm seeing them on the user and group profile pages, or from within the current view of the sidebar, so that I can use my annotation data outside of the Hypothesis application. Use cases for annotation data include, but are not limited to:

Acceptance Criteria

lyzadanger commented 6 years ago

I'd like to voice my strong support for this feature. I think it is very important.

segdeha commented 6 years ago

Owning your data (i.e. being able to export and delete your data) is a central tenet of GDPR. It's not on the short list for May 25 because we technically have a way (via the API) for users to export their data, but we do need to build a feature that makes it simple for non-developers.
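
For reference, a rough sketch of what the API route involves for a developer. It assumes the public search endpoint at https://api.hypothes.is/api/search, a personal API token sent as a Bearer header, and results returned under a `rows` key; the function names (`build_search_url`, `fetch_all`, `make_fetcher`) are made up for illustration and are not part of any Hypothesis library.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_ROOT = "https://api.hypothes.is/api"  # assumed public API root


def build_search_url(username, limit=200, offset=0):
    """Build the search URL for one page of a user's annotations."""
    params = {
        "user": f"acct:{username}@hypothes.is",
        "limit": limit,
        "offset": offset,
    }
    return f"{API_ROOT}/search?{urlencode(params)}"


def make_fetcher(token):
    """Return a function that GETs a URL with the token and parses the JSON."""
    def fetch_json(url):
        request = Request(url, headers={"Authorization": f"Bearer {token}"})
        with urlopen(request) as response:
            return json.load(response)
    return fetch_json


def fetch_all(fetch_json, username, page_size=200):
    """Page through results until a short (or empty) page comes back."""
    rows, offset = [], 0
    while True:
        page = fetch_json(build_search_url(username, page_size, offset))
        batch = page.get("rows", [])
        rows.extend(batch)
        if len(batch) < page_size:
            return rows
        offset += page_size
```

Something like `fetch_all(make_fetcher(my_token), "jdoe")` would then collect one user's annotations. Paging limits should be checked against the API documentation before relying on this for large accounts. Even in this small form, it shows why the route is not realistic for non-developers.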

klemay commented 6 years ago

We have a request in the Chrome web store to export annotations in context, i.e., along with the source text. This could technically be done with screenshots, but it's worth logging as a feature request (and a use case not covered by Jon's utility).

It would be great if you could export the annotations with the page to a pdf file.

ajpeddakotla commented 5 years ago

Business Viability: 2
User Engagement: 3
Building Community: 1
Level of Effort: 2

ajpeddakotla commented 5 years ago

Issue updated - original comment has been updated to add user story and acceptance criteria @dwolfe @dwhly

klemay commented 5 years ago

To see the questions/requests that have come in before, see https://hypothesis.zendesk.com/agent/admin/tags

Click on "export" and you'll see current & past issues about exporting.

Note: when you follow the link above, you will be prompted to sign in with Google. Use zendesk@hypothes.is for the email address. The password is in Passpack - ask Dan for access if you can’t see it.

dwolfe commented 5 years ago

Proposed design for first iteration of export: [image: screen shot 2018-10-16 at 7 35 05 am]

This button would be to the right of the search results title ("955 Matching Annotations"). We may elect to move/change this in the near future, but this is a functional, discoverable solution for now, which will allow us to write the backend code to support export/download.

lyzadanger commented 5 years ago

@dwolfe This is a nice and consistent approach overall. Like! Some quick thoughts!

jeremydean commented 5 years ago

Looks good.

Are we at this point not yet talking about what formats folks would be able to use? If we are, then I'd say there needs to be a plain text or HTML option. Most edu users, I'd venture to guess, do not know what JSON or CSV are.

A big question here (though perhaps for later implementation) is what the user actually wants to see when they download. The full standard data package associated with an annotation, time stamps, etc. is probably NOT what most edu (and I'd guess scholarly) users would want. In most cases I'd guess they would just want to see all referents and annotations. And if they are looking across users, then probably the creator too. If and when it's helpful, I'm happy to write up some user stories.

Also, I'd just say "Download."
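
To make the "referents and annotations, plus creator" idea concrete, here is a minimal sketch of a plain-text rendering. It assumes annotations shaped like the JSON the API returns (a `text` field, a `user` field such as `acct:jdoe@hypothes.is`, and a `TextQuoteSelector` holding the highlighted passage in `exact`); the function names and the output layout are illustrative only, not a proposed design.

```python
def quoted_text(annotation):
    """Return the highlighted passage (the referent), or '' if there is none."""
    for target in annotation.get("target", []):
        for selector in target.get("selector", []):
            if selector.get("type") == "TextQuoteSelector":
                return selector.get("exact", "")
    return ""


def to_plain_text(annotations, include_creator=False):
    """Render each annotation as a short block: quote, note, optional creator."""
    blocks = []
    for annotation in annotations:
        lines = []
        quote = quoted_text(annotation)
        if quote:
            lines.append(f'Quote: "{quote}"')
        if annotation.get("text"):
            lines.append(f"Note: {annotation['text']}")
        if include_creator:
            # "acct:jdoe@hypothes.is" -> "jdoe"
            user = annotation.get("user", "")
            lines.append("By: " + user.split(":", 1)[-1].split("@")[0])
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)
```

Timestamps, selectors, and permissions are deliberately left out here, which is the point of the comment above: the reader sees only what was highlighted, what was said, and (when looking across users) who said it.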

lyzadanger commented 5 years ago

> Also, I'd just say "Download."

At first glance, that seems sensible to me, too. But I'd love to see this button in context to see whether the extra wording helps.

> A big question here (though perhaps for later implementation) is what the user actually wants to see when they download.

While we do need to be spec-compliant with the JSON-LD download format, thanks for reminding us that we should be thoughtful when designing the CSV/Excel-type output. I do think the spec can serve as a guide/starting point (the fields that make sense, minus the arcane ones, of course) that we can build on—keeping as much continuity as is reasonable between the formats will make the stability and the quality of the export feature stronger, IMO.
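
As one possible reading of "the fields that make sense, minus the arcane ones", the sketch below flattens annotations into CSV with a small column set. The column list (`CSV_FIELDS`) and the helper names are assumptions for illustration; the input is assumed to look like the JSON returned by the API, with `created`, `user`, `uri`, `text`, `tags`, and a `TextQuoteSelector` under `target`.

```python
import csv
import io

# Hypothetical column choice: readable fields only, no selectors or permissions.
CSV_FIELDS = ["created", "user", "uri", "quote", "text", "tags"]


def first_quote(annotation):
    """Return the first highlighted passage found in the annotation's targets."""
    for target in annotation.get("target", []):
        for selector in target.get("selector", []):
            if selector.get("type") == "TextQuoteSelector":
                return selector.get("exact", "")
    return ""


def to_csv(annotations):
    """Serialize annotations to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CSV_FIELDS, lineterminator="\n")
    writer.writeheader()
    for annotation in annotations:
        writer.writerow({
            "created": annotation.get("created", ""),
            "user": annotation.get("user", ""),
            "uri": annotation.get("uri", ""),
            "quote": first_quote(annotation),
            "text": annotation.get("text", ""),
            # Tags are a list in the JSON; join them so they fit in one cell.
            "tags": "; ".join(annotation.get("tags", [])),
        })
    return buffer.getvalue()
```

Keeping the column names identical to the JSON field names is one way to preserve continuity between the two formats.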

dwolfe commented 5 years ago

@lyzadanger I have an assumption about this feature in general, which is that users generally won't download a set of search results without a pre-existing reason, and that they'll have a format in mind before they do. That might lessen the need to explain the formats, but I'm a fan of being explicit where we can, so we can definitely explore adding a short description beneath the format titles.

@jeremydean I'd default to providing the full data package and letting the user slice/dice as needed. Again, I feel like folks will have some idea of how they're going to use the data before they decide to download, won't they? Although, I can see the logic in removing any data element that isn't surfaced publicly at some point.

RE: "Download" - We have a lot of room on the right side of the header here, and adding "these results" answers the question "Download what?" that users will have when they first see it. In future iterations it might make sense to remove it, but for launching the feature I think it's better to be explicit.

jeremydean commented 5 years ago

> Again, I feel like folks will have some idea of how they're going to use the data before they decide to download, won't they?

Yes, but if this button is to appear on activity pages, what you see there doesn't exactly capture what the package will contain, both literally and from the user's eye in the UI. But I'm down to wait and see. A v1 here will be huge both for our mission and for users.

> letting the user slice/dice as needed.

This assumes they know how to do this if we don't give them an interface to do so. But again, I'm down to wait and see.

Are there user stories written up for this feature somewhere @ajpeddakotla?

dwolfe commented 5 years ago

@ajpeddakotla added user stories and acceptance criteria in the original issue description above, but that was a recent edit.

I forgot one comment before: I only have two formats shown because that's the minimum number to justify a dropdown/select menu. My assumption is that we'll add more formats (e.g. HTML and/or PDF, I know those have been mentioned) after we establish the interaction necessary to access them.

dwhly commented 5 years ago

> My assumption is that we'll add more formats (e.g. HTML and/or PDF, I know those have been mentioned) after we establish the interaction necessary to access them.

Not quite sure whether you mean "as we're designing this feature, and before we finish it" or "at some point after we ship v1". Strong vote for option a) vs b). Definitely agree that being able to simply copy and paste out of HTML or TXT is a key 3rd option here. Vote would probably be for HTML b/c we could render external images and video there.

lyzadanger commented 5 years ago

> Not quite sure whether you mean "as we're designing this feature, and before we finish it" or "at some point after we ship v1".

I do want to make sure we keep the definition of this feature in chunkable iterations, however. I feel a tad overwhelmed at all of the possible output formats. JSON-LD and CSV/Excel, based heavily on the standard? That seems like a containable, chunkable task.

Put another way, I wouldn't want to hold back shipping the core formats because we're wallowing in PDF generation and HTML rendering :). Not that we shouldn't ever do that. I feel like the risk of launching with too few formats is lower than trying to attack every format at once.

(Just my humble $.02).

dwolfe commented 5 years ago

@dwhly: I did actually mean b), in the agile sense of shipping a useful feature that we expect to augment later, based on user feedback (which we'll get by shipping). I know we have existing requests for different formats, but shipping a subset of everything we know users want will (theoretically) be quicker than developing the full set, and also let users weigh in on things like UX, which data elements we should include/exclude, etc.

dwhly commented 5 years ago

Over to @ajpeddakotla.
(Though would definitely agree PDF shouldn't be on the MVP list regardless.)

The user story for HTML btw is "I want to be able to easily cut and paste a set of annotations into a document I'm working on, or an email, etc." We hear this a ton.

klemay commented 5 years ago

Re: "Download" vs "Download these results" - just want to remind us about screenreader users. "Download search results" or something more specific would be preferable. Agreed with @lyzadanger and @dwolfe that we probably don't want to get too caught up in this until we see it in context though!

judell commented 5 years ago

> based on user feedback (which we'll get by shipping)

We can also, of course, refine our user stories by checking in with lead users of the standalone exporter.

jeremydean commented 5 years ago

Totally agree with let's ship and get feedback and iterate.

My comments weren't meant to slow that process, just to offer some thoughts based on my experience.

judell commented 5 years ago

I believe JSON should be JSON-LD

Note that JSON-LD is currently a one-at-a-time deal. https://github.com/hypothesis/h/compare/search-as-jsonld intended to apply the JSON presenter to all search results.

bijang commented 5 years ago

I would like to show support for this idea. My personal workflow would resemble what the Zotero extension ZotFile does: extract highlights and annotations (for me, to plain text in Emacs; alternatively docx/odt, etc.).

dwhly commented 5 years ago

I've added additional criteria above to support export of the current view from within the sidebar, in addition to activity pages.

It might not be the first thing we ship, but we shouldn't forget that many people will want to create a bunch of annotations and export them immediately. Or they will want to export annotations while looking at a page, without having to recreate the same view in activity pages to accomplish the task.

dwhly commented 5 years ago

Another request for text output today (HTML or markdown):

"What I'd ideally be able to do is get some Markdown for a given web document with the title of and link to the document and a list of each appropriately formatted highlight or comment with a Hypothesis link to each highlight/comment in context (happy to settle for HTML)."
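
A minimal sketch of the Markdown output this request describes: a title linked to the document, then one list item per highlight or comment with a link back to it in context. It assumes each annotation carries a `links.incontext` URL and a `TextQuoteSelector`, as in the JSON the API returns; `to_markdown` and the exact layout are hypothetical.

```python
def highlight_of(annotation):
    """Return the highlighted passage, or '' for a page-level note."""
    for target in annotation.get("target", []):
        for selector in target.get("selector", []):
            if selector.get("type") == "TextQuoteSelector":
                return selector.get("exact", "")
    return ""


def to_markdown(title, url, annotations):
    """Render a document's annotations as a Markdown list under a linked title."""
    lines = [f"# [{title}]({url})", ""]
    for annotation in annotations:
        quote = highlight_of(annotation)
        item = f'- "{quote}"' if quote else "- (page note)"
        link = annotation.get("links", {}).get("incontext", "")
        if link:
            item += f" ([in context]({link}))"
        lines.append(item)
        note = annotation.get("text", "")
        if note:
            # Indent the comment so it stays attached to its list item.
            lines.append(f"  {note}")
    return "\n".join(lines)
```

The same structure would translate directly to HTML for users who would rather paste into an email or a word processor.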

ajpeddakotla commented 5 years ago

Something I came across perusing the web: Data Transfer Project - a data transfer service we could integrate with.

Website: https://datatransferproject.dev/
GitHub: https://github.com/google/data-transfer-project/blob/master/Documentation/Integration.md

tmabraham commented 4 years ago

Hello,

Is there any progress on this? I would love to be able to export PDF annotations (described here)!

klemay commented 4 years ago

@tmabraham no progress yet, but we are planning to work on this soon!

tmabraham commented 3 years ago

@klemay I see you have removed this from the hypothesis public roadmap... Is this a feature that isn't being worked on anymore?

klemay commented 3 years ago

Hi Tanishq - this feature will be part of a larger project we’re working on rather than its own project

tmabraham commented 3 years ago

@klemay What do you mean by "larger project"? Is there a separate project from the Hypothesis annotation client?

klemay commented 3 years ago

@tmabraham By "project" I mean "a new set of features that we are developing as one larger piece of work" - sorry for the confusion there!

Jackiexiao commented 3 years ago

I hope this feature could be implemented soon! It would be wonderful if it looks like this image

kitaev-chen commented 3 years ago

It's too slow... I saw that another product has this export function now and only lacks a PDF annotation function: https://pagenote.cn/

thelazyoxymoron commented 3 years ago

Although this would be just another user's request, this feature would be a huge boost to my workflow (and I'm sure for countless others as well). I routinely highlight any interesting thing I read on the web (articles, books, PDFs, etc.) and need to go through them later. Having a PDF/CSV export of all my annotations would be extremely useful.

Can we expect any deadline/projected timeframe for this feature? Thanks!

klemay commented 3 years ago

@thelazyoxymoron no projected timeframe at the moment. In the meantime you can get a csv export of your annotations using this tool: https://jonudell.info/h/facet/

kitaev-chen commented 3 years ago

The best format to export to, I guess, is a Markdown file.

NightMachinery commented 2 years ago

Adding a link to the exports that opens the webpage at the highlighted place (context) would be great. This should be possible using Chrome's text fragment links, ala this:

https://en.wikipedia.org/wiki/Field_(mathematics)#:~:text=a%20field%20has%20two%20operations%2C%20called%20addition%20and%20multiplication%3B%20it%20is%20an%20abelian%20group%20under%20addition%20with%200%20as%20the%20additive%20identity%3B%20the%20nonzero%20elements%20are%20an%20abelian%20group%20under%20multiplication%20with%201%20as%20the%20multiplicative%20identity%3B%20and%20multiplication%20distributes%20over%20addition.
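
A small sketch of how an exporter could build such a link from a page URL and the highlighted text. The `#:~:text=` prefix and percent-encoding follow the example URL above; `text_fragment_url` is a hypothetical helper name. The hyphen is encoded explicitly because it has a special meaning inside a text directive and `urllib.parse.quote` leaves it alone.

```python
from urllib.parse import quote


def text_fragment_url(page_url, highlighted_text):
    """Return page_url with a text-fragment directive for the highlighted text."""
    # Drop any existing fragment so the directive is the only one (a simplification).
    base = page_url.split("#", 1)[0]
    # Percent-encode everything reserved, then the hyphen, which quote() skips.
    encoded = quote(highlighted_text, safe="").replace("-", "%2D")
    return f"{base}#:~:text={encoded}"
```

Very long highlights produce very long URLs, so an exporter might prefer to encode only the first and last few words of the passage; that refinement is not shown here. Browsers without text-fragment support will simply open the page at the top.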

betamigo98 commented 2 years ago

Here is a template-based solution coming from Memex and Obsidian, proposed in this Reddit post.

Taken as-is from the post, it looks like the block below. I am new to GitHub, so please forgive me if I did something wrong. Thank you.

[{{{PageTitle}}}]({{{PageUrl}}})

{{#Notes}} =={{{NoteHighlight}}}== *{{NoteText}}*

{{/Notes}}

milahu commented 4 months ago

bash script to download all annotations for a user

hypothesis-annotations-scraper

robertknight commented 4 months ago

An initial import/export feature has been added to the Hypothesis client. See https://web.hypothes.is/help/exporting-and-importing-annotations/ for details. This doesn't include every sub-feature from the original issue, which we should have scoped more narrowly. Please open new issues for specific improvements.