Continuity of E-Library

Okan-Ozcelik commented 1 year ago

Classic books have been reprinted and reprinted for centuries. The paper books on the shelves wear out and disappear. But the content of that book remains permanent because it is printed over and over again. What matters is the content. Users change their computers every 3-5 years. But they keep the personal content they have created. They carry their documents and music to their new computers. They are permanent. It is continuous. E-Book Readers get old like paper books on the shelf. But their content needs to be preserved. Readers may want or even have to change their devices every now and then. And of course, they will want to move the content of their e-books to the new device. They use Adobe Digital Editions to send the books to the new device.

W3C is developing EPUB, the universal book format. EPUB becomes the e-book standard. So it can be read on any e-book reader that supports that standard. Now that it has become a standard, it brings to mind MP3s, which can be listened to on almost every computer or music player. But perhaps something is missing.

What makes a book valuable are the parts that the reader likes. They highlight their favorite texts. Maybe they take notes on the page. With effort, they create their personal content. Of course, they will hope for the permanence of this content. New books are archived in the library after they have been read. But most of the books in the library will never be reread from beginning to end. Instead, only notes taken and texts highlighted are reviewed. Personal libraries last for decades. But devices, unfortunately, only last a few years. E-books are superior because they don't wear out, because their data never disappears. But does the e-library really never disappear when it needs to be moved to a new e-reader?

When a note is made on a book, the note is not actually saved in the book. It is saved in a note file linked to the book. Later, when the book is opened, the relevant parts of the note file are parsed and displayed on the book again. But different devices have different ways of saving notes. There is no harmony. E-books can be moved to the new device, but notes cannot be transferred to the new device. On the new device, the book is as if the cover has never been opened. All the notes taken are gone!

For text highlighting to be more than a momentary technological entertainment, it must be permanent. It would be a waste of time to underline an important text while reading it if it cannot be permanent.

This is why notes can also be saved in a universal format. It can be a note file with the same name in the same folder as the book. W3C could set standards for saving notes. Perhaps this could be the EPUBNotes file type.

The device settings can now offer the following options: Notes can be saved according to the device's own note standard, or they can be saved according to the EPUB note standard. If the reader chooses to save according to the EPUB standard, the notes will be saved according to the rules set by W3C.

Imagine a future reader replacing his e-book reader: He sends the old EPUB books he bought to the device. But the device recognizes these books as new. There are no more notes. But when the reader copies the note files of the books from the old device to the new device, he will get his notes back. Any device that supports the universal EPUB format will also support these note files. E-book reader devices are ephemeral. The e-library and the personal content that the reader creates in the book must be sustained.

In fact, a perfect solution would be for W3C to develop a standard for the e-reader to save book notes on top of the EPUB file. Notes taken on PDF files are saved in the PDF file. This ensures the permanence of the notes. This is how the notes for EPUB books should be! Perhaps W3C could collaborate with Adobe on this.

Publishers will be pleased to see the design of the e-book improved. But why should they care about the quality of the reader-generated content? Why should they ask W3C to improve that too? The reader gets more out of the book with the content they create. It is the reader's favorite parts that make the book valuable. They underline their favorite texts. They remember more parts of the book. So they can talk about the book at length to their friends. They can often bring up different parts of it. The reader will, of course, be advertising the book! The more the reader remembers about the book, the longer they will keep it on the agenda. Some of their friends will want to own the book. Then they will advertise it to their friends. The process starts to work. Now we can hope that the sales of that book will increase. It will make the publisher of the book happy.

In fact, this also makes it easier for e-reader manufacturers. They don't have to design new software algorithms for taking notes on the book. The standard rules are already in place. Manufacturers just need to write the appropriate program.

P5music commented 1 year ago

Hello Okan-Ozcelik

Many years ago I created an ePub reader that does what you seem to be wanting.

It can annotate the ePub with text decorations (such as underlining and others), short text notes, or even documents of any format. The reader puts special SVG drawings as icons on the exact character or on a position on an image. The notes are saved separately from the ePub. They can also be hyperlinks. The folder containing all those notes can in theory stay in the cloud. The app is based on the Google Android SAF (Storage Access Framework), so it should work on the device as well as on the cloud, but Google itself did not fully deliver some functions of this framework. I mean the intent-based version of the Google API or SDK. There are also official APIs but they are difficult for a developer to use. The SAF is simpler and could easily be adopted as a new sort of filesystem in my app with minor changes. Nobody talks about this issue, but creating a folder on the cloud and accessing it as a filesystem just does not work. This was the promise, but Google led the developers to adopt SAF, then SAF locked the apps in the device instead of opening them to the cloud. Official issues on the issue tracker were dismissed. I also asked on the Nextcloud forum but nobody answered. I do not think this is my fault because other providers do not implement that feature either. I hope I am wrong but this is my experience. My app is able to sync the notes across different devices.

If you are curious the app is at: https://play.google.com/store/apps/details?id=com.studycomfort.app&hl=en

It also has other advanced features like diacritic text search and an advanced navigation system.

But please take a look at this post

https://github.com/w3c/publishingcg/issues/24

where a particular problem is addressed.

The annotation positioning is sort of "proprietary" in my app, so it is not standard. I tried to migrate to the epubcfi standard, but the one library I found to handle it did not work for me; if I remember correctly it was only half implemented, and I need to be able to retrieve the position. However, the epubcfi method does not seem reliable either, because ebooks can change: even minor HTML changes break the method currently implemented in my app, and I think they break epubcfi positions too.

What do you think? Regards

P5music commented 1 year ago

Ah Ah now I realize it is probably from ChatGpt

Okan-Ozcelik commented 1 year ago

Thanks for your interest in the subject. Yes, I hope that W3C will be able to develop a solution like Open Annotation. Otherwise, it might encourage readers to convert their books to PDF.

P5music commented 1 year ago

@Okan-Ozcelik I read the Open Annotation proposal and draft documentation. Unfortunately I think it does not envision what is needed in the future or solve the issues (like the one I pointed out); what I see is just the usual bureaucratic-technical method, with strange choices made for data, attributes, schemes and so on. Maybe I am wrong and I remember badly, but having that standard could make annotations on ePubs worse rather than better. I do not know, maybe I am just talking about an old draft. Should I be interested in it again? Regards

Okan-Ozcelik commented 1 year ago

@P5music You don't need to work on developing a solution. W3C needs to deal with the issue. It is necessary to convey the issue to W3C. I wrote to W3C's e-mail list. But I'm not sure if it actually reached them.

TzviyaSiegman commented 1 year ago

This issue is puzzling to me. @P5music Who did you write to? There are hundreds of W3C mailing lists. What do annotations have to do with e-libraries? What does chatGPT have to do with any of this? Please do not confuse writing a specification with adoption of standards. Is this issue about the ability to take notes in ebooks? share notes across platforms?

P5music commented 1 year ago

@TzviyaSiegman Sorry for the misunderstanding. ChatGpt is not relevant here. For the other issues, my post was clear enough I think. I do not understand your reply. I do not mind W3C mailing lists at all. I did not write to anyone, I just dismissed the Open Annotation stuff, but maybe I am wrong. Regards

Okan-Ozcelik commented 1 year ago

@TzviyaSiegman Annotations include every note the reader takes on the book. Highlighting the text, for example, is one of them. The reader thus personalizes their e-library. Like notes on paper books, the reader hopes that the notes they take on their e-books will be permanent. So annotations are an integral part of the e-library.

wareid commented 1 year ago

Hi @Okan-Ozcelik I think I understand better now. But let me try to summarize and you can confirm.

Annotations are just as important to the text for a user as the text itself. You are looking for a way we can clearly attach annotations to an ebook that persists. I assume it would also be interoperable across platforms (like if I annotate my book on Apple Books, and then open the file in Thorium, my highlights and notes carry across).

Am I understanding this correctly?

AudreyLBE commented 1 year ago

@wareid this is how I have understood the issue and I will add my vote for the importance of preserving notes. Currently I go to a lot of trouble to save my notes by exporting them to Calibre and making backups of the database from my reader, but they are dissociated from the ebook file and therefore of limited usefulness, and this method is dependent on a plugin developed specifically for one brand of reader so not universal.

I recently had to repair the corrupted database of my Kobo by replacing it with a previous backup. I made sure to restore my annotations as well; however, some of them were lost (even though only the database was corrupted and none of my ebooks were damaged). This process was tedious and probably not accessible to many people, as it involved using SQL database editors to manipulate specific tables of the database before copying it back to the reader, which I had to learn to do specifically for this issue.

Another problem when attempting to save notes manually / independently of the book file is that text encoding can be improperly preserved, so that special characters are not correctly displayed.

Example

Saved note (highlighted text): Une seule chose est sÃ»re, toutes ces maisons nâ€Ǧexisteront plus, aussi mes efforts sont-ils infimes, ils peuvent tenir sur une tÃȞte dâ€ǦÃ©pingle, tout comme ma vie. Et Ã§a, il ne faut jamais lâ€Ǧoublier.

Original text of this highlight: Une seule chose est sûre, toutes ces maisons n’existeront plus, aussi mes efforts sont-ils infimes, ils peuvent tenir sur une tête d’épingle, tout comme ma vie. Et ça, il ne faut jamais l’oublier.
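
The garbling in the saved note has the typical shape of UTF-8 bytes being read back as a single-byte Western encoding. A minimal sketch of that fault and its reversal follows, assuming the wrong encoding was Windows-1252 (`cp1252`); that is a guess based on the visible pattern, not something confirmed by the export tool, and some characters in the sample above suggest a further transformation that this sketch does not cover. The function names are illustrative only.

```python
def misdecode(text: str) -> str:
    """Simulate the fault: UTF-8 bytes wrongly read as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair(garbled: str) -> str:
    """Undo the fault by reversing the two steps (assumes cp1252 was the culprit)."""
    return garbled.encode("cp1252").decode("utf-8")


original = "sûre, ça"
garbled = misdecode(original)
print(garbled)          # sÃ»re, Ã§a
print(repair(garbled))  # sûre, ça
```

A repair like this only works while every garbled character still maps back to a byte; an annotation format that states its encoding explicitly would avoid the guesswork.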

A standardised annotation system, integrated into the epub file, would have prevented all of these problems and would allow me to keep those notes for use on future ebook readers even if the Calibre plugin is no longer developed, or the database of my reader becomes corrupted again or my reader is otherwise damaged or lost, or Kobo modifies their software or the proprietary method of annotations currently in use, or I switch to a different kind of reader.

dalerrogers commented 1 year ago

Agreed. If you can add meta information to a JPG file, it should be easy to include an XML file in the manifest for notes.

Best Regards,

Dale

Dale R Rogers, M.Ed, CIW
Digital Creative Entrepreneur, Instructional Designer, eLearning Developer
Web: https://dalerogers.me/


iherman commented 1 year ago

@Okan-Ozcelik I couldn't agree more: such an annotation/note system would be great to have. Let me add one more thing: you describe a set of use cases around e-books but, in fact, those use cases may also be valid for the Web in general (I know that many pages on the Web are ephemeral, unlike books, but a large percentage of content is just as stable as books). The relationship between e-books and the Web is all the stronger because the underlying technology is identical; an e-book in EPUB format is, as a rough approximation, a Website in a package.

However, your description refers to W3C as being the organization that can achieve that. The good/bad news is that W3C has already done what it could do in this respect; indeed, consider these three specifications:

Web Annotation Data Model
Describes the underlying Annotation Abstract Data Model as well as a JSON-LD serialization
Web Annotation Vocabulary
The Vocabulary which underpins the Web Annotation Data Model
Web Annotation Protocol
The HTTP API for publishing, syndicating, and distributing Web Annotations
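
To make the data model concrete, here is a small sketch of one annotation serialized as JSON-LD. The property names (`@context`, `type`, `motivation`, `body`, `target`, `selector`, `TextQuoteSelector`) come from the Web Annotation Data Model; the `id` and `source` URLs, the note text, and the prefix/suffix values are invented for illustration.

```python
import json

# One reader note attached to a quoted passage of a publication.
annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "id": "http://example.org/annotations/1",          # hypothetical identifier
    "type": "Annotation",
    "motivation": "commenting",
    "body": {
        "type": "TextualBody",
        "value": "Worth rereading.",                    # the reader's note
        "format": "text/plain",
    },
    "target": {
        "source": "http://example.org/books/sample.epub",  # hypothetical book URL
        "selector": {
            "type": "TextQuoteSelector",
            "exact": "tout comme ma vie",               # the highlighted text
            "prefix": "tête d’épingle, ",
            "suffix": ". Et ça",
        },
    },
}

serialized = json.dumps(annotation, indent=2, ensure_ascii=False)
print(serialized)
```

Nothing in this structure is tied to one reading system, which is the property the thread is asking for.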

These specifications exist. They may not be perfect and may require improvements, no doubt about that, but they have the merit of existing. However, they have not been adopted, either by browser vendors or by EPUB readers. The closest adoption is the system offered by Hypothes.is; it is an extension that works with the major browsers on Web sites, and you can share Web page annotations (although, last I checked, you cannot set up your own server to store those annotations), but the major EPUB readers have not adopted it. There may be some smaller readers that have incorporated Hypothes.is (with the aforementioned restriction), but what would be a game changer is for Apple, Google, Amazon, and the other big ones, as well as browser vendors (Chrome, Firefox, etc.), to pick the specifications up and implement them natively. That is where the influence of W3C ends, though: it can shepherd groups to develop specifications, but it cannot force any company to implement those.

Okan-Ozcelik commented 1 year ago

Hi @wareid, Every device of different brands that can open Epub files should be able to open the same notes on the Epub files. This is the case with PDF, for example. Adobe Acrobat, Xodo etc. can show the same notes on PDF. W3C can develop a system for saving notes on Epub. This is the summary of the issue.

P5music commented 1 year ago

@Okan-Ozcelik

Saving the annotations in the ePub itself is not good because it would work only for minor annotations, while sometimes, for example, a scholar's work could be to comment on a huge book of thousands of pages (or the ePub equivalent of that length), producing a lot of text or even attaching a lot of documents.

Okan-Ozcelik commented 1 year ago

@P5music

You are partly right. Still, the issue I'm talking about is notes that are already as advanced as notes that can be taken on PDF.

P5music commented 1 year ago

@Okan-Ozcelik

Ok, now I understand, but please consider that PDF is sort of the digital version of a printed book, so you are limited to the kind of annotations you can put on the pages of a book. More refined annotation systems include attachments too, but everything goes inside a new PDF (a copy) that is bloated, or inside the same original document, which is saved again on disk. If that is good for you, there is the idea of creating a copy of the ePub book that is the annotated one, while leaving the original ePub untouched (this is important for me). The "cloned" ePub would have its XHTML modified as you please by your annotating work, whatever it consists of, so that the notes look like the publication's own notes, or appear differently. There are as many ways of doing so as there are ways of organizing annotations inside an ePub publication itself. Some ebooks could have hyperlinked separate pages or resources. Other ebooks could have pop-up notes (as the recent Amazon e-ink device now does) or some other system. Consider also that a modern way would be just hyperlinking to cloud notes or resources. But those would not be inside the ePub; they would just be linked to from inside the ePub XHTML, if we are sticking to the "cloned" ePub with annotations metaphor. But that would require a different system for underlining, marking and so on.

Then a major issue comes to my mind: how would you undo the modifications to the ePub XHTML? Does the reader keep track of the annotations in order to remove them, or do they become part of the book? These are the concerns I had in mind when creating my app.

wareid commented 1 year ago

I think the more sustainable and likely option would be generating an accompanying annotations file that can live alongside the EPUB file and that reading systems can ingest to display annotations.

The web annotations framework mentioned by Ivan is all we really need, but the challenge comes in implementation. Reading systems would need to both generate and ingest a web annotations file, which is not something we've seen in practice yet. We'd also need an agreed-upon and universally supported location method like CFI.

We've long taken the approach in this industry of never modifying the source file, especially not a version that could be pulled out of the system it's living in (obviously we all know individual platforms do modify files to ensure optimal rendering on their platform, but those modifications never travel outside of the platform). An accompanying file is the best solution.
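
A rough sketch of the accompanying-file idea: annotations are written to a sidecar file next to the EPUB, and the EPUB itself is never opened for writing. The naming rule (`book.epub` gives `book.annotations.json`) and the function names are assumptions made for this example; no specification defines them.

```python
import json
import tempfile
from pathlib import Path


def sidecar_path(epub_path: Path) -> Path:
    # Hypothetical naming rule: "book.epub" -> "book.annotations.json".
    return epub_path.with_suffix(".annotations.json")


def save_annotations(epub_path: Path, annotations: list[dict]) -> Path:
    """Write the annotations beside the EPUB without touching the EPUB."""
    path = sidecar_path(epub_path)
    path.write_text(
        json.dumps(annotations, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return path


def load_annotations(epub_path: Path) -> list[dict]:
    """Return the stored annotations, or an empty list if none exist yet."""
    path = sidecar_path(epub_path)
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as folder:
    book = Path(folder) / "book.epub"
    book.write_bytes(b"")  # empty placeholder standing in for a real EPUB
    notes = [{"exact": "tête d’épingle", "note": "Favourite line."}]
    saved = save_annotations(book, notes)
    restored = load_annotations(book)
    book_unchanged = book.read_bytes() == b""
```

Moving to a new device then means copying two files instead of one; the hard part, as noted above, is getting reading systems to both write and read the second file.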

P5music commented 1 year ago

@wareid I would like to emphasize that such an external file would be very difficult to handle in the case of huge contributions in the form of written annotations, or if attachments are involved or allowed. Even if we restrict our thoughts to simple text-only notes, the resulting file could still be huge if the annotations are made in a professional context such as academia. If the idea of a folder (instead of a single file) is deemed unsuitable, you have to consider that nowadays many documents of this kind are in the cloud. Cloud documents often take a generic form that encompasses both the document and the folder, so that a folder is also a document and a document can be the master node of a tree or a folder, and so on. So the folder seems to be the suitable basic form of a document.

As to ePub CFI, it is interesting because it allows ranges of text to be annotated, underlined and so on. Images can also be annotated with x-y coordinates, if I am not wrong. Moreover, if my memory does not fail me, ePub CFI has a useful path prefix rule so that other ePub publications can also be cited or linked (although not necessarily in an automatic way).

The ePub CFI values would be useful as a reference to an exact point in an ePub, but the issue remains of what happens when the HTML changes, even through minor modifications that break the DOM tree (that is: the tree is now different, maybe from the beginning). This matters not only for the minor changes that publishers make to an ebook to fix bugs or errata, but also for subsequent editions.
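
The fragility described here can be shown with a toy example. The sketch below stores a position as a plain list of child indexes, which is a deliberate simplification and not real EPUB CFI syntax; the two tiny "editions" are invented. After one paragraph is inserted, the same stored path resolves to a different element.

```python
import xml.etree.ElementTree as ET


def resolve(root: ET.Element, path: list[int]) -> ET.Element:
    """Follow a list of child indexes down from the root element."""
    node = root
    for index in path:
        node = node[index]
    return node


first_edition = ET.fromstring(
    "<body><h1>Chapter 1</h1><p>Opening.</p><p>Favourite passage.</p></body>"
)
# The publisher adds one paragraph near the top in a corrected edition.
second_edition = ET.fromstring(
    "<body><h1>Chapter 1</h1><p>Erratum note.</p>"
    "<p>Opening.</p><p>Favourite passage.</p></body>"
)

stored_path = [2]  # third child of <body> at the time the highlight was made

before = resolve(first_edition, stored_path).text
after = resolve(second_edition, stored_path).text
print(before)  # Favourite passage.
print(after)   # Opening.
```

Any purely structural reference has this weakness unless it also carries something that survives the edit, such as an element ID or the quoted text itself.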

iherman commented 1 year ago

Though technically feasible, I do not see the option of extending the EPUB file coming to the fore in implementations, at least in the way the current EPUB implementation strategies work. I think the main obstacle is what @wareid noted in https://github.com/w3c/publishingcg/issues/55#issuecomment-1438898899: implementations frown upon modifying the EPUB file itself to add content. There are technical issues (e.g., today, each user has his/her own "copy" of the EPUB file on a local storage, how would the update of these files work to get everyone's annotation added to the local copies?) and non-technical (e.g., who would have the copyright on the annotations, how would that be handled alongside the rights of a specific publication?).

The beauty of the aforementioned W3C Web Annotation Framework is that these issues do not arise. There is no modification of the original text (Website or EPUB instance) whatsoever. Everything is stored in an annotation server; each annotation stored by the server has a reference (via a URL and a description of where the annotation resides within the referenced content) and the annotation itself. There is a standard protocol to communicate with the server to get/modify annotations. An annotation server can be private, can be managed by a university or indeed a publisher for its own publications, and the "only" thing a browser or an EPUB reading system ought to do is to communicate with a server (possibly of the reader's choice) to get, display, and create annotations.
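
As a sketch of what "communicate with a server" could look like on the client side, the snippet below builds (but does not send) the HTTP request that would create an annotation in a container. The POST method and the `application/ld+json` media type with the `anno.jsonld` profile follow the Web Annotation Protocol; the container URL, the function name, and the annotation content are hypothetical.

```python
import json
import urllib.request

ANNO_MEDIA_TYPE = 'application/ld+json; profile="http://www.w3.org/ns/anno.jsonld"'
CONTAINER_URL = "https://annotations.example.org/my-notes/"  # hypothetical server


def build_create_request(container_url: str, annotation: dict) -> urllib.request.Request:
    """Prepare a POST that would add one annotation to an annotation container."""
    body = json.dumps(annotation).encode("utf-8")
    return urllib.request.Request(
        container_url,
        data=body,
        method="POST",
        headers={"Content-Type": ANNO_MEDIA_TYPE, "Accept": ANNO_MEDIA_TYPE},
    )


request = build_create_request(
    CONTAINER_URL,
    {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": "Worth rereading."},
        "target": "http://example.org/books/sample.epub",  # hypothetical book URL
    },
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would actually send it; that step is left out here because the server is imaginary.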

As I mentioned before, the W3C did "its job", the problem is that the non-trivial implementation work has not materialized. The reason, in my view, is mostly non-technical: what is the business incentive to get EPUB Reading Systems (and browsers) to implement such an annotation system? I am not an expert of business, I do not know which communities could create a strong enough pressure to provide such an incentive. But I believe that is really the core of the issue. The rest is doable, the possible deficiencies in the standards are solvable (and I believe W3C would be more than happy to pick up the standardization work for that), but the pressure should come from the implementers.

There is a certain analogy with the recent story around activity streams. There are similarities between the respective standard families around activity streams and annotations (actually, there were contacts between the groups, so this is not entirely by coincidence) and activity streams had a very low implementation and usage level for many years. It needed the recent turmoil around Twitter to suddenly bring Mastodon to the fore, and now everybody talks about activity streams, web mentions, etc. We may need some similar storm around annotations...

P5music commented 1 year ago

@iherman I would like to add some considerations, in the spirit of helping.

Many considerations have been raised along this thread, pointing out different aspects that a single person may not be aware of, or may not recall while ideas are being freely expressed, so leaving the different parties to decide how to manage annotations seems to be the best option.

This is because many technologies have been introduced and users have shown that they like them, for example collaborative annotation processes and other workflows. New ways of working and annotating could also be in use very soon.

But the ePub should correct what is its original flaw, in regard to annotations IMHO, that is the HTML DOM.

This technology is a huge advantage until you realize that PDFs have pages, which can bear many modifications without breaking that structure, while the HTML structure is easily broken even when a single tiny element is added or removed. So there is the issue of the HTML code potentially changing and references becoming stale or wrong. That is not remote at all, since editors could want such an easy way to issue errata nowadays (take into account that the shipping process of anything resembling a piece of software, as an ePub appears to be, is now far more relaxed than in the past, since the update process gives the ability to fix errors).

We also know that many documents can stay long in the draft form, still someone could need or decide to annotate them.

So I have a proposal. The ePub standard should be enriched by a mean of informing the reader that something could have changed in the HTML, or just providing a file with "diff" changes so that the annotations can be processed to be working and pointing to the right places. I mean an official file with an official place in the folder structure, like the important files that every RS knows it has to read to be compliant with the newest ePub standard (opf, ncx, nav, container and so on).

Of course, before processing the annotation references a backup would be advised, and only the user or admin could decide to process the annotations in a certain community, based on the fact that a new ePub version has been "released".

So when an ePub is purchased or downloaded it could already contain a "diff" information file that provides the change history, a sort of standard and structured change log. If the user has no annotations yet, the references will be based on the current version and there is no need to process or update anything. If instead the RS compares the version of the ePub that the existing annotations refer to with the version currently available or in use and finds a difference, it can inform the user of the need to process the references, provided that the user agrees to use the modified version. Indeed, the user could also refuse, deciding that the new version is not wanted.

I do not know the W3C Web Annotations standard (not studied yet) and I do not know if it has anything to do with ePub CFI, but when a standard would be chosen and any point of text or image (or video timestamp, and so on) could be annotated precisely then also the diff file should be introduced, at least for ePubs, that is what we are dealing with here.

What do you all think? Regards

iherman commented 1 year ago

@P5music,

Let me repeat one thing: there is no clear business interest among publishers and/or reading systems to offer an open annotation system that we are discussing here. Until such an interest is clear, all discussions remain purely academic. And W3C is not an academic institution; it standardizes industry practice, or close to industry practice, rather than inventing its own solutions.

It is of course perfectly fine to discuss these things in a W3C Community Group like this one; this is where some level of incubation could and should happen. That is exactly what a CG is for. But in case such a discussion aims at an ulterior standardization, the constraints must also be clear.

(And, to be clear, I am first to deplore the situation!)

A few comments on your comments:

the ePub should correct what is its original flaw, in regard to annotations IMHO, that is the HTML DOM

I am not sure what you mean by original flaw, but I do not see it happening here. The force of EPUB is that it relies on HTML and friends, meaning that we do not have to reinvent any wheel (neither on paper nor in practice; an EPUB reading system these days relies on browser engines for rendering). HTML is used on billions of pages and used by most of the people around the world (through browsers) and I do not see any move to change or drop the DOM. This is one of those constraints.

The issue of references within an HTML text that changes if the structure changes is of course known. Please, look at the way the Annotation data model handles this issue: one stores a reference that is based on various mechanisms, and it is possible to describe references like "this annotation refers to the first paragraph of section 12" without modifying the DOM. And yes, EPUB CFI is, sort of, part of the system insofar as this may be one of the ways to express that statement. If new referencing systems (e.g., to point to a specific pixel in an image) come forward, it is possible to add them to the generic framework in the model (see the Selector concept).
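
One of the Selector types in the data model, the text quote selector, anchors an annotation by the quoted text plus a little context before and after it, rather than by position in the DOM. The sketch below is a simplified illustration of how a reading system might re-find such a quote after the text has changed; the function and the sample texts are invented, and real implementations typically add fuzzy matching.

```python
def find_quote(text: str, exact: str, prefix: str = "", suffix: str = "") -> int:
    """Return the offset of `exact` whose surrounding context matches, or -1."""
    start = text.find(exact)
    while start != -1:
        end = start + len(exact)
        if text[:start].endswith(prefix) and text[end:].startswith(suffix):
            return start
        start = text.find(exact, start + 1)
    return -1


old_text = "The house was old. The house was empty."
inserted = "Erratum: fixed. "
new_text = inserted + old_text  # a later edition adds text before the passage

# The prefix and suffix pick out the second "The house", not the first.
selector = {"exact": "The house", "prefix": "old. ", "suffix": " was empty"}

old_offset = find_quote(old_text, **selector)
new_offset = find_quote(new_text, **selector)
print(old_offset, new_offset)
```

The stored selector is the same in both cases; only the computed offset moves, which is why no change log is needed for edits that leave the quoted passage itself intact.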

The ePub standard should be enriched by a mean of informing the reader that something could have changed in the HTML, or just providing a file with "diff" changes so that the annotations can be processed to be working and pointing to the right places.

I presume you propose this to keep the references, or at least to be able to recalculate them. This is possible, but would become extremely complex very quickly. The aforementioned model of Selectors makes this unnecessary and, I believe, it is much simpler.

I do not know the W3C Web Annotations standard (not studied yet) and I do not know if it has anything to do with ePub CFI...

I think you should look at the W3C Web Annotation standard. With all the caveats that I already expressed: the technology has been defined, it is the large scale implementation that is lacking. The very existence and existing user base of Hypothes.is, based on the standard, proves its feasibility in practice, but we get back to the business interest issue (i.e., the lack thereof).

P5music commented 1 year ago

@iherman Thank you for the response. I understand all your points.

I will have a look to the W3C Web Annotations standard, because I do not know how the ePub books can come into the picture, I mean how a single copy of an ePub is referenced as a web resource to annotate some of its parts. I hope I will find information to confirm that possibility on the official documentation. I know indeed that the ePub, although it resembles a website, is not, because it is not on the internet. I have noticed that this metaphor often trick people. However I think that I will find that explanation on the docs. I imagine that the ePub is just something that has to be "connected" to the user's annotations archive inside the RS as a matter of fact, on behalf of the user.

Regarding the original flaw, I should have written it as "original flaw", because I use the term only in this thread to refer to the issue I pointed out. I am a strong supporter of the ePub format and have done some development with it.

So, you say that the "change log" file method would be too complex. I do not agree: when you know that your annotations are based on a certain version of the ePub book, and you have, let's say, the changes.xml file, you are able to check what happened. If the data in that file are in a standard format, they have a hierarchy, so it is not difficult to work out whether your annotation reference was nudged, moved or deleted. The algorithm shouldn't be difficult for any developer, also because the file would be devised with that purpose in mind (to convey that hierarchical information).

Such a change.xml file or change.graph (and so on, we can discuss what it should contain) would be a useful improvement to the ePub standard. With a small effort the standard would be able to settle the issue and throw the ball into the industrial arena, where RS can make use of it or not, as they prefer. Then, if there is no interest or response, it's not W3C's fault, because it has done what was needed, in an elegant and simple way. I think we should discuss how such a file could be structured to assess how simple or difficult it would be. I think that this kind of file has been used for other applications, so some data expert could chime in with ideas. Maybe the solution already exists, or can be very easily adapted, or can serve as inspiration.
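To give the discussion something concrete, here is a minimal sketch of what processing such a change log could look like. Everything in it is an assumption for illustration: the `Change` record, the use of character offsets within one content document, and the rule that edits are non-overlapping and expressed against the old version. It handles the "nudged" and "deleted" cases only; a "moved" passage would need a richer format.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Change:
    """One edit against the OLD text (hypothetical change-log entry):
    `removed` characters at offset `pos` were replaced by `added` characters."""
    pos: int
    removed: int
    added: int


def remap(start: int, end: int, changes: List[Change]) -> Optional[Tuple[int, int]]:
    """Map an annotated range [start, end) from the old version to the new one.

    Returns None when an edit touches the annotated text itself, in which
    case the annotation cannot be re-anchored from offsets alone.
    """
    shift = 0
    for c in sorted(changes, key=lambda c: c.pos):
        if c.pos + c.removed <= start:
            shift += c.added - c.removed  # edit lies entirely before the range
        elif c.pos >= end:
            break  # this edit and all later ones lie after the range
        else:
            return None  # edit overlaps the annotated text
    return (start + shift, end + shift)


# An annotation on offsets 8..15 of the old text.
print(remap(8, 15, [Change(pos=0, removed=0, added=9)]))  # nudged by a 9-char insertion
print(remap(8, 15, [Change(pos=5, removed=6, added=0)]))  # annotated text partly deleted
```

Even in this simplified form, the reading system has to know exactly which version the annotation was made against and receive every intermediate change log, which is where the complexity tends to grow.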

Regards

mattgarrish commented 1 year ago

I feel like a broken record when these annotation discussions come up, but we worked with Rob and Paolo to develop the Open Annotation in EPUB standard way back when.

It was part of the edupub work and kind of died off with it, not because of technical challenges in getting it over the finish line but because of a lack of interest among reading systems in implementing it. They do their own thing internally to store and maintain annotations, and having an interchange format did not garner any interest.

sueneu commented 1 year ago

@iherman @mattgarrish This is clearly a frustrating situation: cross-platform annotation keeps coming up as a desired technology. And a lot of work has gone into making that possible. But

there is no clear business interest among publishers and/or reading systems to offer an open annotation system

and a

lack of interest in implementing it by reading systems

We see our mission as

it [W3C] standardizes industry practice, or close to industry practice, rather than inventing its own solutions

However, a few reading systems monopolize the eBook industry. Fewer and fewer large publishers monopolize the publishing industry.

It makes me wonder: are we doing the best service to web users and readers if we let the reading system monopolies determine which features and tech are possible? Are we, in some small way, supporting monopolies and stifling competition and innovation?

iherman commented 1 year ago

@sueneu,

I share your frustration (I was fairly active in the development of the Web Annotation standards...). This is a situation W3C has found itself in before (and I am sure it will happen again); it does not, and cannot, have a direct influence on how the market evolves. The pressure should come from other organizations, user communities, etc.

But sudden events may trigger a change. As I wrote in https://github.com/w3c/publishingcg/issues/55#issuecomment-1439441960:

There is a certain analogy with the recent story around activity streams. […] activity streams had a very low implementation and usage level for many years. It needed the recent turmoil around Twitter to suddenly bring Mastodon to the fore, and now everybody talks about activity streams, web mentions, etc. We may need some similar storm around annotations...

I am not sure what that "storm" could be, but any good ideas are welcome!
