WICG / proposals

A home for well-formed proposed incubations for the web platform. All proposals welcome.
https://wicg.io/

Document Services #19

Closed AdamSobieski closed 3 months ago

AdamSobieski commented 3 years ago

Introduction

Document services are a generalization over services for documents such as spelling checking, grammar checking, proofreading, fact checking, and mathematical proof and argumentation checking. Document services are relevant to both document authoring and document reviewing scenarios. Imagine being able to check, in real time, whether a document has any informational, warning, or error messages with respect to its factuality or any steps of its reasoning. Tools for authoring and reviewing documents in these regards would be useful across industry, academia, military, and government, with specific applicability to journalism, encyclopedias, digital textbooks, and science.

Presented, herein, is an approach for declaring and describing document services utilizing document metadata.

Varieties of Document Services

Thus far considered are three varieties of document services. Firstly, there are services which adhere to an informational message, warning, error pattern. Secondly, there are services which offer corrections, recommendations, or options for users. Thirdly, there are services which provide metadata about documents, document elements, or ranges (e.g. word count, reading level).

User Interface Discussion

Users could make use of application menus to have entire documents processed by document services. Users could also utilize context menus on specific document elements.

For those services which return informational messages, warnings, or errors about documents, document elements, or ranges, there could be table views or grid views (see also: software development IDEs) for collecting together and presenting multi-source informational messages, warnings, and errors.

For those services which offer corrections, recommendations, or options for users, interactive contextual panels or widgets could be of use.

For those services which return metadata about documents, document elements, or ranges, the visualization of such metadata is also a user interface topic.

Styling Documents for Document Services

It may be possible for document authors to style their documents for interoperation with various document services.

Various text decorations and background colors could be utilized.

Inline graphical symbols such as green checkmark symbols, white information icon symbols, yellow warning symbols, and red error symbols could be utilized.

See also: https://www.w3.org/TR/css-highlight-api-1/

Author-recommended and Reader-specified Services

When reviewing documents, there is an envisioned interplay between author-recommended and reader-specified services. For example, for the fact-checking scenario, document authors could indicate recommended fact-checking services to make use of and readers could have their own services configured.

Multiple Simultaneous Services

Multiple document services could be utilized simultaneously. The informational messages, warnings, and errors from multiple services could be merged. Similarly, corrections, recommendations, or options from multiple services could be merged. Similarly, metadata about documents, document elements, or ranges from multiple services could be merged.
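As a rough illustration of the merging described above, a minimal TypeScript sketch might look like the following. The `ServiceMessage` shape, its field names, the de-duplication rule, and the errors-first ordering are all assumptions made for illustration; the proposal does not define them.

```typescript
// Hypothetical message shape; the proposal does not define these field names.
type Severity = "info" | "warning" | "error";

interface ServiceMessage {
    readonly source: string;   // URL of the service that produced the message
    readonly severity: Severity;
    readonly about: string;    // URI of the document content the message refers to
    readonly text: string;
}

// Assumed presentation order: errors first, then warnings, then informational messages.
const severityRank: Record<Severity, number> = { error: 0, warning: 1, info: 2 };

// Merge the results of several services into one list, dropping exact
// duplicates and ordering by severity. Ties keep their input order.
function mergeMessages(...results: ServiceMessage[][]): ServiceMessage[] {
    const seen = new Set<string>();
    const merged: ServiceMessage[] = [];
    for (const result of results) {
        for (const m of result) {
            const key = [m.source, m.severity, m.about, m.text].join("\u0000");
            if (!seen.has(key)) {
                seen.add(key);
                merged.push(m);
            }
        }
    }
    return merged
        .map((message, index) => ({ message, index }))
        .sort((a, b) =>
            severityRank[a.message.severity] - severityRank[b.message.severity] ||
            a.index - b.index)
        .map(entry => entry.message);
}
```

Corrections and metadata from multiple services could presumably be merged in a similar way, keyed on the content they refer to.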

URI-addressability of Document Content

Content of interest in documents could be URI-addressable in a number of ways.

  1. https://www.example.org/document.xhtml#fact-123
  2. https://www.example.org/document.xhtml#xpointer(...)
  3. https://www.example.org/document.xhtml#:~:text=...

Firstly, document authors could make use of the id attribute. Secondly, XPointer could be utilized to address document content with URIs. Thirdly, text fragments could be utilized with URIs.
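A consumer of such URIs would need to tell the three forms apart. The following is a minimal sketch of one way to do so, working on the URI string alone. The function and type names are hypothetical, and the sketch returns the raw fragment value without percent-decoding or evaluating it.

```typescript
type FragmentKind = "none" | "id" | "xpointer" | "text-fragment";

interface AddressedContent {
    readonly documentUrl: string;
    readonly kind: FragmentKind;
    readonly value: string;   // raw, still percent-encoded
}

function classifyContentUri(uri: string): AddressedContent {
    const hash = uri.indexOf("#");
    if (hash < 0) {
        return { documentUrl: uri, kind: "none", value: "" };
    }
    const documentUrl = uri.slice(0, hash);
    const fragment = uri.slice(hash + 1);

    // Text fragments follow the ":~:" fragment directive delimiter.
    const delimiter = fragment.indexOf(":~:");
    if (delimiter >= 0) {
        for (const directive of fragment.slice(delimiter + 3).split("&")) {
            if (directive.startsWith("text=")) {
                return { documentUrl, kind: "text-fragment", value: directive.slice(5) };
            }
        }
    }

    // Anything before the directive delimiter is an ordinary fragment.
    const elementPart = delimiter >= 0 ? fragment.slice(0, delimiter) : fragment;
    const xp = /^xpointer\((.*)\)$/.exec(elementPart);
    if (xp) {
        return { documentUrl, kind: "xpointer", value: xp[1] };
    }
    if (elementPart === "") {
        return { documentUrl, kind: "none", value: "" };
    }
    return { documentUrl, kind: "id", value: elementPart };
}
```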

Document Metadata and Selectors

It is possible to utilize document metadata to declare document services without having to specify how document authors should use document markup.

Namespace-prefixable Selectors

There are numerous ways that document authors might use markup to indicate facts or claims. For example, some are:

  1. new markup elements (e.g. <fact> or <claim>)
  2. extensible markup elements (e.g. <ext:fact xmlns:ext="...">)
  3. class names (e.g. <span class="fact">)
  4. role attribute (e.g. <span role="fact">)
  5. EPUB type attribute (e.g. <span epub:type="fact">)

For each way, one can select the facts or claims in a document.

With a CSS-based syntax:

  1. fact
  2. @namespace ext url(...); ext|fact
  3. .fact
  4. [role~='fact']
  5. @namespace epub url(...); [epub|type~='fact']

With an XPath-based syntax:

  1. //fact
  2. xmlns(ext=...) //ext:fact
  3. //*[contains(concat(' ',normalize-space(@class),' '),' fact ')]
  4. //*[contains(concat(' ',normalize-space(@role),' '),' fact ')]
  5. xmlns(epub=...) //*[contains(concat(' ',normalize-space(@epub:type),' '),' fact ')]

Using the namespace-prefixable selectors, above, one could use document metadata to indicate which document elements in a document were facts or claims.

With a CSS-based syntax:

  1. <meta name="fact-checking-selector" content="fact" />
  2. <meta name="fact-checking-selector" content="@namespace ext url(...); ext|fact" />
  3. <meta name="fact-checking-selector" content=".fact" />
  4. <meta name="fact-checking-selector" content="[role~='fact']" />
  5. <meta name="fact-checking-selector" content="@namespace epub url(...); [epub|type~='fact']" />

With an XPath-based syntax:

  1. <meta name="fact-checking-selector" content="//fact" />
  2. <meta name="fact-checking-selector" content="xmlns(ext=...) //ext:fact" />
  3. <meta name="fact-checking-selector" content="//*[contains(concat(' ',normalize-space(@class),' '),' fact ')]" />
  4. <meta name="fact-checking-selector" content="//*[contains(concat(' ',normalize-space(@role),' '),' fact ')]" />
  5. <meta name="fact-checking-selector" content="xmlns(epub=...) //*[contains(concat(' ',normalize-space(@epub:type),' '),' fact ')]" />
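To make the metadata forms above concrete, here is a minimal sketch of how a consumer might split a selector `content` value into its namespace declarations and the selector proper. The names are hypothetical, and the rule used to tell the two syntaxes apart (an `xmlns(...)` prelude or a leading `/` means XPath, anything else means CSS) is an assumption of this sketch, since the proposal does not say how a consumer should distinguish them.

```typescript
interface ParsedSelector {
    readonly syntax: "css" | "xpath";
    readonly namespaces: Record<string, string>;   // prefix -> namespace URI
    readonly selector: string;
}

function parseSelectorContent(content: string): ParsedSelector {
    const namespaces: Record<string, string> = {};
    let rest = content.trim();

    // CSS-style prelude: @namespace prefix url(...);
    const cssNs = /^@namespace\s+([\w-]+)\s+url\(\s*(['"]?)(.*?)\2\s*\)\s*;\s*/;
    // XPointer-style prelude: xmlns(prefix=uri)
    const xpNs = /^xmlns\(\s*([\w-]+)\s*=\s*([^)]*)\)\s*/;

    let m: RegExpExecArray | null;
    while ((m = cssNs.exec(rest)) !== null) {
        namespaces[m[1]] = m[3];
        rest = rest.slice(m[0].length);
    }
    let sawXmlns = false;
    while ((m = xpNs.exec(rest)) !== null) {
        namespaces[m[1]] = m[2].trim();
        rest = rest.slice(m[0].length);
        sawXmlns = true;
    }

    // Assumed heuristic for distinguishing the two syntaxes.
    const syntax = sawXmlns || rest.startsWith("/") ? "xpath" : "css";
    return { syntax, namespaces, selector: rest };
}
```

For the simple CSS cases, the resulting `selector` could be handed to `document.querySelectorAll`. Namespace-prefixed CSS selectors are not accepted by `querySelectorAll`, so those cases would need separate handling.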

Namespace-prefixable Attribute Selectors

There is also the matter of using document metadata to indicate which attributes, if any, are utilized by a document author to reference inline or external resources on selected document elements.

With a CSS-based syntax:

  1. attr(something url)
  2. @namespace ext url(...); attr(ext|something url)

With an XPath-based syntax:

  1. @something
  2. xmlns(ext=...) @ext:something

One could indicate which attributes on those document elements were for specifying resources by using document metadata.

With a CSS-based syntax:

  1. <meta name="fact-checking-resource" content="attr(something url)" />
  2. <meta name="fact-checking-resource" content="@namespace ext url(...); attr(ext|something url)" />

With an XPath-based syntax:

  1. <meta name="fact-checking-resource" content="@something" />
  2. <meta name="fact-checking-resource" content="xmlns(ext=...) @ext:something" />
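The attribute expressions above could be interpreted along the following lines. This is a minimal sketch with hypothetical names: it expects the expression after any namespace declarations have been removed, and it treats a value beginning with `#` as a reference to an inline resource, which matches the examples in this proposal but is otherwise an assumption.

```typescript
interface AttributeReference {
    readonly prefix: string | null;   // namespace prefix, if any
    readonly localName: string;
    readonly type: string | null;     // e.g. "url" in the CSS form
}

function parseAttributeReference(expr: string): AttributeReference | null {
    const text = expr.trim();

    // CSS form: attr(name type) or attr(prefix|name type)
    const css = /^attr\(\s*(?:([\w-]+)\|)?([\w-]+)(?:\s+([\w-]+))?\s*\)$/.exec(text);
    if (css) {
        return { prefix: css[1] ?? null, localName: css[2], type: css[3] ?? null };
    }

    // XPath form: @name or @prefix:name
    const xp = /^@(?:([\w-]+):)?([\w-]+)$/.exec(text);
    if (xp) {
        return { prefix: xp[1] ?? null, localName: xp[2], type: null };
    }
    return null;
}

// Assumption: "#id" values point at inline resources in the same document,
// anything else is an external resource resolved against the base URL.
function isInlineReference(attributeValue: string): boolean {
    return attributeValue.trim().startsWith("#");
}
```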

Document Service Providers

One could indicate which document service providers were recommended by a document author using document metadata.

  1. <link rel="fact-checking-service-provider" href="https://www.wikidata.org/wiki/Special:FactCheck" />
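A consumer could collect the author-recommended providers by looking for link relations that end in `-service-provider`. The following is a minimal sketch under that assumption. `LinkLike` and `recommendedProviders` are hypothetical names, and the suffix convention is inferred from the examples in this proposal rather than specified anywhere.

```typescript
// Structural stand-in for an HTML <link> element (which has rel and href).
interface LinkLike {
    readonly rel: string;
    readonly href: string;
}

const SERVICE_PROVIDER_SUFFIX = "-service-provider";

// Group provider URLs by service kind, e.g. "fact-checking" -> [url, ...].
function recommendedProviders(links: LinkLike[]): Map<string, string[]> {
    const byKind = new Map<string, string[]>();
    for (const link of links) {
        // rel is a space-separated list of tokens.
        for (const token of link.rel.trim().split(/\s+/)) {
            if (token.length > SERVICE_PROVIDER_SUFFIX.length &&
                token.endsWith(SERVICE_PROVIDER_SUFFIX)) {
                const kind = token.slice(0, -SERVICE_PROVIDER_SUFFIX.length);
                const hrefs = byKind.get(kind) ?? [];
                hrefs.push(link.href);
                byKind.set(kind, hrefs);
            }
        }
    }
    return byKind;
}
```

In a browser, the input could come from `Array.from(document.querySelectorAll("link[rel]"))`, since link elements expose `rel` and `href` properties.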

Examples

A number of document metadata examples are provided.

Spelling Checking

<html>
  <head>
    <base href="https://www.example.org/document.xhtml" />
    <link rel="spelling-checking-service-provider" href="https://www.services.org/spelling-checking" />
  </head>
  <body>
    <p>HTML and MathML Content</p>
  </body>
</html>

Grammar Checking

<html>
  <head>
    <base href="https://www.example.org/document.xhtml" />
    <link rel="grammar-checking-service-provider" href="https://www.services.org/grammar-checking" />
  </head>
  <body>
    <p>HTML and MathML Content</p>
  </body>
</html>

Proofreading

<html>
  <head>
    <base href="https://www.example.org/document.xhtml" />
    <link rel="proofreading-service-provider" href="https://www.services.org/proofreading" />
  </head>
  <body>
    <p>HTML and MathML Content</p>
  </body>
</html>

Fact Checking

<html>
  <head>
    <base href="https://www.example.org/document.xhtml" />
    <meta name="fact-checking-selector" content="[role~='fact']" />
    <link rel="fact-checking-service-provider" href="https://www.wikidata.org/wiki/Special:FactCheck" />
  </head>
  <body>
    <span role="fact">HTML and MathML content</span>
    <div  role="fact">HTML and MathML content</div>
  </body>
</html>

Metadata

<html xmlns:ext="http://www.namespace.org/extensibility#">
  <head>
    <base href="https://www.example.org/document.xhtml" />
    <meta name="metadata-selector" content="@namespace ext url('http://www.namespace.org/extensibility#'); [ext|meta]" />
    <meta name="metadata-resource" content="@namespace ext url('http://www.namespace.org/extensibility#'); attr(ext|meta url)" />
    <script id="inline-metadata-123" type="...">...</script>
  </head>
  <body>
    <span ext:meta="#inline-metadata-123">HTML and MathML Content</span>
    <div  ext:meta="external-metadata-124.php">HTML and MathML Content</div>
  </body>
</html>

Provenance

<html xmlns:ext="http://www.namespace.org/extensibility#">
  <head>
    <base href="https://www.example.org/document.xhtml" />
    <meta name="provenance-selector" content="@namespace ext url('http://www.namespace.org/extensibility#'); [ext|provo]" />
    <meta name="provenance-resource" content="@namespace ext url('http://www.namespace.org/extensibility#'); attr(ext|provo url)" />
    <script id="inline-provenance-123" type="...">...</script>
  </head>
  <body>
    <span ext:provo="#inline-provenance-123">HTML and MathML Content</span>
    <div  ext:provo="external-provenance-124.php">HTML and MathML Content</div>
  </body>
</html>

Mathematical Proof

<html xmlns:ext="http://www.namespace.org/extensibility#">
  <head>
    <base href="https://www.example.org/document.xhtml" />
    <meta name="proof-selector" content="@namespace ext url('http://www.namespace.org/extensibility#'); [ext|proof]" />
    <meta name="proof-resource" content="@namespace ext url('http://www.namespace.org/extensibility#'); attr(ext|proof url)" />
    <script id="inline-proof-123" type="...">...</script>
  </head>
  <body>
    <math ext:proof="#inline-proof-123">MathML Content</math>
    <math ext:proof="external-proof-124.php">MathML Content</math>
  </body>
</html>

<html>
  <head>
    <base href="https://www.example.org/document.xhtml" />
    <meta name="proof-selector" content="math.proveable" />
    <link rel="proof-service-provider" href="https://www.services.org/proof" />
  </head>
  <body>
    <math class="proveable">MathML Content</math>
    <math class="proveable">MathML Content</math>
  </body>
</html>

Mathematical Proof Checking

<html>
  <head>
    <base href="https://www.example.org/document.xhtml" />
    <meta name="proof-selector" content="math.proveable" />
    <meta name="proof-resource" content="attr(data-proof url)" />
    <link rel="proof-checking-service-provider" href="https://www.services.org/proof-checking" />
    <script id="inline-proof-123" type="...">...</script>
  </head>
  <body>
    <math class="proveable" data-proof="#inline-proof-123">MathML Content</math>
    <math class="proveable" data-proof="external-proof-124.php">MathML Content</math>
  </body>
</html>

Argumentation

<html xmlns:ext="http://www.namespace.org/extensibility#">
  <head>
    <base href="https://www.example.org/document.xhtml" />
    <meta name="argumentation-selector" content="@namespace ext url('http://www.namespace.org/extensibility#'); [ext|argu]" />
    <meta name="argumentation-resource" content="@namespace ext url('http://www.namespace.org/extensibility#'); attr(ext|argu url)" />
    <script id="inline-argu-123" type="...">...</script>
  </head>
  <body>
    <span ext:argu="#inline-argu-123">HTML and MathML Content</span>
    <div  ext:argu="external-argu-124.php">HTML and MathML Content</div>
  </body>
</html>

Argumentation Checking

<html xmlns:ext="http://www.namespace.org/extensibility#">
  <head>
    <base href="https://www.example.org/document.xhtml" />
    <meta name="argumentation-selector" content="@namespace ext url('http://www.namespace.org/extensibility#'); [ext|argu]" />
    <meta name="argumentation-resource" content="@namespace ext url('http://www.namespace.org/extensibility#'); attr(ext|argu url)" />
    <link rel="argumentation-checking-service-provider" href="https://www.services.org/argumentation-checking" />
    <script id="inline-argu-123" type="...">...</script>
  </head>
  <body>
    <span ext:argu="#inline-argu-123">HTML and MathML Content</span>
    <div  ext:argu="external-argu-124.php">HTML and MathML Content</div>
  </body>
</html>

Types of Resources

A number of types of resources could be involved in document services scenarios.

Service-specific Data Formats

Data could be served in data formats specific to document services, e.g. mathematical proofs and argumentation, and these service-specific formats could be consumed by other document services, e.g. mathematical proof checking and argumentation checking.

Hypertext-embedded Data Formats

Data could be served embedded in HTML documents, e.g. with RDFa or microdata, for simultaneous machine-utilizability and human-readability.

Markup for Informational Messages, Warnings, and Errors

<messages xmlns="..." xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <m kind="info" type="..." about="https://www.example.org/document.xhtml#fact-123">This is an informative message.</m>
  <m kind="info" type="..." about="https://www.example.org/document.xhtml#xpointer(...)">This is an informative message.</m>
  <m kind="info" type="..." about="https://www.example.org/document.xhtml#:~:text=...">This is an informative message.</m>
</messages>

or

<messages xmlns="..." xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <m kind="info" type="..." start="https://www.example.org/document.xhtml#xpointer(...)" end="https://www.example.org/document.xhtml#xpointer(...)">This is an informative message.</m>
</messages>

See also: https://dom.spec.whatwg.org/#concept-range

Document and Document Element Metadata Formats

Information about documents or document elements could be provided in service response data. Example scenarios include providing the word count or reading level of a document or portion of a document.

<response xmlns="...">
  <metadata type="application/rdf+xml">
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:ext="...">
      <rdf:Description rdf:about="https://www.example.org/document.xhtml">
        <ext:wordCount rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1234</ext:wordCount>
      </rdf:Description>
    </rdf:RDF>
  </metadata>
</response>

URL-formulation Formats

Technologies such as OpenSearch utilize XML to describe URL-addressable services, using a curly-brackets-based syntax to specify how URLs should be formed from data. OpenSearch description documents are served with the MIME type of application/opensearchdescription+xml.

An example of OpenSearch markup:

<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/" xmlns:moz="http://www.mozilla.org/2006/browser/search/">
  <ShortName>Wikipedia (en)</ShortName>
  <Description>Wikipedia (en)</Description>
  <Image height="16" width="16" type="image/x-icon">https://en.wikipedia.org/static/favicon/wikipedia.ico</Image>
  <Url type="text/html" method="get" template="https://en.wikipedia.org/w/index.php?title=Special:Search&amp;search={searchTerms}" />
  <Url type="application/x-suggestions+json" method="get" template="https://en.wikipedia.org/w/api.php?action=opensearch&amp;search={searchTerms}&amp;namespace=0" />
  <Url type="application/x-suggestions+xml" method="get" template="https://en.wikipedia.org/w/api.php?action=opensearch&amp;format=xml&amp;search={searchTerms}&amp;namespace=0" />
  <moz:SearchForm>https://en.wikipedia.org/wiki/Special:Search</moz:SearchForm>
</OpenSearchDescription>
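The curly-brackets substitution used by such templates can be sketched as follows. This is a simplified illustration, not a full OpenSearch implementation: `expandUrlTemplate` is a hypothetical name, parameters written as `{name?}` are treated as optional and replaced with an empty string when no value is supplied, and the template is assumed to have already been read out of the XML (so `&amp;` has become `&`).

```typescript
function expandUrlTemplate(template: string, values: Record<string, string>): string {
    return template.replace(
        /\{([A-Za-z][\w:]*)(\?)?\}/g,
        (_match: string, name: string, optional: string | undefined) => {
            if (Object.prototype.hasOwnProperty.call(values, name)) {
                // Percent-encode the value so it is safe inside a query string.
                return encodeURIComponent(values[name]);
            }
            if (optional) {
                return "";
            }
            throw new Error(`Missing required template parameter: ${name}`);
        });
}
```

A document service description could use the same mechanism to say how a request URL is formed from, for example, the text to be checked.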

Web Services Description Language

Web Services Description Language (WSDL) is an XML language for describing Web services. It is served with the MIME type of application/wsdl+xml.

Conclusion

Some ideas were presented towards facilitating document services, such as real-time fact-checking and reasoning-checking, for HTML documents.

These ideas are also expressed in a document available at: https://www.w3.org/community/argumentation/wiki/Document_Services.

There is also, presently, a Document Services Community Group in a proposal stage. If working on these or related ideas is of interest to you, please feel free to support the creation of the group and to join the group (https://www.w3.org/community/groups/proposed/#services).

I look forward to discussing these ideas with you. Feedback, comments, ideas, suggestions are welcomed. Thank you.

johanneswilm commented 3 years ago

I think you have to differentiate between A) checking the contents of a website, for example by sending it to some third-party service, receiving the result, and rendering some user interface on top of the website saying that the contents of this page seem correct or false, and B) using a grammar or spell checker that changes the contents of a contenteditable element automatically. B will almost certainly create gigantic amounts of problems for people writing texts and lead to a lot of frustration. It could still be helpful to have some common way of interacting with these services for spelling and grammar control, but the communication would have to go through the JavaScript editor application, which is then responsible for communicating with the service and for handling what it returns in a sensible way.

johanneswilm commented 3 years ago

Also - why do you return XML? That's generally much more complex to handle in JavaScript than receiving JSON. Take a look at the API of the open source LanguageTool project: https://languagetool.org/http-api/swagger-ui/#!/default/post_check . It returns JSON that is easy to handle.

AdamSobieski commented 3 years ago

Thank you. Based on your recommendation, I’m presently looking at OpenAPI and Swagger. Instead of XML-based, OpenSearch-based, and WSDL-based approaches, the proposed Community Group could explore JSON-based, OpenAPI-based, and Swagger-based approaches. Instead of producing one or more new XML-based formats, the proposed community group could produce one or more JSON schemas.

I’m considering scenarios which involve seamless logging in, session management, and logging out of client-local, on-prem, and remote services. I’m guessing that this is simple to do with OpenAPI and Swagger. In this way, end-users could easily set up, configure, and make use of both free and paid-subscription document services.

With respect to document services, both "document reviewing" and "document authoring" scenarios need to be considered.

By "document reviewing", I mean that end-users might want to spellcheck or grammar-check other authors' documents. In the domain of fact-checking, a document author could make use of document metadata to indicate one or more fact-checking services which they utilized while authoring a document and which they recommend for readers/reviewers to use, saying that, at the time of a document's creation, the document was verified by one or more indicated services. Additionally, users could set up, configure, and make use of their own client-local, on-prem, or remote fact-checking services.

As envisioned, "document reviewing" document services could be utilized via Web browser application menus and/or context menus on document elements or content ranges.

For "document authoring" scenarios, there are more complex cases to consider including cases where document services are desired to provide dynamic results and visualizations as end-users edit and update document content.

johanneswilm commented 3 years ago

By "document reviewing", I mean that end-users might want to spellcheck or grammar-check other authors' documents. In the domain of fact-checking, a document author could make use of document metadata to indicate one or more fact-checking services which they utilized while authoring a document and which they recommend for readers/reviewers to use, saying that, at the time of a document's creation, the document was verified by one or more indicated services. Additionally, users could set up, configure, and make use of their own client-local, on-prem, or remote fact-checking services.

That makes sense.

I still think there is a major distinction between doing this in an editor and doing this on a web page of largely static content. In the case of a review of a document another user has written, both might come in handy: I might want to fact check a static website that I only have read access to. There may be a minor interest in some special cases to also spell and grammar check it even though the user won't be able to modify the document.

Or the user may have been invited by another user to check/review their document and therefore have received write access and look at it within an editing interface. In this second case, the JS editing app might want to hook into these services, but if the services are applied without being registered with the JS editor app (for example by means of a browser plugin that modifies the DOM due to spelling errors), that could actually crash the page and so I would want to avoid that. This has been an issue in the past with at least one of the major commercial services providing this feature. One could for example imagine that the page that includes a JS editing app adds something to the page's header that tells the browser plugin not to make any direct DOM changes and instead to communicate with the JS editor app through a given interface or some such thing.

AdamSobieski commented 3 years ago

Interoperability with JavaScript document editor applications and providing high-level interface specifications make sense. JavaScript document editing applications hooking into extensible sets of document services is an important scenario. Perhaps something like navigator.services, window.services, or document.services.

For both "document authoring" and "document reviewing", I am presently collecting together types of document services: spellchecking, grammar checking, proofreading, fact checking, mathematical proof checking, reasoning checking, and argumentation checking.

For scalability towards reasoning checking and argumentation checking, there could be multiple ranges or selections of referenced content per informational message, warning, or error. When an end-user selects an informational message, warning, or error, so doing could highlight one or multiple ranges or selections of related document content. Hovering over hyperlinks in a message's content could highlight individual ranges or selections and clicking upon such hyperlinks could scroll to individual ranges or selections.

I am also presently considering the shareability of informational messages, warnings, or errors, for example through the Web Share API. Each informational message, warning, or error could, optionally, at a service provider's determination, provide a URL for sharing which would result in a share button being placed upon it in a user interface. For some fact-checking services, then, end-users could easily share, distribute, and disseminate fact-checking information with one another.
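As a rough sketch of the sharing idea, a message that carries a share URL could be turned into the `{ url, title, text }` dictionary that the Web Share API's `navigator.share()` accepts. The `ShareableMessage` shape and its field names are assumptions made for illustration, and `SharePayload` is a local stand-in type so that the sketch does not depend on browser typings.

```typescript
// Hypothetical message shape: only messages with a shareUrl get a share button.
interface ShareableMessage {
    readonly content: string;
    readonly shareUrl?: string;
    readonly shareTitle?: string;
    readonly shareText?: string;
}

// Local stand-in for the dictionary passed to navigator.share().
interface SharePayload {
    url: string;
    title?: string;
    text?: string;
}

// Returns null when the service provider did not supply a URL for sharing.
function toSharePayload(message: ShareableMessage): SharePayload | null {
    if (!message.shareUrl) {
        return null;
    }
    const payload: SharePayload = { url: message.shareUrl };
    if (message.shareTitle !== undefined) {
        payload.title = message.shareTitle;
    }
    if (message.shareText !== undefined) {
        payload.text = message.shareText;
    }
    return payload;
}
```

In a browser that implements the Web Share API, a user interface could show a share button only when the payload is non-null and call `navigator.share(payload)` from the button's click handler.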

AdamSobieski commented 3 years ago

I found some tools which automatically process TypeScript into JSON schemas and so I am sketching in TypeScript.

Here is a rough-draft sketch of what the above descriptive paragraphs might look like in TypeScript:

enum MessageType
{
    info = "info",
    warning = "warning",
    error = "error"
}

interface RangeSerialization
{
    readonly xpathStart: string;
    readonly offsetStart: number;
    readonly xpathEnd: string;
    readonly offsetEnd: number;
}

interface Message
{
    readonly type: MessageType;
    readonly ranges: RangeSerialization[];
    readonly content: string;
    /**
     * @default "text/plain"
     */
    readonly contentType: string;

    /**
     * @default false
     */
    readonly shareable: boolean;
    /**
     * @TJS-format uri
     */
    readonly shareUrl?: string;
    readonly shareTitle?: string;
    readonly shareText?: string;
}

Where, to deserialize a RangeSerialization into a Range, one could do something like:

function deserializeRange(doc: Document, rs: RangeSerialization, resolver: XPathNSResolver): Range
{
    // Resolve both boundary nodes; singleNodeValue is null when an XPath matches nothing.
    const n0 = doc.evaluate(rs.xpathStart, doc, resolver, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
    const n1 = doc.evaluate(rs.xpathEnd, doc, resolver, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
    if (n0 === null || n1 === null)
    {
        throw new Error("RangeSerialization does not resolve against this document");
    }

    const retval = doc.createRange();
    retval.setStart(n0, rs.offsetStart);
    retval.setEnd(n1, rs.offsetEnd);
    return retval;
}

(see also: Rangy and range-serializer)

Any thoughts on the rough-draft Message interface?

lrosenthol commented 3 years ago

@AdamSobieski I think the biggest problem that I see with your proposal is that it only is thinking about HTML authoring and/or publishing, yet it talks about documents. Most documents are authored in various authoring formats - Word, Google Docs, Markdown, etc. - and then published in alternative formats such as HTML, EPUB or PDF. Accordingly, I don't see how your proposal would be able to apply in any of those scenarios.

Looking forward to any clarification.

johanneswilm commented 3 years ago

I second @lrosenthol on that, if the idea is to send over HTML-formatted text and only that.

When using the above-mentioned LanguageTool, I first convert the content to plaintext in the client before it is sent to the server and receive comments back mapped to places within that text, which the client can then map back to the original place in the document and show there.

It would be helpful to be able to insert some placeholders within that plaintext to communicate to the server app that there are things that act like a linebreak or a space or similar without actually being a space. For example, in my editor, I have inline formulas and inline images to deal with. If I just send in the text to be corrected and leave these out altogether, the spell checker will then think that the words before and after the inline image are just one word. So what I currently do is add another space in place of the image and make sure to take that into consideration when the spell checker returns its result. I then also need to check that there are no corrections for that space, as I then need to remove those before the result is presented to the end-user. And because it's not quite correct to have a space there, the corrections may actually also be influenced by that - especially the grammar corrections. So instead of a space - maybe define one particular Unicode character as being an inline placeholder for non-editable content?
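The placeholder approach described above could be sketched roughly as follows. The choice of U+FFFC (OBJECT REPLACEMENT CHARACTER) as the placeholder is an assumption of this sketch, not something agreed in this thread, and the `Segment` model with its `nodeId` field is a hypothetical stand-in for whatever an editor uses to identify its content nodes.

```typescript
// Assumed placeholder: U+FFFC OBJECT REPLACEMENT CHARACTER.
const PLACEHOLDER = "\uFFFC";

// Hypothetical editor content model: runs of text and non-editable inline objects.
type Segment =
    | { readonly kind: "text"; readonly nodeId: string; readonly text: string }
    | { readonly kind: "object"; readonly nodeId: string };

interface Flattened {
    readonly plaintext: string;
    readonly starts: number[];   // start offset of each segment in plaintext
}

// Build the plaintext sent to the service, one placeholder per inline object.
function flatten(segments: Segment[]): Flattened {
    let plaintext = "";
    const starts: number[] = [];
    for (const s of segments) {
        starts.push(plaintext.length);
        plaintext += s.kind === "text" ? s.text : PLACEHOLDER;
    }
    return { plaintext, starts };
}

interface DocumentPosition {
    readonly nodeId: string;
    readonly offset: number;
}

// Map an offset in the plaintext back to a segment and an offset within it.
function locate(segments: Segment[], flat: Flattened, offset: number): DocumentPosition | null {
    if (offset < 0 || offset >= flat.plaintext.length) {
        return null;
    }
    let i = flat.starts.length - 1;
    while (i > 0 && flat.starts[i] > offset) {
        i--;
    }
    return { nodeId: segments[i].nodeId, offset: offset - flat.starts[i] };
}

// True when a returned correction covers a placeholder and should be discarded.
function touchesPlaceholder(flat: Flattened, offset: number, length: number): boolean {
    return flat.plaintext.slice(offset, offset + length).indexOf(PLACEHOLDER) >= 0;
}
```

With a dedicated placeholder, the client no longer has to guess whether a correction concerns a real space or a stand-in for an image or formula.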

AdamSobieski commented 3 years ago

Thank you. As I think about the points raised, I hope that consensus forms in the proposed community group towards designing architecture, APIs, and protocols for document services capable of processing multiple MIME types as input. Protocols, in this case, would include a means of asking document service providers which MIME types they can process as input.

Relevant MIME types include: text/html or application/xhtml+xml for hypertext content as well as text/plain, text/markdown, text/wiki, application/epub+zip, application/pdf, and application/vnd.openxmlformats-officedocument.wordprocessingml.document.
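Once a client has learned which MIME types a provider accepts, choosing what to send could be as simple as the following sketch. How the client asks the provider is left open here, and `chooseInputType` is a hypothetical name. The client's list is assumed to be ordered from most to least preferred.

```typescript
// Pick the client's most preferred MIME type that the service accepts.
// Comparison ignores case and surrounding whitespace; parameters such as
// "; charset=utf-8" are not handled in this sketch.
function chooseInputType(clientCanSend: string[], serviceAccepts: string[]): string | null {
    const accepted = new Set(serviceAccepts.map(t => t.trim().toLowerCase()));
    for (const candidate of clientCanSend) {
        if (accepted.has(candidate.trim().toLowerCase())) {
            return candidate;
        }
    }
    return null;
}
```

A client that can produce both markup and plaintext might list `application/xhtml+xml` and `text/html` ahead of `text/plain`, falling back to plaintext only for services that accept nothing richer.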

I have recently been exploring the Web Annotation Data Model, in particular its discussion of selectors, with respect to describing selections of content from both text-based and XML-based document formats.
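For text-based formats, two of the selectors defined by the Web Annotation Data Model are the text quote selector (`exact`, with optional `prefix` and `suffix`) and the text position selector (`start` and `end`). A minimal sketch of resolving the former into the latter against plain text might look like this. `resolveQuote` is a hypothetical helper, it returns only the first match, and it counts UTF-16 code units, which agree with the data model's character positions only when the text contains no characters outside the Basic Multilingual Plane.

```typescript
interface TextQuoteSelector {
    readonly type: "TextQuoteSelector";
    readonly exact: string;
    readonly prefix?: string;
    readonly suffix?: string;
}

interface TextPositionSelector {
    readonly type: "TextPositionSelector";
    readonly start: number;   // inclusive
    readonly end: number;     // exclusive
}

// Find the quoted text, using prefix and suffix to disambiguate repeats.
function resolveQuote(text: string, quote: TextQuoteSelector): TextPositionSelector | null {
    const prefix = quote.prefix ?? "";
    const suffix = quote.suffix ?? "";
    const at = text.indexOf(prefix + quote.exact + suffix);
    if (at < 0) {
        return null;
    }
    const start = at + prefix.length;
    return { type: "TextPositionSelector", start, end: start + quote.exact.length };
}
```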

Presently, document service providers (e.g. spellchecking, grammar checking, proofreading, fact checking, mathematical proof checking, reasoning checking, argumentation checking, and narrative checking) each have to make and maintain their own browser extensions, and these extensions are not necessarily interoperable. The proposed community group intends to create general-purpose architecture, APIs, and protocols for both free and paid-subscription-based document services, for the convenience of document service providers and to equip and empower end-users, who could then make use of multiple document services simultaneously to better author and review documents.

lrosenthol commented 3 years ago

@AdamSobieski A few more comments...

While I think your concept is very interesting, I am not convinced that you can develop a server-agnostic API, specifically the aspect of the return payload as each type of service will need to return different information in different serializations.

johanneswilm commented 3 years ago

While I think your concept is very interesting, I am not convinced that you can develop a server-agnostic API, specifically the aspect of the return payload as each type of service will need to return different information in different serialization.

What should be possible is to have a common way of expressing that there are corrections attached to certain content and how to get to that content.
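One possible shape for such a common expression of corrections, sketched here as an assumption rather than a proposal, is an offset and length into the submitted text plus a list of suggested replacements. The `Correction` interface and `applyFirstReplacements` function are hypothetical names. The function applies the first suggestion of each correction, working from the end of the text so that earlier offsets stay valid.

```typescript
// Hypothetical service-agnostic correction: where it applies and what to offer.
interface Correction {
    readonly offset: number;
    readonly length: number;
    readonly replacements: string[];
    readonly message: string;
}

function applyFirstReplacements(text: string, corrections: Correction[]): string {
    // Highest offset first, so that applying one edit never shifts another.
    const ordered = corrections
        .filter(c => c.replacements.length > 0)
        .sort((a, b) => b.offset - a.offset);

    let result = text;
    let limit = text.length;   // start of the most recently applied correction
    for (const c of ordered) {
        // Skip corrections that are out of range or overlap one already applied.
        if (c.offset < 0 || c.length < 0 || c.offset + c.length > limit) {
            continue;
        }
        result = result.slice(0, c.offset) + c.replacements[0] + result.slice(c.offset + c.length);
        limit = c.offset;
    }
    return result;
}
```

An editor application would more likely present the replacements to the user than apply them blindly, but the same offset bookkeeping applies.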

As for formats, I am thinking that plaintext with markers for non-editable content should usually be enough. You can then have for example a browser extension that converts HTML to that plaintext-based format, and one browser extension can be reused for several alternative spell checking services as they all communicate through the same API. You could possibly even use the exact same API for all those other services mentioned.

What would the benefit be of the API being able to respond with EPUB and PDF content? Would it not be simpler to just have the PDF editor/viewer convert some text to plaintext and then implement in a PDF-specific way how the results are mapped back to the content?

lrosenthol commented 3 years ago

plaintext with markers for non-editable content should usually be enough

Using only Plaintext with markers is (for the most part) the equivalent of tying the software's hands behind its back. It removes all other useful information (eg. styling, semantics, etc.) in the understanding of the content.

What would the benefit be of the API being able to respond with EPUB and PDF content?

The benefit would be the tool in question being able to (a) analyze the original content and its associated styling & semantics and then (b) return material that contains not only content but also styling and/or semantics. A simple example is that a spell checker knowing that some text is in italics would be a useful clue to treat it as a book title instead of standard text (as the capitalization rules for each are quite different!). A related example, using semantics, would be to know that a piece of text is an <H2>, where again the capitalization rules for headings differ from standard content.

(and before you go suggesting that something like Markdown be used, let me remind you that Markdown is NOT standardized and there are a myriad of variants...and you'd need an existing standard to refer to)

johanneswilm commented 3 years ago

plaintext with markers for non-editable content should usually be enough

Using only Plaintext with markers is (for the most part) the equivalent of tying the software's hands behind its back. It removes all other useful information (eg. styling, semantics, etc.) in the understanding of the content.

Ok, could you help me understand. Do you insert the styling into the spell- and grammarchecker then? Are there spell checkers as of now that take formatting such as bold and font-size 28px into consideration?

What would the benefit be of the API being able to respond with EPUB and PDF content?

The benefit would be the tool in question being able to (a) analyze the original content and its associated styling & semantics and then (b) return material that contains not only content but also styling and/or semantics. A simple example is that a spell checker knowing that some text is in italics would be a useful clue to treat it as a book title instead of standard text (as the capitalization rules for each are quite different!). A related example, using semantics, would be to know that a piece of text is an <H2>, where again the capitalization rules for headings differ from standard content.

Is that happening today? I would see lots of problems with such a spell checker if the editing application cannot control it.

(and before you go suggesting that something like Markdown be used, let me remind you that Markdown is NOT standardized and there are a myriad of variants...and you'd need an existing standard to refer to)

I am not suggesting markdown. But ok, let's say that there are spell checkers out there that take font color, italics, bold, etc. into consideration. Would it not be enough to just support one richtext format - for example a simplified version of HTML? Or why would it also need to be able to read PDF-syntax?

johanneswilm commented 3 years ago

Here is another spell checker with an API. Similarly to languagetool, it looks to me as if they are only checking plaintext. The response looks like it is in a similar format. https://www.grammarbot.io/quickstart
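To illustrate what working with a plaintext-only response could look like on the editor side, here is a minimal sketch. The `Match` shape (an offset and length into the submitted text, a message, and a list of replacements) is an assumption made for illustration and is not taken from either service's documentation; `applyFirstReplacements` is a hypothetical helper.

```typescript
// Hypothetical shape of one finding in a plaintext checker's response.
interface Match {
  offset: number;         // start of the flagged text in the submitted string
  length: number;         // number of UTF-16 code units flagged
  message: string;        // human-readable explanation
  replacements: string[]; // suggested corrections, best first
}

// Apply the first suggestion of every match. Matches are processed from the
// end of the text towards the start so that earlier offsets stay valid.
// Assumes the matches do not overlap.
function applyFirstReplacements(text: string, matches: Match[]): string {
  const ordered = matches
    .filter((m) => m.replacements.length > 0)
    .sort((a, b) => b.offset - a.offset);
  let result = text;
  for (const m of ordered) {
    result =
      result.slice(0, m.offset) + m.replacements[0] + result.slice(m.offset + m.length);
  }
  return result;
}

const checked = applyFirstReplacements("Teh cat sat on teh mat.", [
  { offset: 0, length: 3, message: "Possible spelling mistake", replacements: ["The"] },
  { offset: 15, length: 3, message: "Possible spelling mistake", replacements: ["the"] },
]);
```

Because the offsets refer to the plaintext that was sent, the editor stays in control of how a correction is mapped back onto its own rich-text model.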

lrosenthol commented 3 years ago

Are there spell checkers as of now that take formatting such as bold and font-size 28px into consideration?

Absolutely! The one built into MSOffice, for example, uses styling and semantics to provide a better experience.

What would the benefit be of the API being able to respond with EPUB and PDF content?

In the case of spell checking, it might not, but for many of those other use cases the ability to provide the changes natively in the source format would be critical. For example, consider @AdamSobieski 's example of Math Proof Checking. You would want/need a system that can understand the way that math is encoded in various formats (eg. LaTeX vs. MathML vs. ...), determine any necessary corrections, and then modify the content to provide the corrected version (again, in the correct format).

Is that happening today? I would see lots of problems with such a spell checker if the editing application cannot control it.

Today this smart checking only happens in the context of the authoring application. As you note, there are a variety of open solutions (be they APIs or open source libraries) that are text only - not because it's the right thing to do, but because it is (a) easy and (b) avoids issues with various formats. But as noted, it also means the worst possible results.

Would it not be enough to just support one richtext format - for example a simplified version of HTML?

So who would be responsible for defining that "one richtext format"? How do you (or do you?) keep it up to date with the core specification (eg. HTML - the "living standard")?

johanneswilm commented 3 years ago

Absolutely! The one built into MSOffice, for example, uses styling and semantics to provide a better experience.

Ok, so can we isolate what kind of information that spell checker gets access to? Are there others we need to look at so we can make a list of common things they look at?

As someone who has an editor, I'd be interested in how I should interact with such a spell checker. If I make a certain word blue - does that communicate a sad tone to the spell checker? If that's the case, then maybe instead I should tell the spell checker that the text is yellow so that it gets a different idea if my web app knows that the text is the content of a greeting card.

I wonder if maybe instead of me sending over HTML with text that is styled in a specific color because I am guessing that the spell checker will interpret it this or that way, it would be better if I could communicate directly to the spell checker that this text is to have this or that mood. What is the advantage of going through the formatting step rather than communicating semantic information directly?
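To make the question concrete, this is roughly what I mean by communicating semantic information directly. It is only a sketch: the `CheckRequest` shape, the `SemanticRole` values, and `buildCheckRequest` are all hypothetical and not part of any existing checker's API.

```typescript
// Hypothetical roles an editor could declare explicitly, instead of sending
// styling that the checker would have to interpret.
type SemanticRole = "heading" | "title-of-work" | "body" | "non-editable";

interface SemanticHint {
  start: number; // inclusive offset into `text`
  end: number;   // exclusive offset into `text`
  role: SemanticRole;
}

interface CheckRequest {
  text: string;     // plain text to be checked
  language: string; // e.g. "en-US"
  hints: SemanticHint[];
}

interface Segment {
  content: string;
  role: SemanticRole;
}

// Concatenate the editor's segments into one plaintext string and record
// where each non-body role applies.
function buildCheckRequest(segments: Segment[], language: string): CheckRequest {
  let text = "";
  const hints: SemanticHint[] = [];
  for (const segment of segments) {
    const start = text.length;
    text += segment.content;
    if (segment.role !== "body") {
      hints.push({ start, end: text.length, role: segment.role });
    }
  }
  return { text, language, hints };
}

const request = buildCheckRequest(
  [
    { content: "I just finished ", role: "body" },
    { content: "The Old Man and the Sea", role: "title-of-work" },
    { content: ".", role: "body" },
  ],
  "en-US",
);
```

Here the editor says "this range is the title of a work" rather than "this range is italic", so the checker does not have to guess what the formatting was meant to convey.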

You would want/need a system that can understand the way that math is encoded in various formats (eg. LaTeX vs. MathML vs. ...), determine any necessary corrections, and then modify the content to provide the corrected version (again, in the correct format).

I agree that you probably need this for math formulas in some of the common formats, as it is not straightforward to translate between them. I'm not sure you can translate everything with certainty between OMML, MathML and LaTeX. But that's a slightly different question from whether you need to be able to support PDF, HTML, XHTML, EPUB, and maybe even RTF, DOCX and five other commonly used formats. Does that Microsoft spell checker that takes styling into consideration process information that cannot be expressed as HTML/CSS as it is now? Are there other spell checkers that do?

So who would be responsible for defining that "one richtext format"? How do you (or do you?) keep it up to date with the core specification (eg. HTML - the "living standard")?

I understand that this is an issue that is specific to formatting-based spelling and grammar checkers, right? Surely there are some limits to how much HTML they need to understand. Do they need to understand SVGs, for example? How about PNGs or canvas elements? If there is a chance that they want all that information, OK, try to send the entire HTML. But if you go as far as to say the server needs to accept any of ten different formats, then I have a hard time seeing who will be able and willing to create a spell checker that complies with the full spec.

AdamSobieski commented 3 years ago

While I think your concept is very interesting, I am not convinced that you can develop a server-agnostic API, specifically the aspect of the return payload as each type of service will need to return different information in different serializations.

While the list of document services is lengthy and open-ended, interestingly, there are, thus far, four varieties of document services considered:

Firstly, there is the variety of document service which provides metadata about documents, document elements, or ranges of content. For example, such metadata could be word count or reading level.

Secondly, there is the variety of document service which returns typed annotations with respect to documents, document elements, or ranges of content. For example, typed annotations could be categorized into informational messages, warnings, and errors.

Thirdly, there is the variety of document service which offers corrections, recommendations, or options for end-users. Examples include spelling and grammar checking.

Fourthly, there is the variety of document service which provides interactive diagrams, graphs, visualizations, or reports about documents, document elements, or ranges of content. This most recent variety is based on observing a service which automatically produces multimedia reports (http://authors.ai).
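As a rough illustration of how the four varieties might share one return envelope while carrying different payloads, here is a minimal TypeScript sketch. Every name in it (`DocumentServiceResult`, `ContentRange`, the field names, and `describeResult`) is hypothetical and only intended to make the discussion concrete.

```typescript
// A range of content within a document, as character offsets (hypothetical).
interface ContentRange {
  start: number;
  end: number;
}

type Severity = "info" | "warning" | "error";

// First variety: metadata such as word count or reading level.
interface MetadataResult {
  kind: "metadata";
  range: ContentRange;
  name: string;
  value: string | number;
}

// Second variety: typed annotations (informational messages, warnings, errors).
interface AnnotationResult {
  kind: "annotation";
  range: ContentRange;
  severity: Severity;
  message: string;
}

// Third variety: corrections, recommendations, or options for end-users.
interface SuggestionResult {
  kind: "suggestion";
  range: ContentRange;
  message: string;
  replacements: string[];
}

// Fourth variety: diagrams, graphs, visualizations, or reports.
interface ReportResult {
  kind: "report";
  range: ContentRange;
  title: string;
  mediaType: string; // e.g. "text/html" or "image/svg+xml"
  url: string;
}

type DocumentServiceResult =
  | MetadataResult
  | AnnotationResult
  | SuggestionResult
  | ReportResult;

// A client can branch on `kind` to decide how to present each result.
function describeResult(result: DocumentServiceResult): string {
  switch (result.kind) {
    case "metadata":
      return `${result.name} = ${result.value}`;
    case "annotation":
      return `${result.severity}: ${result.message}`;
    case "suggestion":
      return `${result.message} (try: ${result.replacements.join(", ")})`;
    case "report":
      return `${result.title} [${result.mediaType}]`;
    default: {
      // Compile-time check that every variety is handled above.
      const unreachable: never = result;
      return unreachable;
    }
  }
}

const wordCount = describeResult({
  kind: "metadata",
  range: { start: 0, end: 120 },
  name: "wordCount",
  value: 18,
});
```

The discriminating `kind` field lets a user agent route metadata to a status bar, annotations to a table view, suggestions to a contextual panel, and reports to a viewer, without the envelope itself changing per service.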

AdamSobieski commented 3 years ago

@lrosenthol , I edited the previous message in this thread, per your recommendation, to clarify that EPUB is included.

Also, I started sketching, in C# and TypeScript, some APIs and interfaces for the four types of document service providers, starting with the second type: IAnnotationServiceProvider, which happens to also be relevant to a related project proposal.

With respect to the Document Services Community Group, the new group has launched, and the work of inviting participants and developing the architecture, API, and protocols now begins.

AdamSobieski commented 2 years ago

I would like to follow up to indicate that I recently came across the Language Server Protocol. I am exploring its specification while considering that something based on it could be applicable to natural-language document-processing scenarios.
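As a rough sketch of how this could carry over, a natural-language finding can be expressed in the shape of a Language Server Protocol diagnostic, which has a line/character range, a numeric severity, a source, and a message. The types below are re-declared locally rather than imported from any package, and `Finding`, `positionAt`, and `toDiagnostic` are hypothetical names introduced only for illustration.

```typescript
// Local declarations modelled on the LSP diagnostic structure.
interface Position {
  line: number;      // zero-based line
  character: number; // zero-based offset within the line (UTF-16 code units)
}

interface Range {
  start: Position;
  end: Position;
}

const DiagnosticSeverity = { Error: 1, Warning: 2, Information: 3, Hint: 4 } as const;
type DiagnosticSeverityValue = (typeof DiagnosticSeverity)[keyof typeof DiagnosticSeverity];

interface Diagnostic {
  range: Range;
  severity?: DiagnosticSeverityValue;
  source?: string;
  message: string;
}

// Hypothetical offset-based finding from a document service.
interface Finding {
  offset: number;
  length: number;
  message: string;
  severity: DiagnosticSeverityValue;
}

// Convert a character offset into a line/character position.
// Assumes "\n" line endings; a lone "\r" is not treated as a line break here.
function positionAt(text: string, offset: number): Position {
  const clamped = Math.max(0, Math.min(offset, text.length));
  let line = 0;
  let lineStart = 0;
  for (let i = 0; i < clamped; i++) {
    if (text[i] === "\n") {
      line++;
      lineStart = i + 1;
    }
  }
  return { line, character: clamped - lineStart };
}

function toDiagnostic(text: string, finding: Finding, source: string): Diagnostic {
  return {
    range: {
      start: positionAt(text, finding.offset),
      end: positionAt(text, finding.offset + finding.length),
    },
    severity: finding.severity,
    source,
    message: finding.message,
  };
}

const sampleText = "Teh cat sat.\nIt was hapy.";
const diagnostic = toDiagnostic(
  sampleText,
  {
    offset: sampleText.indexOf("hapy"),
    length: 4,
    message: "Possible spelling mistake",
    severity: DiagnosticSeverity.Warning,
  },
  "spell-checker",
);
```

The informational message, warning, and error pattern discussed earlier in this thread maps onto the Information, Warning, and Error severities; whether the rest of the protocol fits prose documents is the open question.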

Any thoughts on this?

LJWatson commented 5 months ago

@AdamSobieski thank you for your thorough proposal, and @johanneswilm and @lrosenthol for your comments. Is there any update from the CG that was created?

AdamSobieski commented 5 months ago

@LJWatson, hello. Unfortunately, that CG closed due to inactivity.

On a related note: I found a newer WICG proposal, #136, for adding a type attribute to <textarea> elements. This newer proposal also pertains to providing users with services for text-based contents, including and beyond natural-language spellchecking and grammar-checking. As I understand the idea and proposal, a type attribute would allow text-entry and text-editing elements, such as <textarea>, to provide syntax-specific features, e.g., for markdown.