New attribute to control UA-provided writing assistance

bmathwig commented 1 year ago

The current specification allows for the autocomplete attribute to exist on elements of type <input>, <textarea>, and <select>. With the rise in popularity of rich text controls using contenteditable, we should consider allowing elements that have contenteditable=true to utilize the autocomplete attribute. While not a common scenario within the scope of form fields, there are applications for text hinting and autofill within contenteditable elements.
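To make the proposed change concrete, here is a minimal sketch of the eligibility rule described above. The function name `isEligibleUnderProposal` is hypothetical, and the check is deliberately simplified: it looks only at the element's own contenteditable attribute and ignores inherited editability, designMode, and the per-type restrictions that apply to <input>.

```typescript
// Sketch only: which elements could carry the autocomplete attribute
// if editing hosts were added to the existing list of form controls.
function isEligibleUnderProposal(
  tagName: string,
  contentEditableAttr: string | null
): boolean {
  const tag = tagName.toLowerCase();
  // Elements the current specification already allows.
  if (tag === "input" || tag === "textarea" || tag === "select") {
    return true;
  }
  // Assumption: an element counts as an editing host when its own
  // contenteditable attribute is "", "true", or "plaintext-only".
  if (contentEditableAttr === null) {
    return false;
  }
  const value = contentEditableAttr.toLowerCase();
  return value === "" || value === "true" || value === "plaintext-only";
}
```

Under this sketch, a <div contenteditable="true"> would be eligible, while a plain <div> or a <div contenteditable="false"> would not.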

One existing example of a form field being replaced by contenteditable exists in the example section of its specification.

annevk commented 1 year ago

cc @whatwg/forms

bmathwig commented 1 year ago

Microsoft is interested in implementing this in Edge and Chromium :)

zcorpan commented 1 year ago

cc @DimiDL @galich @masayuki-nakano

zcorpan commented 1 year ago

Autocomplete works differently on different form controls, see https://html.spec.whatwg.org/#inappropriate-for-the-control

Editing hosts don't have a way to signal what kind of input is accepted (e.g. single line vs multiline). Which "groups" should editing hosts be part of? All of them?

annevk commented 1 year ago

WebKit is also interested in this.

(For maximum clarity, Chromium and Edge count as a single implementer for WHATWG purposes.)

bmathwig commented 1 year ago

Autocomplete works differently on different form controls, see https://html.spec.whatwg.org/#inappropriate-for-the-control

Editing hosts don't have a way to signal what kind of input is accepted (e.g. single line vs multiline). Which "groups" should editing hosts be part of? All of them?

I think contenteditable will always be a Text-Multiline host. I can't think of any cases where the other groups would apply. We also may want to expand the Field Name to include additional types of content in the future.

bmathwig commented 1 year ago

Here is our proposal to adjust the wording of 4.10.18.7.1 Autofill to allow for editing host elements to be eligible for autofill and the autocomplete attribute.

https://github.com/MicrosoftEdge/MSEdgeExplainers/blob/main/AutocompleteContentEditable/explainer.md

sanketj commented 1 year ago

@annevk, @domenic, @mfreed7, @zcorpan: Curious to hear your thoughts on the proposal Ben shared above.

mfreed7 commented 1 year ago

@annevk, @domenic, @mfreed7, @zcorpan: Curious to hear your thoughts on the proposal Ben shared above.

Seems reasonable, but I'm not an autocomplete expert. I do worry about the leakage of sensitive information, if autocomplete can more easily be tricked into filling general <div> with PII. But as mentioned, that risk already exists with <input> so I'm not sure why this would be worse. @battre would have better input from Chrome's side.

annevk commented 1 year ago

What are the considerations around events? contenteditable involves quite a bit more events, do any need to be simulated when autofilling? The solution needs to address that somehow.

cc @johanneswilm

johanneswilm commented 1 year ago

@bmathwig What kind of input type are you thinking of using for the corresponding beforeinput and input events? And how does this fit with Microsoft's plans to replace a lot of contenteditable usage with EditContext, which Microsoft is also working on?

In the examples mentioned in your proposal, where would the suggestions of autocompletion for the code editor come from? Is this one of the existing autocomplete types like address, etc.? Would it work in the middle of an element with other content preceding and following it or will it replace the entire content of the contenteditable element?

johanneswilm commented 1 year ago

@bmathwig Also, will the auto-complete text that is stored in the browser contain richtext itself? So if the user fills in their address in one place and uses <b> around the last name, will there be a <b>-tag around the last name when reinserting it somewhere else? If yes, how will that work if inserting into a different website where the editor uses <strong> instead of <b>? And how about code editors where styling is used differently from editor to editor?

battre commented 1 year ago

tl;dr: I have a couple of concerns about this proposal which basically boil down to the point that the proposal endorses the use of <div contenteditable> for something that's semantically a form control but does not feel and behave like a form control anymore. I would prefer if websites used form controls for forms and <div contenteditable> for editable content. Otherwise, I think that autofill may work worse than today.

Here are the details.

We observe that

the majority of <input> elements don't have autocomplete attributes,
the current autocomplete spec is not expressive enough for addresses in most countries.

Chrome compensates for these problems as well as possible by running heuristics in the browser and crowdsourcing. My concern is that if <div contenteditable> becomes the new <input> (either because libraries use it or because it's the new best practice you find on stackoverflow), we may lose the capability to classify fields.

Semantic grouping: Today most form controls that belong together are semantically grouped via <form> tags. I expect that this would become less the case if people don't think in terms of forms but in terms of <div contenteditable>s because they cannot be associated with a <form> tag and look and feel like layout components, not like form components. With the loss of <form> tags our client-side heuristics would struggle to find the boundaries between semantically unrelated forms (search box, login form, sign-up form, shipping address form, chat box, ...) which can co-exist on a website. (Not a new problem but one that will become more pronounced).
Loss of signals for heuristics: Developer documentation for <input> elements suggests assigning name attributes to fields and we see that developers do this a lot (even though they may submit the data via Fetch - after all the Internet is made of copy&paste from tutorials ;-)). This gives us semantic hints about the meaning of fields. With the loss of <form> tags and name attributes, Chrome would lose the capability to do meaningful crowdsourcing of field semantics (a "form" becomes harder to reference) and the heuristics would lose an important signal that helps assign meaning to a field. (Not a new problem but one that will become more pronounced).
Form submission detection is hard if we don't have a <form> that's POSTed via a submit(). We have built many complex heuristics as proxies for candidates for form submission events, such as checking whether a <form> is taken out of the DOM or made invisible. This, again, would become more brittle if we didn't have <form>s. With the loss of <form> tags it would become increasingly difficult to see submissions, which we use to ask the user whether they want to save their password, credit card, address, ...

In summary, I believe that we are better off if fields that are semantically parts of a form remain form controls.

If <textarea> is not styleable enough, could we introduce <textarea richcontent> or something like that, which remains a form control and is associated with a <form> but can have DOM children like a <div contenteditable>? That might be nice from the perspective of posting a form with Fetch and would be in line with <selectlist>s, which make <form>s more powerful rather than pushing users to custom solutions built from <div>s.

All that said, @johanneswilm raises a lot of good questions that are also unclear to me and would pertain to such a <textarea richcontent>.

sanketj commented 1 year ago

tl;dr: I have a couple of concerns about this proposal which basically boil down to the point that the proposal endorses the use of <div contenteditable> for something that's semantically a form control but does not feel and behave like a form control anymore. I would prefer if websites used form controls for forms and <div contenteditable> for editable content. Otherwise, I think that autofill may work worse than today.

Here are the details.

We observe that

the majority of <input> elements don't have autocomplete attributes,

the current autocomplete spec is not expressive enough for addresses in most countries.

Chrome compensates for these problems as well as possible by running heuristics in the browser and crowdsourcing. My concern is that if <div contenteditable> becomes the new <input> (either because libraries use it or because it's the new best practice you find on stackoverflow), we may lose the capability to classify fields.

Semantic grouping: Today most form controls that belong together are semantically grouped via <form> tags. I expect that this would become less the case if people don't think in terms of forms but in terms of <div contenteditable>s because they cannot be associated with a <form> tag and look and feel like layout components, not like form components. With the loss of <form> tags our client-side heuristics would struggle to find the boundaries between semantically unrelated forms (search box, login form, sign-up form, shipping address form, chat box, ...) which can co-exist on a website. (Not a new problem but one that will become more pronounced).

Loss of signals for heuristics: Developer documentation for <input> elements suggests assigning name attributes to fields and we see that developers do this a lot (even though they may submit the data via Fetch - after all the Internet is made of copy&paste from tutorials ;-)). This gives us semantic hints about the meaning of fields. With the loss of <form> tags and name attributes, Chrome would lose the capability to do meaningful crowdsourcing of field semantics (a "form" becomes harder to reference) and the heuristics would lose an important signal that helps assign meaning to a field. (Not a new problem but one that will become more pronounced).

Form submission detection is hard if we don't have a <form> that's POSTed via a submit(). We have built many complex heuristics as proxies for candidates for form submission events, such as checking whether a <form> is taken out of the DOM or made invisible. This, again, would become more brittle if we didn't have <form>s. With the loss of <form> tags it would become increasingly difficult to see submissions, which we use to ask the user whether they want to save their password, credit card, address, ...

In summary, I believe that we are better off if fields that are semantically parts of a form remain form controls.

If <textarea> is not styleable enough, could we introduce <textarea richcontent> or something like that, which remains a form control and is associated with a <form> but can have DOM children like a <div contenteditable>? That might be nice from the perspective of posting a form with Fetch and would be in line with <selectlist>s, which make <form>s more powerful rather than pushing users to custom solutions built from <div>s.

All that said, @johanneswilm raises a lot of good questions that are also unclear to me and would pertain to such a <textarea richcontent>.

Thanks @battre! The intent of this proposal is not to make contenteditable elements targets for form fill (although it technically already can be today - see below). Rather, it is to extend the scope of the autocomplete attribute beyond just form autofill scenarios. For editable regions, the use cases for autocomplete are mainly for writing assistance to allow the user to write faster, not necessarily for filling forms.

There are a couple of subtleties in terms of interactions with form elements:

For text control elements (ex. textarea), UAs may use autocomplete for both writing assistance and form fill use cases.
It is technically possible for a contenteditable to be form associated if it is part of a form-associated custom element (https://html.spec.whatwg.org/multipage/forms.html#categories). For such cases, similar to other text control elements, UAs may use autocomplete on contenteditables to signal both writing assistance and form fill.

An alternative solution could be to create a new attribute to support "autocomplete for writing assistance" scenarios. However, since these are also autocompletion scenarios, it would be ideal to just reuse the existing autocomplete attribute.

sanketj commented 1 year ago

What kind of input type are you thinking of using for the corresponding beforeinput and input events?

Autocompletion on input elements fires these events today. I would expect them to fire similarly for contenteditables.

In the examples mentioned in your proposal, where would the suggestions of autocompletion for the code editor come from? Is this one of the existing autocomplete types like address, etc.? Would it work in the middle of an element with other content preceding and following it or will it replace the entire content of the contenteditable element?

For existing autocomplete scenarios, the spec doesn't prescribe where the autocompletion comes from. Neither does it specify whether the autofilled content should replace existing content or just be inserted. This is left up to the UA. For writing assistance scenarios, it also seems reasonable to leave this up to the UA.

Also, will the auto-complete text that is stored in the browser contain richtext itself? So if the user fills in their address in one place and uses <b> around the last name, will there be a <b>-tag around the last name when reinserting it somewhere else? If yes, how will that work if inserting into a different website where the editor uses <strong> instead of <b>? And how about code editors where styling is used differently from editor to editor?

Yes, preserving rich text does seem quite tricky to get right and unclear how useful it would be. Do you have scenarios in mind where this might be desirable? Storing and inserting the autocomplete text as plain text seems sufficient.

johanneswilm commented 1 year ago

For existing autocomplete scenarios, the spec doesn't prescribe where the autocompletion comes from. Neither does it specify whether the autofilled content should replace existing content or just be inserted. This is left up to the UA. For writing assistance scenarios, it also seems reasonable to leave this up to the UA.

In that case, I would think it's a bad idea to try to add this to contenteditable. Contenteditable elements are generally controlled by thousands of lines of JavaScript that try to ensure that similar markup is produced across all platforms and browsers. Firefox was the last browser to remove some major elements that worked differently from the other browsers (table controls). Introducing new issues that work differently doesn't seem like a good idea.

That's different for input[type=text] and textarea elements as they produce simple text. Even if the same text is first edited in one UA and then another, there is generally no problem if the UAs behave somewhat differently (with the exception of line endings in some scenarios, but those can be fixed with a single line of JavaScript).
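As a rough illustration of the line-ending fix mentioned above, here is a sketch in TypeScript. The function name `normalizeLineEndings` is hypothetical, and whether CRLF and lone CR are the endings that need normalizing depends on the scenario.

```typescript
// Sketch only: collapse CRLF and lone CR line endings to LF so that
// plain text edited in different UAs compares equal.
function normalizeLineEndings(value: string): string {
  return value.replace(/\r\n?/g, "\n");
}
```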

Because JS editors are such large programs, they also have plugins that provide auto-completion [1] tailored to a specific type of content. It seems like it would be difficult to create a one-size-fits-all model that is UA specific to replace all of these.

[1] For example https://ckeditor.com/cke4/addon/autocomplete or https://github.com/curvenote/editor/tree/main/packages/prosemirror-autocomplete

sanketj commented 1 year ago

What are the considerations around events? contenteditable involves quite a bit more events, do any need to be simulated when autofilling? The solution needs to address that somehow.

What categories of events are you referring to? It seems like eventing for autocomplete should work similarly to the user just replacing/inserting that content via manual input. Events for input, composition, etc. are already fired on text control elements like this; I would expect those events to also work the same way for contenteditables.

sanketj commented 1 year ago

For existing autocomplete scenarios, the spec doesn't prescribe where the autocompletion comes from. Neither does it specify whether the autofilled content should replace existing content or just be inserted. This is left up to the UA. For writing assistance scenarios, it also seems reasonable to leave this up to the UA.

In that case, I would think it's a bad idea to try to add this to contenteditable. Contenteditable elements are generally controlled by thousands of lines of JavaScript that try to ensure that similar markup is produced across all platforms and browsers. Firefox was the last browser to remove some major elements that worked differently from the other browsers (table controls). Introducing new issues that work differently doesn't seem like a good idea.

That's different for input[type=text] and textarea elements as they produce simple text. Even if the same text is first edited in one UA and then another, there is generally no problem if the UAs behave somewhat differently (with the exception of line endings in some scenarios, but those can be fixed with a single line of JavaScript).

Because JS editors are such large programs, they also have plugins that provide auto-completion [1] tailored to a specific type of content. It seems like it would be difficult to create a one-size-fits-all model that is UA specific to replace all of these.

[1] For example https://ckeditor.com/cke4/addon/autocomplete or https://github.com/curvenote/editor/tree/main/packages/prosemirror-autocomplete

I would expect autocomplete to only support plain text content, perhaps that can be made explicit in the spec. Thus, this would work similar to the user manually replacing/inserting that same content via text input methods (ex. typing, composition), which wouldn't be site breaking.

johanneswilm commented 1 year ago

Yes, preserving rich text does seem quite tricky to get right and unclear how useful it would be. Do you have scenarios in mind where this might be desirable?

Looking at the kind of autocomplete existing richtext editors based on contenteditable do, they would for example put tags around a specific term that was inserted through auto-completion to give it a different color or style. The code editor mentioned in your explainer [1] would likely need to do that if it is supposed to work like other web-based code editors.

I would expect autocomplete to only support plain text content, perhaps that can be made explicit in the spec.

Ok that would remove one potential issue.

But if I understand you correctly, it would be up to the UA whether to replace the entire contents or just add something new, right? So the code editor in your example could work in Safari on Mac in a way where it would just suggest an entire code snippet to the user and then replace everything else in there, whereas in Edge on Windows it may give suggestions for specific terms to be used within the code editor? If that is the case, who would opt for using this feature that only sometimes works as required for some users rather than use one of the existing JavaScript code editors with an existing auto-complete plugin that works all the time and everywhere?

I'm thinking maybe the usecase for this is something else, such as an address input on a simple form field where the user wants to use a contenteditable element instead of a textarea for some reason - maybe because there are situations where there could be richtext in the address? That then carries with it the issues mentioned by @battre above.

[1] https://github.com/MicrosoftEdge/MSEdgeExplainers/blob/main/AutocompleteContentEditable/explainer.md

sanketj commented 1 year ago

But if I understand you correctly, it would be up to the UA whether to replace the entire contents or just add something new, right? So the code editor in your example could work in Safari on Mac in a way where it would just suggest an entire code snippet to the user and then replace everything else in there, whereas in Edge on Windows it may give suggestions for specific terms to be used within the code editor? If that is the case, who would opt for using this feature that only sometimes works as required for some users rather than use one of the existing JavaScript code editors with an existing auto-complete plugin that works all the time and everywhere?

I'm thinking maybe the usecase for this is something else, such as an address input on a simple form field where the user wants to use a contenteditable element instead of a textarea for some reason - maybe because there are situations where there could be richtext in the address? That then carries with it the issues mentioned by @battre above.

[1] https://github.com/MicrosoftEdge/MSEdgeExplainers/blob/main/AutocompleteContentEditable/explainer.md

The scenarios for autocomplete on contenteditable are primarily about writing assistance, not form fill. I've updated the use cases section in the explainer, hopefully that helps. For the use cases I can imagine, they all seem to be about inserting content where the user is typing (replacing the user's selected text if necessary). My reasoning for leaving the decision about how to insert content into the DOM up to the UA is that autocomplete is about browser-powered functionality and it is unclear what browsers might come up with in the future. The intent is that regardless of what types of writing assistance UAs add, authors should be able to control it with the autocomplete attribute. The existing spec also does not prescribe how exactly autofill should insert content into the DOM, it just mentions that UAs must act "as if the user had modified the control's data". This allows a wide range of use cases to be supported.
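To illustrate the plain-text case described above (insert where the user is typing, replacing the selected text if there is any), here is a minimal sketch. The `TextState` shape and the `applySuggestion` name are hypothetical; the sketch models the content as a flat string and does not touch the DOM, so it says nothing about how a UA would actually update a contenteditable element.

```typescript
// Sketch only: plain text plus a selection expressed as string offsets.
interface TextState {
  text: string;
  selectionStart: number;
  selectionEnd: number;
}

// Insert the suggestion at the caret, or replace the selected range
// if the selection is not collapsed. The caret ends up after the
// inserted text, as it would after typing.
function applySuggestion(state: TextState, suggestion: string): TextState {
  const start = Math.min(state.selectionStart, state.selectionEnd);
  const end = Math.max(state.selectionStart, state.selectionEnd);
  const text = state.text.slice(0, start) + suggestion + state.text.slice(end);
  const caret = start + suggestion.length;
  return { text, selectionStart: caret, selectionEnd: caret };
}
```

With a collapsed selection this behaves like typing the suggestion; with a non-collapsed selection it behaves like typing over the selected text.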

johanneswilm commented 1 year ago

The scenarios for autocomplete on contenteditable are primarily about writing assistance, not form fill.

Ok, that makes more sense. So it looks like you are planning for a scenario where a UA or an operating system is providing something like text completion using a large language model (LLM), either by directly completing the text or by figuring out that this would be an appropriate place to insert the user's phone number, credit card number or similar and then offering that as something to easily fill in.

I can see how it would make sense to signal to either the UA or browser extensions (like Grammarly and new similarly AI-based offerings) that this is a field where such assistance would be desired or it should not be offered. This kind of assistance is qualitatively different from spell-checking, so you need to have two distinct keywords.

I wonder then, given that this usage is quite different from the form-filling help that the autocomplete attribute offers, whether it would not make sense to use a different term to make it less confusing. Maybe something like textcompletion?

I also think it should be made clear which input type (for the beforeinput and input events) will be used for this. There is one called insertReplacementText and the usage is described as "replace existing text by means of a spell checker, auto-correct or similar". I can see from the name of it that it was initially meant to be used for spell- and grammar checkers, but it would still seem like the most appropriate one. Else, maybe we need to add another type to the chart.

Under these circumstances, would it not also make sense to add this to EditContext in parallel? I have seen your notes on that, but the use cases you list, such as a Facebook editor, already use a sophisticated and highly complex contenteditable-based editor that will also possibly be replaced by EditContext once shipped.

johanneswilm commented 1 year ago

From the explainer:

Many sophisticated editors that could benefit from the EditContext API also integrate their own writing assistance features and thus may opt out of browser-powered autocompletion (ex. Google Docs, Word Online). Therefore, it is unclear whether supporting the autocomplete attribute on EditContext editable hosts will be useful.

Most production level JS richtext editors on the web are quite sophisticated and will consist of thousands of lines of code and have 5-20 years of development behind them. However, a lot of these can be run completely in JavaScript (open source libraries such as CKEditor, ProseMirror, TinyMCE, etc.). And most of the more robust ones already do what EditContext promises in that they diff the DOM after browser-initiated DOM changes and then potentially roll back some of those. Switching to EditContext will in many such cases mean a simplification of the code as one can skip diffing and rolling DOM changes back. So if and when EditContext actually ships everywhere, I would think that a lot of these libraries will eventually switch to it.

However, hosting an LLM on a server is a bit more complicated than serving a JS-based editor on a website. I would therefore think it makes a lot of sense to add both spell checking and this new feature also to EditContext. That would also be consistent with other decisions you made, such as adding the use of the native selection as an option to EditContext even though Google Docs and other larger online word processors don't make use of it, precisely because it is also meant to be useful for smaller sites.

johanneswilm commented 1 year ago

I have proposed to add this to the agenda of the Web Editing Working Group at TPAC.

mfreed7 commented 1 year ago

There are two parts to this proposal, and one is less documented than the other. The first is a proposal to change autocomplete to be a global attribute that can be used on any element type. I understand that part. The second, which is not documented in the explainer (unless I missed it?), is about what values the autocomplete attribute may have when it is used on a non-form field. It sounds like the intention for now is simply to allow autocomplete=off to disable UA behaviors. Is that correct? Or are you proposing to allow all of the existing autocomplete values (e.g. autocomplete="street-address")? Or are you proposing new values entirely (e.g. autocomplete=suggestions)?

sanketj commented 1 year ago

I wonder then, given that this usage is quite different from the form-filling help that the autocomplete attribute offers, whether it would not make sense to use a different term to make it less confusing. Maybe something like textcompletion?

This is one of the alternative solutions mentioned in the explainer: https://github.com/MicrosoftEdge/MSEdgeExplainers/blob/main/AutocompleteContentEditable/explainer.md#text-prediction-attribute. The main downside with something like textprediction or textcompletion is that it may not cover future use cases. Ex (completely hypothetical): A browser may decide to ship an in-built meme generator in the future. In this case, the autofill suggestions would be images instead of text. In fact, even the input could be more than just text as well, if the UA wanted to allow the user's own images to be turned into memes. This proposal should be easily extensible to cover such future scenarios, and we wouldn't want to create a new attribute each time.

The autocomplete attribute feels like a good fit because it is already used today for "UA-driven autofill". In addition, there is an established pattern with details tokens for authors to hint to the UA on the type of autofill that is desired. So, if needed in the future, it would be easy to support something like autocomplete='text-suggestions' or autocomplete='image-suggestions'.

sanketj commented 1 year ago

I also think it should be made clear which input type (for the beforeinput and input events) will be used for this. There is one called insertReplacementText and the usage is described as "replace existing text by means of a spell checker, auto-correct or similar". I can see from the name of it that it was initially meant to be used for spell- and grammar checkers, but it would still seem like the most appropriate one. Else, maybe we need to add another type to the chart.

Yes, I agree. Perhaps the insertText input type[1] should be used if it is purely an insertion scenario, such as these use cases, and insertReplacementText should be used if any content (ex. the user's selection) is also being replaced?

[1] https://www.w3.org/TR/input-events-1/#interface-InputEvent-Attributes
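The split suggested above could be sketched as follows. This reflects only the tentative idea floated in this comment, not settled behavior; the function name `inputTypeForSuggestion` is hypothetical, and the selection is modeled as two plain offsets rather than a DOM Range.

```typescript
// The two existing inputType values under discussion.
type SuggestionInputType = "insertText" | "insertReplacementText";

// Sketch only: a collapsed selection means pure insertion; a
// non-collapsed selection means existing content is being replaced.
function inputTypeForSuggestion(
  selectionStart: number,
  selectionEnd: number
): SuggestionInputType {
  return selectionStart === selectionEnd
    ? "insertText"
    : "insertReplacementText";
}
```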

sanketj commented 1 year ago

From the explainer:

Many sophisticated editors that could benefit from the EditContext API also integrate their own writing assistance features and thus may opt out of browser-powered autocompletion (ex. Google Docs, Word Online). Therefore, it is unclear whether supporting the autocomplete attribute on EditContext editable hosts will be useful.

Most production level JS richtext editors on the web are quite sophisticated and will consist of thousands of lines of code and have 5-20 years of development behind them. However, a lot of these can be run completely in JavaScript (open source libraries such as CKEditor, ProseMirror, TinyMCE, etc.). And most of the more robust ones already do what EditContext promises in that they diff the DOM after browser-initiated DOM changes and then potentially roll back some of those. Switching to EditContext will in many such cases mean a simplification of the code as one can skip diffing and rolling DOM changes back. So if and when EditContext actually ships everywhere, I would think that a lot of these libraries will eventually switch to it.

However, hosting an LLM on a server is a bit more complicated than serving a JS-based editor on a website. I would therefore think it makes a lot of sense to add both spell checking and this new feature also to EditContext. That would also be consistent with other decisions you made, such as adding the use of the native selection as an option to EditContext even though Google Docs and other larger online word processors don't make use of it, precisely because it is also meant to be useful for smaller sites.

This is good feedback, thanks. I do agree that spellcheck, and with this proposal autocomplete, are potential gaps in the EditContext API. I'll follow up and share additional details on our thinking here. cc: @dandclark @alexkeng

sanketj commented 1 year ago

The second, which is not documented in the explainer (unless I missed it?), is about what values the autocomplete attribute may have when it is used on a non-form field. It sounds like the intention for now is simply to allow autocomplete=off to disable UA behaviors. Is that correct? Or are you proposing to allow all of the existing autocomplete values (e.g. autocomplete="street-address")?

UAs should continue to respect autofill details tokens even in non-form fill scenarios since these are ways for authors to filter to the type of autofill that is desired. So if the author used autocomplete='street-address', the UA should only provide address suggestions. Similarly, as variations of this example, the UA would suggest phone numbers only if the author set autocomplete='tel' or emails only if the author set autocomplete='email'.
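The filtering behavior described above might look roughly like this. Everything here is an illustrative assumption: the `Suggestion` shape, the `suggestionsFor` name, the restriction to three field names, and the choice to return all candidates when no recognized field name is present. A real UA would follow the spec's full attribute-processing model.

```typescript
// Sketch only: the three field names used as examples in this comment.
type FieldKind = "street-address" | "tel" | "email";

interface Suggestion {
  kind: FieldKind;
  value: string;
}

const KNOWN_KINDS: FieldKind[] = ["street-address", "tel", "email"];

// Filter candidate suggestions by the author's autocomplete value.
function suggestionsFor(autocomplete: string, candidates: Suggestion[]): Suggestion[] {
  const tokens = autocomplete
    .toLowerCase()
    .split(/\s+/)
    .filter((token) => token.length > 0);
  // autocomplete=off disables suggestions entirely.
  if (tokens.indexOf("off") !== -1) {
    return [];
  }
  const wanted = KNOWN_KINDS.filter((kind) => tokens.indexOf(kind) !== -1);
  // Assumption: with no recognized field name, nothing is filtered out.
  if (wanted.length === 0) {
    return candidates;
  }
  return candidates.filter((candidate) => wanted.indexOf(candidate.kind) !== -1);
}
```

For example, with autocomplete='tel' only the phone-number candidates would be returned, and with autocomplete='off' none would.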

Or are you proposing new values entirely (e.g. autocomplete=suggestions)?

New values are not currently being proposed, but that would be how I see this evolving (see this related comment). Since writing assistance scenarios like text predictions are not supported with form fill, we will need new tokens for authors to hint about these new types of autofill.

On the other hand, it might be good practice to add a new token whenever a browser introduces a new type of autofill, in which case perhaps a token like suggestions or text-suggestions should be introduced for these use cases. Curious to hear other perspectives on this.

johanneswilm commented 1 year ago

The main downside with something like textprediction or textcompletion is that it may not cover future use cases. Ex (completely hypothetical): A browser may decide to ship an in-built meme generator in the future. In this case, the autofill suggestions would be images instead of text. In fact, even the input could be more than just text as well, if the UA wanted to allow the user's own images to be turned into memes. This proposal should be easily extensible to cover such future scenarios, and we wouldn't want to create a new attribute each time.

The issue you are having here is with terms that include the term "text", right? How about picking a term that does not include "text"?

So this will initially be plaintext and then in the future could also produce other markup. Images will be added inline then? Or how do you communicate to the editor that there is an image? If it can contain media, then maybe a way to do it would be to make it a type of paste with a DataTransfer object.

The autocomplete attribute feels like a good fit because it is already used today for "UA-driven autofill". In addition, there is an established pattern with details tokens for authors to hint to the UA on the type of autofill that is desired. So, if needed in the future, it would be easy to support something like autocomplete='text-suggestions' or autocomplete='image-suggestions'.

Maybe I don't fully understand, but it sounds to me like a very different feature than what autocomplete is today, such as:

Current autocomplete:

Is tied to form input.
Only works for simple input (plaintext/select).
Will replace the entire value of the element.

The autocomplete function mentioned in this proposal:

Is not related to form input.
Is meant to be used for complex richtext content.
Will replace parts of or add to the already existing value/contents of the element.

Is that correctly understood? So while you may want to specify details on where the input comes from, none of the existing detail tokens will be useful for what you are trying to achieve, correct? And how will you specify, given the current syntax, that the autocomplete is to provide both image and text suggestions? And maybe you additionally need to specify that this contenteditable field is to use "casual college student style" (or some such thing) to give the LLM a better idea about what kind of text it is to produce?

Perhaps the insertText input type[1] should be used if it is purely an insertion scenario, such as these use cases, and insertReplacementText should be used if any content (ex. the user's selection) is also being replaced?

Don't worry about the name of types. The important part is the situations they are to be used in according to the specification. "insertText" is defined as to be used for "insert typed plain text". Given that this isn't text that is being typed, that does not seem like the right one. "insertReplacementText" is to be used for every type of text that originates from "a spell checker, auto-correct or similar". All the types can be used for both replacing existing content and for inserting entirely new content. The point of using the different types is to let the JS editor app know where the text comes from so that it can react differently to it.

For example, a school writing app may allow aid from LLMs, but requires the purely LLM-generated text to be marked in some way. College professors in some places are currently making such requirements, but it's very difficult for students given the current tools to keep track of the parts that are AI-generated. This sort of marking requirement might even become part of legislation in some places with new AI legislation.

That being said, when we wrote this, we had not anticipated that "a spell checker, auto-correct or similar" would create entirely new text without replacing something else, which is why the term "insertReplacementText" was chosen. Solutions for this could be to either add a new term for this kind of content (for example "insertFromGenerator" to also accommodate future options of inserting other types of content) or to simply change the description of "insertReplacementText" to clarify that it is also to be used when adding new text without replacing existing content.

[1] https://www.w3.org/TR/input-events-1/#interface-InputEvent-Attributes
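To illustrate how an editor app could react differently depending on where text comes from, here is a minimal sketch. The inputType strings are the ones discussed above; the category labels, the helper names, and the log structure are assumptions made up for this example. Whether the inserted text arrives in data or in dataTransfer can depend on the target, so the sketch checks both.

```javascript
// Category labels ("typed", "assisted", "other") are invented for this sketch;
// the inputType strings come from the Input Events spec.
function classifyInputSource(inputType) {
  if (inputType === "insertText") return "typed";
  if (inputType === "insertReplacementText") return "assisted";
  return "other";
}

// Read the inserted text from either `data` or `dataTransfer`, whichever the
// event carries.
function insertedText(event) {
  if (typeof event.data === "string") return event.data;
  if (event.dataTransfer) return event.dataTransfer.getData("text/plain");
  return "";
}

// Hypothetical bookkeeping: append one entry per insertion so the app can
// later mark the UA-assisted ranges.
function recordInsertion(log, event) {
  log.push({
    source: classifyInputSource(event.inputType),
    text: insertedText(event),
  });
  return log;
}

// In a page this might be wired up as:
//   editor.addEventListener("beforeinput", (e) => recordInsertion(log, e));
```

A school writing app of the kind described above could then highlight every logged entry whose source is "assisted".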

sanketj commented 1 year ago

Maybe I don't fully understand, but it sounds to me like a very different feature than what autocomplete is today, such as:

Current autocomplete:

Is tied to form input.

Only works for simple input (plaintext/select).

Will replace the entire value of the element.

The autocomplete function mentioned in this proposal:

Is not related to form input.

Is meant to be used for complex richtext content.

Will replace parts of or add to the already existing value/contents of the element.

Is that correctly understood? So while you may want to specify details on where the input comes from, none of the existing detail tokens will be useful for what you are trying to achieve, correct? And how will you specify, given the current syntax, that the autocomplete is to provide both image and text suggestions? And maybe you additionally need to specify that this contenteditable field is to use "casual college student style" (or some such thing) to give the LLM a better idea about what kind of text it is to produce?

I agree that these differences are substantial and autocomplete would work quite differently in a form control than in an editable region. The main appeal to use autocomplete is that there is an existing mechanism for details tokens that could be re-used, and the name fits reasonably well. That said, I see limited precedent for re-using attributes in this way, so I'm open to introducing a new attribute instead if we believe that's better. Curious to hear other perspectives on this.

sanketj commented 1 year ago

Don't worry about the name of types. The important part is the situations they are to be used in according to the specification. "insertText" is defined as to be used for "insert typed plain text". Given that this isn't text that is being typed, that does not seem like the right one. "insertReplacementText" is to be used for every type of text that originates from "a spell checker, auto-correct or similar". All the types can be used for both replacing existing content and for inserting entirely new content. The point of using the different types is to let the JS editor app know where the text comes from so that it can react differently to it.

For example, a school writing app may allow aid from LLMs, but requires the purely LLM-generated text to be marked in some way. College professors in some places are currently making such requirements, but it's very difficult for students given the current tools to keep track of the parts that are AI-generated. This sort of marking requirement might even become part of legislation in some places with new AI legislation.

That being said, when we wrote this, we had not anticipated that "a spell checker, auto-correct or similar" would create entirely new text without replacing something else, which is why the term "insertReplacementText" was chosen. Solutions for this could be to either add a new term for this kind of content (for example "insertFromGenerator" to also accommodate future options of inserting other types of content) or to simply change the description of "insertReplacementText" to clarify that it is also to be used when adding new text without replacing existing content.

[1] https://www.w3.org/TR/input-events-1/#interface-InputEvent-Attributes

Thanks for the clarification on insertText vs. insertReplacementText. Updating the description for insertReplacementText and using that for the input type sounds reasonable to me.

dandclark commented 1 year ago

This is good feedback, thanks. I do agree that spellcheck, and with this proposal autocomplete, are potential gaps in the EditContext API. I'll follow up and share additional details on our thinking here. cc: @dandclark @alexkeng

Rather than adding a parallel version of spellcheck, autocomplete, etc. to EditContext, I think EditContext can just use the value of these attributes that are set on the element currently being edited in the EditContext-associated subtree. In other words the attributes would work the same way as with contenteditable. This is simpler since developers can continue using the attributes in the way they're used to, and we don't need to make changes to the EditContext API whenever a change is made on the corresponding HTMLElement attribute (like adding a new details token).

If we were to add these to EditContext then we can also have some confusing contradictions. For example should spellcheck be enabled in this case?

let editContext = new EditContext();
let div = document.createElement("div");
div.editContext = editContext;
editContext.spellcheck = true;
div.spellcheck = false;
document.body.appendChild(div);

Using only the value on the div avoids such contradictions.

The same reasoning applies to certain other global attributes like enterKeyHint and inputMode.

sanketj commented 1 year ago

Summarizing points of feedback from above to support discussion at TPAC:

Extending the autocomplete attribute to editing hosts vs. creating a new attribute
Supporting this attribute on EditContext
Supporting additional attribute values for writing assistance scenarios
Type of input events that should be fired (insertText vs. insertReplacementText vs. something else)

sanketj commented 1 year ago

Discussion from Editing WG TPAC meeting:

Sanket: Edge has a use case where we want to provide browser-provided writing assistance, similar to the behavior that Grammarly and sites provide themselves. So we need a way of turning it off if the web author wants to add something custom. Autocomplete is currently used in forms, and we need something similar. We need something that can show which kind of autocomplete we want.
Simon: I think it could make sense to add form-fill to contenteditable. It seems like this would be a different thing. It would confuse web authors. Could also apply to input elements/textarea.
Ryosuke: So two features?
Simon: Yes
Johannes: The fields in a form are just that: specific fields that contain specific information. Here it contains many more fields.
Sanket: Could enable multiple and specify what should go in there as well as have an on/off. We can take another name.
Ryosuke: We have autocapitalize and autocorrect that will fix small spelling errors.
Johannes/Sanket: The event should work the same way for EditContext as for contenteditable.
Sanket: insertText or insertReplacementText?
Johannes: insertReplacementText or a new value that says it comes from generative AI.
Dan/Sanket: insertReplacementText seems like the right value as it’s quite similar.
Simon: On Android? Will it be composition or insertReplacementText?
Johannes: What if you want to insert images in the future?
Sanket: If we use more than text, then we probably need a new input type.
Dan: If we only had composition, we could go with that. But insertReplacementText seems closer semantically.
RESOLUTION: We use insertReplacementText as input type
RESOLUTION: It should work the same on EditContext editing hosts as spellcheck works.
Dan: How about “autosuggestions”?
Johannes: Why “auto”? “Auto” seems to be a prefix for things that happen automatically.
Dan: Good points, maybe “suggestions”?
Johannes: “input-suggestions”
Dan: Sounds good
RESOLUTION: We bikeshed offline.
Examples: “input-suggestions”, “writing-suggestions”, “autosuggest” with values like “on” and “off”
Simon: on/off or true/false …
Ryosuke: autocorrect/autocapitalize use on/off. That’s status quo. If we want to be consistent, then let’s do that.
Simon: Value in consistency.

sanketj commented 1 year ago

The recommendation from the Editing WG meeting is to create a new attribute, with on or off values. Some options for names for the new attribute are:

autosuggest
inputsuggestions
writingsuggestions

Any preferences? cc: @domenic @annevk

johanneswilm commented 1 year ago

TPAC 2023: RESOLUTION: We use insertReplacementText as input type RESOLUTION: It should work the same on EditContext editing hosts as spellcheck works. Name: see above, autocomplete seems confusing.

marcoscaceres commented 11 months ago

@sanketj, please ping @annevk and me when you have a PR.

marcoscaceres commented 11 months ago

Some other suggestions:

textprediction
typingassist
textsuggestions

sanketj commented 11 months ago

Per resolution in #9966, Microsoft can draft up a spec PR for the new attribute. Unless there are strong objections, I plan to start with writingsuggestions as the name and on/off as values. Happy to continue discussions on the final name.

sanketj commented 10 months ago

Thanks @mfreed7, @marcoscaceres, @domenic for the feedback on #10018. Calling out a few points about the new attribute that came out of that, which might be worth discussing in more detail during the next WHATNOT meeting.

true (enabled) by default on all elements
inheritance works across shadow boundaries
string type for getter/setter

sanketj commented 10 months ago

Per discussion on the above points during today's WHATNOT call, the only suggested change to #10018 was to align the inheritance behavior with spellcheck, and address inheritance across shadow boundaries for both attributes (and possibly others) separately. I've updated that PR accordingly. Please let me know if there's additional feedback on that one.
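For readers following along, a minimal model of the behavior being discussed: enabled by default, with an element picking up the nearest ancestor's explicit setting, as spellcheck does. This is only a sketch under assumptions. The "true"/"false"/empty-string keywords, the treatment of unrecognized values as unspecified, and the plain-object node shape are all choices made for this example and not taken from the PR text; shadow boundaries are not modelled.

```javascript
// Hypothetical node shape: { attrs: { writingsuggestions?: string }, parent }.
function writingSuggestionsEnabled(node) {
  for (let n = node; n; n = n.parent) {
    const raw = n.attrs ? n.attrs.writingsuggestions : undefined;
    if (raw === undefined) continue; // not specified here, ask the parent
    const value = String(raw).toLowerCase();
    if (value === "false") return false;
    if (value === "true" || value === "") return true;
    // Assumption: any other value is treated as unspecified and inherits.
  }
  // No ancestor set the attribute: enabled by default.
  return true;
}
```

Under this model an element inside a writingsuggestions="false" container is disabled unless it, or a closer ancestor, sets the attribute back to "true".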

zcorpan commented 9 months ago

This should define interaction with field-sizing CSS property, if writing suggestions can appear inline and contain potentially privacy-sensitive information. See https://github.com/whatwg/html/pull/9903#discussion_r1475378234

annevk commented 9 months ago

@sanketj Why does it work for type=email but not type=telephone?

dandclark commented 9 months ago

@sanketj Why does it work for type=email but not type=telephone?

I guess the principle of the current list is "types that expect character input that's not numbers-only, excepting password".

dandclark commented 9 months ago

This should define interaction with field-sizing CSS property, if writing suggestions can appear inline and contain potentially privacy-sensitive information. See #9903 (comment)

@zcorpan

The scope of this new attribute just grants authors the ability to turn off UAs' writing suggestions capabilities. An attempt hasn't been made to standardize the details of those capabilities, and they could vary widely. So while I agree that the interaction of field-sizing with UA writing suggestions and autofill needs to be worked out, it's not really in the purview of the writingsuggestions toggle attribute defined here.

Maybe worth a new issue?

annevk commented 9 months ago

@dandclark it seems weird to me to include email but not telephone. If this feature can suggest email addresses, surely it can suggest telephone numbers as well.

dandclark commented 9 months ago

@annevk, @sanketj pointed out to me that the list of supported elements was originally taken from the element types listed for the spellcheck attribute under "User agents must only consider the following pieces of text as checkable for the purposes of this feature".

I think that list is a reasonable starting point, and given the current design in https://github.com/whatwg/html/pull/10018 it could be expanded without breaking backwards compatibility, but I don't have any particular objection to adding "telephone".
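As an illustration of why expanding the list is backwards compatible, here is a small sketch. The starting set assumes the spellcheck-derived input types are text, search, url, and email; the function name is made up, and "tel" is used as the HTML type keyword for what the thread calls "telephone".

```javascript
// Assumed starting list, mirroring the input types spellcheck considers.
const ELIGIBLE_INPUT_TYPES = new Set(["text", "search", "url", "email"]);

// Hypothetical check: does an <input> of this type get writing suggestions?
function acceptsWritingSuggestions(type, eligible = ELIGIBLE_INPUT_TYPES) {
  return eligible.has(String(type).toLowerCase());
}

// Expanding the list only adds types; everything eligible before stays eligible.
const EXPANDED_INPUT_TYPES = new Set([...ELIGIBLE_INPUT_TYPES, "tel"]);
```

Here acceptsWritingSuggestions("tel") is false with the starting list and true once EXPANDED_INPUT_TYPES is passed as the second argument, while the result for "email" does not change.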

whatwg / html

New attribute to control UA-provided writing assistance #9065