Closed scofalik closed 4 years ago
TBH, I think that we're not going to be able to work on schema in iteration 2. We don't know enough yet and we're not going to have that much time.
But it should be fine to work on it later, because it doesn't affect the code thaaat much. Ofc, each feature and many tests define the schema but it's not critical for feature development, unless we face some issues. But this will be the exact moment when we should start thinking on the schema.
PS. So far the schema isn't even used by the existing features for things other than registering elements.
Multiple features/parts of code uses Schema
directly or indirectly:
view-converter-builder
uses Schema
to filter-out incorrect DOM/View structure. This will be used by many features, though it is defined just in one place.Schema
to check if text can be placed at given position.LiveSelection
checks Schema
to see if selection can be placed at given position.AttributeCommand
uses Schema
to filter out part of the range on which attribute cannot be applied.link
and image
features will be probably first features that we will work on after iteration 2 and 3 so we will need working Schema
pretty soon. Since iteration 2 is about bringing stable API for smooth feature developement, I thought it is a good moment to start working on it. I leave decision up to you, but we will have to either do it in iteration 2 or at the beginning of iteration 3. I'd see how iteration 2 will go and decide at the end of it whether it's ok to include Schema
refactor in it.
BTW. Other features also probably should use Schema
, i.e. to, at least, disable itself if selection is in "incorrect" position, etc. The thing is that we forgot about it or didn't care at the moment. So we could work more globally on schema in upcoming iterations - first we should improve Schema
itself and then look at the features and decide what is missing. The problem is that the later we will do it, the more tech debt we will have.
It's not said that we're not going to refactor any code after iteration 2. Until 1.0 we're going to rewrite some pieces many times still :). Schema can be one of them.
I agree that it's an API used in many places. That makes this a harder decision. But I'd postpone any bigger refactoring of it until we have more use cases in hands. We don't know much more now about what features will need to define than we knew few months ago when we implemented it initially. So there's a high chance that we'd have to repeat refactoring in a couple of months.
Let's maintain the concept that we've got right now for couple next iterations and let's get back to it with better image of what we need. This means that if link or image features will miss some features in the current implementation we shouldn't immediately try to refactor the whole code.
I agree with @Reinmar. We will not freeze the API after iteration 2. Also, I do not think we have a clear understanding what schema features we need yet. We can introduce them as soon as we will need them.
Another case to keep in mind: https://github.com/ckeditor/ckeditor5-headings/issues/27#issuecomment-245559218.
Two new things to remember:
<image>
or other widgets, are atomic objects which should be handled differently than other blocks.check( { name: ... } )
with a node I need to convert it to a name and this requires additional logic if the node can be a text.Another thing – SchemaPath
is confusing. You can pass there an element, but it must be wrapped in an array. But the same isn't true for a position... So, when checking if foo is allowed inside some element I must wrap it myself.
And I found one more thing. Perhaps it's my mistake but if I don't get it then there's something wrong with the API:
this.schema.check( { name: 'paragraph', inside: [ 'paragraph' ] } ); // -> false
this.schema.check( { name: 'paragraph', inside: position } ); // -> true
position.parent.name; // -> 'paragraph'
Another thing. I was wondering why list items are filtered out. It turned out that checking whether a specific element can be inside some context is tricky because this isn't enough:
schema.check( { name: el.name, inside: ctx } );
You also need to pass this element's attributes to the check()
function.
Poor Schema
. It must feel really bad and low reading all your negative feedback.
Especially, taken that it knows that it's here only temporary :P
One more thing:
const attributes = ( node instanceof Element ) ? node.getAttributes() : undefined;
return this.schema.check( { name: this._getNodeSchemaName( node ), attributes, inside: path } );
And I've got e.g. 10 tests failing.
const attributes = ( node instanceof Element ) ? node.getAttributes() : null;
return this.schema.check( { name: this._getNodeSchemaName( node ), attributes, inside: path } );
And I've got 20 tests failing.
Very curious :D
I'm now working on https://github.com/ckeditor/ckeditor5-engine/issues/732 because it turned to be a necessary improvement for autoparagraphing feature. After strengthing the schema check algorithm I need to fix now dozens of tests which (usually mistakenly) relied on weak schema checks.
But this made me think that many of these tests are not interested in testing the schema anyway. Many should, like the entire delete content, insert content, data loading, parsing, stringifying, etc. What about modify selection for instance? Should it configure schema or can we introduce "allow all" for it? First I thought that it doesn't matter in this case. But then, how reliable the tests will be if we mock the schema ? Then the text will be allowed in every context, making even such algorithms like this vulnerable (modify selection may need to check whether where it wants to move the caret is a correct text position). So, after giving it a second thought I came to a conclusion that we should not use "allow all" in our tests.
We'll need, however, to introduce some more useful syntax for configuring schema, because now you need to call registerItem()
and allow()
for every element you introduce and often more than once.
So, after giving it a second thought I came to a conclusion that we should not use "allow all" in our tests.
Agree. We could have really unit tests wich check only a single feature, but I think that more real scenarios are more useful (as long tests are not too complex and not too hard for the debugging).
At the same time note that we will need something like "allow all", for instance, the collaboration server may need such feature.
I copy paste from https://github.com/ckeditor/ckeditor5-engine/pull/733/files#diff-f6cdfa67127a7d5095832bc5a9b5dec0R572
Okay it took me a while to analyze this. At first sight it seemed wrong to me that any path can be shorter. For checkPath
= C D E
both allowedPath
= A B C D E
and allowedPath
= D E
are right. This rises a question whether allowedPath
should be matched in whole? In first allowedPath
it seems that somebody explicitly want to set that E
is allowed in A B C D
, that is that one of ancestors is A
.
OTOH, this will not be a problem if in checkPath
we will always pass a "full path", that is a path that has $root
as left-most element. This would "solve" the "issue" described above. Consider again:
checkPath
= $root C D E
, allowedPath
= A B C D E
and allowedPath
= $root A B C D E
.
Now, check will fail for both allowedPath
s (which allows us to use shorter and simpler form without $root
).
And now back to the original problem. If we suppose that checkPath
always has $root
as first element, a check should fail for every checkPath
shorter than allowedPath
. That is because there will be a moment when we will check $root
from checkPath
against non-$root
from allowedPath
.
So there are a few solutions:
checkPath
has to be full. Drawback: it's prone to generate bugs. OTOH, I expect that in most cases, inside will be specified as model.Position
.$root
as first element to checkPath
if first element is not $root
. Drawback: this may backfire if, for a reason, really want to check a path that starts from something else than $root
.false
if checkPath
is shorter than allowedPath
. This feels natural. It is impossible to match every item from registered allowedPath
which require, that the last item in allowedPath
is exactly in those parents. Drawbacks: IDK?I've been thinking about the message quote feature today and couple of related things – like retrieving selected blocks.
This made me think about the schema. In order to identify block-like elements (paragraphs, list items, headings, etc.) we need to mark them in the schema. Currently, they all inherit from the $block
type, but this isn't very convenient. If you'd like to introduce a new element which can't contain the same elements as $block
or can be allowed in different elements.
So, we should separate the structure from element types. We can have default elements like $block
, but they need to be marked to be of a block
type.
I've been also thinking what kind of elements we need. I can list at least these (but I would keep this list open):
paragraph
, listItem
,caption
, tableCell
,image
, other widgets.An element can be of many types at the same time. So, e.g., you'll have block limits (as in CKE4's blockLimit
). You may have inline nested editables (which will be inline+limit), etc.
Finally, I need the "get selected block elements" function and I wonder where to put it. It needs to accept a selection and know about the schema. It coooould be the schema's method perhaps or selection's or one of the controller's, but maybe you have better ideas?
Special question to @scofalik – where's list feature's "get selected blocks" logic :D?
And the same question to @szymonkups – where's the same piece of logic in the heading feature :D
https://github.com/ckeditor/ckeditor5-list/blob/master/src/utils.js#L34
It checks schema, whether given element is a $block
. It's not nice and wasn't meant as final solution, ofc.
I agree with what you've written down. Structure "flags" will be also kept/connected with schema? I mean they will be specified at the same time and place as registering element name to schema?
I agree with what you've written down. Structure "flags" will be also kept/connected with schema? I mean they will be specified at the same time and place as registering element name to schema?
Yes.
Okay, I had a doubt that we will describe one entity in three places (creating ModelElement, creating schema item and describing structure in other place).
Schema attributes make sense for me (it's like meta elements attribute, what fit well to the tree structure of schema).
My only doubt is the inline
attribute. I think we should introduce only these attributes, we are sure, we need now. But, as you said, the list should be open, so we will be able to introduce more in the future.
And the same question to @szymonkups – where's the same piece of logic in the heading feature :D
It will be this: https://github.com/ckeditor/ckeditor5-heading/blob/f0c19177afc4679349f0713a21bd9b8ba5c49378/src/headingcommand.js#L94-L95
And I created an issue to extract this logic: https://github.com/ckeditor/ckeditor5-engine/issues/811. For now, I'll base it on the $block
type checks.
Special question to @scofalik – where's list feature's "get selected blocks" logic :D?
In should go to the Selection
class.
We could think also if it would be possible to store context for block
, inline
, limit
and object
elements. For example: we can define caption
element that is used only inside image
element. So theoretically we would want to define caption
as limit
element only when inside image
.
I'm thinking now how the message quote feature should get elements which can be wrapped with a <messageQuote>
element. I implemented recently the Selection.getSelectedBlocks()
method... to realise that the quoting mechanism differs from heading and lists.
Let's consider this content:
<paragraph>x[x</paragraph>
<image></image>
<paragraph>y]y</paragraph>
The heading and list commands will apply themselves to the two paragraphs but will ignore the image element.
The message quote feature should, however, wrap the image too. This made me think that for the quote feature all the 3 elements are the same kind of content – wrappable blocks.
We have a couple of options how to implement those two scenarios, but to keep this message short I'll only mention the one I like the most.
In the schema, I'd mark the <image>
as a block too so it gets returned by getSelectedBlocks()
. From content editing perspective these are all editable blocks of content. However, since it's also marked as an object, which means that it's a self-contained entity, it should be filtered out by the list and heading features.
I'm going to continue implementing the quote feature with the above assumption. The image cannot be marked as a block now, so it won't be quotable for the time being.
The check()
method's interface doesn't allow us to check the rules easily and in a realistic way. E.g., I just wrote this:
const isMQAllowed = schema.check( {
name: 'messageQuote',
inside: Position.createBefore( firstBlock )
} );
const isBlockAllowed = schema.check( {
name: firstBlock.name,
inside: 'messageQuote'
} );
It's part of a logic checking if a block can be wrapped with a message quote. What's wrong here? That in the second check I'm not checking whether the real block I have can be wrapped with a message quote in the exact content in which it is now. I'm checking only its name and context-less message quote.
Schema was meant to be context-aware and this information must be provided and used by schema checks.
EDIT: Actually, the second check must also include attributes:
const isBlockAllowed = schema.check( {
name: firstBlock.name,
attributes: Array.from( firstBlock.getAttributeKeys() ),
inside: 'messageQuote'
} );
In the schema, I'd mark the
as a block too so it gets returned by getSelectedBlocks(). From content editing perspective these are all editable blocks of content.
Yep, I've just confirmed this in a different way. Currently, the image doesn't extend $block
so it's not allowed in a quote at all (because quote accepts only $block
s inside itself).
PS. To make it clear why we can't quickly make image
inherit from $block
:
getSelectedBlocks()
. image
inherits from $block
, it can also redefine what's allowed inside itself. $block
allows any $inline
and features can extend that list. The image feature must be able to disallow everything except the caption
element.Great news :) Another issue with the Schema: https://github.com/ckeditor/ckeditor5-list/issues/34
It's not possible to allow element attributes on a registered element (which inherits from another element).
I'm going to workaround this in the messagequote feature by allowing the list item inside it explicitely, but this is an ugly hack which we must removed ASAP.
Schema is not flexible enough. ATM, there is a bug in our features:
// Allow link attribute on all inline nodes.
editor.document.schema.allow( { name: '$inline', attributes: 'linkHref' } );
the author just wanted to inform schema that $inline
elements can have linkHref
attribute, but by not providing inside
entry, accidentally $inline
elements with linkHref
are allowed everywhere.
This means that correct solution is:
editor.document.schema.allow( { name: '$inline', attributes: 'linkHref', inside: '$block' } );
But the problem is that this doesn't go well with other features that are expanding $inline
element. If other feature says that it allows $inline
inside myCustomElement
:
editor.document.schema.allow( { name: '$inline', attributes: 'linkHref', inside: 'myCustomElement' } );
But then the $inline
element won't be allowed in myCustomElement
with linkHref
. It means that features need to know about each other which is against our principles.
Schema is not flexible enough. ATM, there is a bug in our features:
// Allow link attribute on all inline nodes. editor.document.schema.allow( { name: '$inline', attributes: 'linkHref' } );
the author just wanted to inform schema that $inline elements can have linkHref attribute, but by not providing inside entry, accidentally $inline elements with linkHref are allowed everywhere.
Schema is not clear enough.
This will be fun. We gathered a huge list of things to consider. Most of them didn't even cross our minds before. Designing the new schema will be really challenging :)
Things got seriously bad ;<
First, we had a problem that we had attrs defined like:
editor.document.schema.allow( { name: '$inline', attributes: 'linkHref' } );
This is incorrect because this allows $inline[linkHref]
everywhere. So we needed to change them to:
editor.document.schema.allow( { name: '$inline', attributes: 'linkHref', inside: '$block' } );
Then, it turned out that this disallows links/bold/italic in image caption becuase caption
doesn't inherit from $block
... So we did this: https://github.com/ckeditor/ckeditor5-image/pull/95/files.
Which... is totally wrong ;| It means that caption
can be used directly in $root
and blockQuote
(which lead to https://github.com/ckeditor/ckeditor5-block-quote/issues/10).
So... we knew that 💩 was deep. Now it also got serious.
I don't see a simple way out of this ATM. I can path block quote so it doesn't apply to captions because blockQuote
is not allowed in image
. However, this is stepping on a thin ice.
I've been thinking a bit about the use case where you want to allow image
where blocks are allowed but you don't want to allow inside things which are allowed in blocks.
Currently, if you inherit from $block
you get both – image
will be allowed in $root
(which is ok), but $inline
will be allowed in image
(which is not ok).
So, perhaps the way to go will be to split these aspects so you can define that image
is allowed where $block
is (it inherits this behaviour from $block
) but you can define what's allowed in image
separately.
This will also allow defining that caption
inherits "allowed content" from $block
but "allowed in" is set to image
only.
Going a step further, we've been also thinking about basing schema API on an object format. From the top of my head:
schema.register( 'image', {
allowedIn: schema.get( '$block' ).allowedIn,
allowedContent() {
return [ 'caption' ];
}
} );
I'm worried, though, how will this API allow runtime extensibility. This is, developers must be able to either redefine the default schema or make slight adjustments to it.
E.g. someone may want to allow paragraphs and other blocks allowed in caption
. Or to disable a certain attribute in captions. Theoretically, object format's advantage here will be that you operate on a fully transparent code, so you could do:
schema.get( 'caption' ).allowedContent = schema.get( '$root' ).allowedContent;
However, you could not do "disallow links in captions" because links are allowed on $inline
and so far this API doesn't allow defining "disallow link attribute on $inline
when in caption
". I'm not sure how it could be done without significantly complicating this concept. One idea which came to my mind some time ago was that when requesting schema, it could pass the query through the entire chain of elements.
So, when requesting whether link
attrs is allowed on $inline
which is in caption
which is in image
and so on, this would be a result of calling:
schema.get( 'image' ).allowedAttributesInContext(
[ 'caption', '$inline' ],
schema.get( 'caption' ).allowedAttributesInContext(
[ '$inline' ],
schema.get( '$inline' ).allowedAttributes()
)
);
Alternatively, context could be passed to schama.get( '$inline' ).allowedAttributes()
. This might be simpler but it would mean that it wouldn't be caption item which controls what's allowed inside it because the extension would be done on a level it knows nothing about ($inline
item).
To make it clear – when checking if foo
is allowed in bar
we'd need to check that both conditions are true:
foo
can be in bar
bar
allows foo
insideI wonder if this won't be too demanding in terms of how many things you need to take care of to define a simple thing. If so, we could introduce small helpers which'd simplify the job. E.g. Schema.allow( 'a' ).inside( 'b' )
which would take care of extending a's allowedIn
and b's allowedContent
.
Another thought – perhaps we can go the simple way for the declarative part of the schema. So, you'd be only able to define that foo
can be in bar
but not that foo
can be in bar > bom
. All kind of chains longer than 2 would need to be handled on the imperative level (by overridable methods, or methods with callbacks).
So, a schema item would have a simple, declarative allowedIn
and allowedContent
and allowedAttributes
(arrays) and getAllowedIn()
, getAllowedContent()
and getAllowedAttributes()
. Default implementations of these methods would just base on the properties, but you could override the method to implement some special rule.
We only need to figure out whether the most common scenarios can be done using the declarative API.
Which... is totally wrong ;| It means that caption can be used directly in $root and blockQuote (which lead to ckeditor/ckeditor5-block-quote#10).
For now, blockQuote
is the only non-root element that allows $block
s inside it right? Maybe for now it would be enough if we would just disallow caption in blockQuote
? I know this stinks because it's coupling two features... Or... I don't know :| You are right that we are in 💩
I've been thinking a bit about the use case where you want to allow image where blocks are allowed but you don't want to allow inside things which are allowed in blocks.
Finally I understand your idea :P. This may be a way to go but makes things complicated. Slowly we have complexity creep where you have to define more and more stuff. A "schema defining object" would have four "properties":
Schema
for the first time, of course I had less experience and no use cases for it, but I was hoping that proper "abstract" elements and inheritance will be enough so we won't need such splitting.To make it clear – when checking if foo is allowed in bar we'd need to check that both conditions are true:
- foo can be in bar
- bar allows foo inside
This seems overcomplicated and I don't really understand why?
Anyway, no matter what will be the final implementation of "schema building blocks", there will have to be some kind of controller that will use all "building blocks". This is why I think that this idea might be good too:
Another thought – perhaps we can go the simple way for the declarative part of the schema. So, you'd be only able to define that foo can be in bar but not that foo can be in bar > bom.
But are we gaining anything in terms of capabilities or is it just a gain in implementation simplicity?
I am sorry that my answer is not really insightful but this is a hard topic and it needs more focus than 15 minutes for response on GitHub. We really have to nail all the use cases we have, all the problems we encountered and how they would be resolved in solution A / B / C. This means that it will be really hard to come up with anything by discussing it on GitHub. Somebody have to gather all the data and start with the schema from scratch.
One more case: allow specific marker in some content/disallow it. For instance, I want to disallow creating comments in empty paragraphs. I may add an attribute with the same name as the marker to the schema and then check if such fake attribute is allowed. It might be the way to go, but it sounds a little like a hack. In general markers & schema is a topic we did not touch yet.
And one more case: https://github.com/ckeditor/ckeditor5/issues/477.
Schema stuff is complicated and we need to think about what tools we want to give to users and how they will use it.
A common problem is that after some changes in model, nodes that previously had attributes now are incorrect. For example, let's say that a heading cannot have a bold text. If you have a heading and then a paragraph with bold text, merging the paragraph into heading (using Backspace for example) should end up in removing the bold.
That was an easy case, though. But we enabled in Schema
so called "attribute groups", that is, the attribute is allowed on node only if the other is also set. After a change in model, if we will check such attributes one by one we will remove them even though the node is still in correct place. We would have to check every combination of attributes.
Then, we not only have to check attributes but also child-parent chains.
We not only have merging / moving but also RenameOperation
.
So, in fact, if something changes in model tree, there is awful lot of things to change. Honestly, I'd remove the possibility to define "attribute groups" because they are seldom used, can be controlled (and fixed) directly by feature and add unnecessary complication to schema-related algorithms.
Remember about attributes stored by DocumentSelection
: https://github.com/ckeditor/ckeditor5-engine/issues/1127
Also, recently we've been discussing whether schema should support required attributes set, that is whether there should be possibility to tell the schema, that given node is "okay" only when it has all of given attributes (for example, image
is fine if it has alt
and src
).
This may make some of algorithms too complicated and we agreed that if that will be the case, we will skip this functionality.
I had similar issue when I defined alignment
attribute on blocks as:
schema.allow( { name: '$block', attribute: 'alignment', inside: '$root' } );
I did this way probably due to misunderstanding of how schema works (to be honest I copied that from other plugin).
Anyway such definition prevented to add blockQuoute
on blocks that had alignment
attribute already set.
The other way around - setting alignment
attribute on blocks inside blockQuote
worked well (or not well depending how this should behave).
The configuration that worked for me was:
schema.allow( { name: '$block', attribute: 'alignment' } );
Another case - still I'm not 100% sure if valid or not. Given previous attribute
allow I'd like to check if adding atrribute
to some block is allowed:
schema.check( {
name: 'listItem',
attributes: [ 'alignment' ]
} );
will fail, wheras:
schema.check( {
name: 'listItem',
attributes: [ ...listItem.getAttributeKeys(), 'alignment' ]
} );
will not. It was counterintuitive for me to add current attributes to check but maybe it makes sense to pass them so schema can check all attributes (ie when some combination of attributes are not allowed). Probably a some kind of helper method for checking attribute on existing item might be handy.
I've read through the entire discussion and all related tickets and I edited the initial post with a compiled list of requirements, ideas and doubts. I'll be now working on figuring out how to deal with them. One of the most important questions I see there is what we can actually do and what needs to be rejected as too complicated.
All check()
use cases from the code (yep, it's sad but GH is buggy and can't display them):
Another bug due to the schema: https://github.com/ckeditor/ckeditor5-alignment/issues/12. It will be fixed as well.
For the future reference – these are notes I took when working on the new schema API. Let's hope that we'll never need to read through all this again :)
While analysing all use cases I recognised three main types of information which we try to retrieve from the schema:
The first group of checks can be derrived from the types of deltas. We also have some additional helpers (like removeDisallowedAttributes()
) and may add more.
Methods implementing these checks should be able to return true/false
and we need to be able to make decisions based on the return values. This means that when you get false
you should know that only one type of issue might've occured. E.g. that this node cannot be a child of its parent and you can't do anything about it. Situations that this child can't be a child of its parent, buuuuut... if you'd remove some attributes then maaaaybe must not occur. This means that checks will need to be limitted to a subset of information which we can retrieve from the tree.
I think that these checks can't care about the exact insertion positions because often we don't know where exactly the node will end up and what siblings it will have. We also don't know this node's children. E.g. during conversion, we need to make decisions about children nodes of some element before they are inserted into the tree and before their chilren are known. The information we can provide are: ancestors list (with their attributes), the node to insert (with its attributes, but without its children).
However, it doesn't also seem to be feasible and reasonable to include node's attributes in this check. Why? Because the checkInsert()
method will simply tell you that a node can't be inserted but not why. So if the decision would depend on an attribute you wouldn't know what to do next. Should I remove some attributes? Should I change their values? Is some attribute missing? Insertion check should only consider ancestors (with their attributes) and name of the element (or text).
So how would attribute validation look like? How to disallow bold in headings?
I think that the process should be two steps:
Note, that if you'd disallow bold in headings now, the current implementation of conversion would work fine because we do exactly what I described above – we first insert $text
(if it's allowed) and apply allowed attributes in the next step. But, insertContent()
would not handle this gracefully if we hadn't had https://github.com/ckeditor/ckeditor5-engine/blob/fc7da8035cdc9533f66ad5de66711e25cd2b385e/src/controller/insertcontent.js#L229-L234. And this piece of code proves even better that structure checks should not care about node-to-insert's attributes.
Example API:
const el = doc.createElement( 'paragraph', { alignment: 'right' } );
schema.checkInsert( position, element );
schema.checkInsert( position, 'paragraph' );
schema.checkInsert( position, '$text' );
schema.checkInsert( position.parent, element );
// Sometimes you may need to check a virtual tree.
const ancestors = [
{ name: '$root' },
{ name: 'blockQuote', attributes: { some: 'one' } }
];
schema.checkInsert( position.getAncestors(), element );
I'm unsure still whether it should be checkInsert()
because it indicates that it's only about insertion and sometimes you may want to check whether an already inserted node is correct. Therefore we can consider checkChild()
or other ideas.
Here we need two "can be inserted" checks and additional cleanup:
Both checks need to be be pefromed before any changes are done to the model, so the second one proves that the "can be inserted" check must be able to work with virtual trees. We could mock how the tree would look like after changes (similarly to what the differ would do) and that would be powerful, but it's an overkill. If we'll limit the "check can be inserted" checks to an array of ancestors (and set a rule that schema's callbacks can't check their siblings) and the element to be inserted, then we don't need such magic.
The problem here is that we also need to take care of attributes. If <paragraph align>
got wrapped with a <blockQuote>
and schema doesn't allow the align
attribute on a <paragraph>
when it's inside a <blockQuote>
then this attribute should be automatically removed. By whom? A feature or the writer? Post-fixers?
Another solution would be to limit attribute checks so they are not contextual (can't check ancestors chain). But that would make it impossible to prohibit bold in a heading, so it'd be a serious limitation.
<paragraph align>
or <listItem indent=0 type=bulleted>
?Here, we need to do a couple of things:
The check is simple but it again proves that the "can be inserted" check needs to work with virtual elements.
The attributes cleanup again opens a question who should do it. Should it be automatical or not?
A simple check, but opens a couple of related questions:
should we support groups of attributes?
No, because that would be against the rule that a check must concern a single trait and that we need to be able to make decisions based on those checks. Let's consider a case when on element <image>
there are two groups of attributes allowed: x
and y
, x
and z
. How would you clean disallowed attributes of <image a b c x y z>
? It's doable but it's not as simple as checking one attribute after another. You need to check permutations of those in some ways. And this complexity will leak from the schema to features. Besides, we haven't yet had a clear case for attribute groups which couldn't be solved differently (using two or more different elements, representing those separate use cases).
Image requires src
and alt
, list item requires indent and type. With src
and alt
the problem is that if those values weren't provided on V->M conversion, you can't guess them later on. There are no sane defaults there. So, it's basically a converters' job to not convert images when they don't have those attributes or provide proper placeholder (for alt
at least).
With list it seems to be a bit different. I thought that we can define that indent
defaults to 0
and type
to bulleted
. And if we'd implement automatic attributes validation we can also support attributes default easily. This would ensure that those attributes are always present which is valuable when implementing features.
However, 0
isn't indent
's default values in some cases. E.g. when you insert <listItem>
in a middle of a deeply nested list. Then, indent must be aligned to that list. Hence, it's again job for a post-fixer.
So, to sum this up – it doesn't make sense to have required attributes because in most cases those checks are very contextual.
should we support attribute value validation?
This seems feasible. The value of an attribute (name) is its inherent pair, so both things can be checked together if we can always provide the value of the attribute in all checks that we need to do. It seems so, because I can't imagine a situation when you want to check whether element may have an attribute but you don't know its value yet. The only situation where the value is not yet known is when user hasn't entered it yet. But then, the feature's UI need to validate that value anyway because we won't go so far to enable configuring validation messages in the schema.
should we support attribute exclusion?
The use case is e.g. linkHref
and linkName
. The question is whether this can't be handled by converters? It's tricky, because Link
and Anchor
features would be separate plugins, but perhaps one of them could consume the entire <a>
once handling it. So the first one would win if someone pasted <a href name>
.
So, can we imagine cases where converters could not handle it? Or, is there any risk of attributes getting applied to one piece of text or one element by features, directly on the model?
A separate question is whether we can handle exclusion at all. I think so – the "can have attribute" check will just return false
if the other attribute is already present. This will also work for the "remove disallowed attributes" when checking them one after another – simply, the first one will be removed and the second one kept. It means that this behaviour may seem to be non-deterministic.
Finally, the rules for which attribute has priority over which may not be so simple. Hence, post-fixers seem a safer place again. This may introducing some coupling between features (one feature will need to predict that some other feature may apply some attribute) but those cases will be rather rare.
This seems to be a common action which needs to be performed after most of model structure changes. We agreed already that this should be done by a post-fixer registered by the model itself. The advantage of doing this in a post-fixer is that it will be done once per all changes, not after every single atomic change (if it was done by the writer). This will be more performant and secure (because intermediate model states may be incorrect on purpose).
It's also important to remember that cleaning up attributes needs to be a recursive job – it needs to consider all descendants of changed nodes.
object
type.Currently, there are two correct selections:
Additionally, there are some rules which we may want to introduce in the future:
Similarly to the structure/attrs checks, we need to be able to query whether selection position is correct but also to fix incorrect selection (which is currently done by Document#getNearestSelectionPosition()
).
The biggest problem I can see here is what we should check. Always the entire selection? That will make the result hard to analyse (the problem with what false
means). But checking individual ranges may not be enough (validating that table selection is rectangular, validating the direction), and checking positions is even more hopeless.
I think that we again need to be smart and implement true/false
checks for a subset of information (and context) and allow post-fixing the rest.
TODO...
When designing this piece of API we need to take into consideration the following aspects:
While the first aspect is trivial, the second caused us a lot of pain in the past.
Functional requirements:
bold
on $text
or type
on listItem
.bold
on image > caption > $text
. I'm not sure about this because it makes it impossible/hard then to inherit all node's attributes because (clipboard holder's case).bold
in heading1 > $text
.paragraph
inherits all $block
's attributes. Here, we'd again have problems with attributes allowed on paths.$block
in a $root
and $text
in $block
.insertContent()
. While we plan to clean up attributes in a general post-fixer, we didn't plan to strip disallowed elements in one. Should we do that too or leave it for insertContent()
and other special use cases?$inline
and $text
? It seems that it makes more sense to have "isInline" in the future.Conclusions:
Allowing attributes and nodes on paths must not be widely used because it breaks inheritance which is far more important for extensibility reasons. Hence, we can allow it but only using callbacks. The use case for it will be special, closed and final integrations of the editor where the developer knows all the features.
We might try to treat paths in some special, inheritable way or find some completely new model for how schema is defined, but I can't think of anything worth following. Especially that the current model is the most natural so some other, fancy ones might increase the perceived complexity.
List of potential followups:
As we know,
Schema
is probably the weakest part of engine model right now. We didn't have clear requirements and idea how it will be used when we created it.Since iteration 2 is about bringing stable API, I think we need to include refactoring/rewritting/improving
Schema
. I open this issue so we can share requirements and thoughts on how it could work.Some conclusions that we aready drawn:
SchemaItem
s should be based on classes mechanism and extending.linkHref
orlinkName
but not both -- I remember there were cases like this).EDIT (by @Reinmar): I've read through the entire topic and created an excerpt from it which I'm adding below.
The following list of requirements, ideas and doubts was compiled from the discussion that we've lead in this ticket. It's mostly focused on stuff that works poorly now or which is not supported at all. The things which the current schema supports can be found in the code so the correctness of the new approach will be naturally validated when refactoring the code.
Attributes
indent
on<listItem>
,bold
on$text
.<listItem>
'stype
. Source: V->M conversion of a bare<li>
(on paste) needs to hardcode the default type now.<heading>
'slevel=1|2|3
. Source: https://github.com/ckeditor/ckeditor5-heading/issues/27#issuecomment-245559218.$text
which is in<heading1-6>
.x
on all$text
s regardless of their context (parent path). But that shouldn't allow$text
everywhere (as happens now). Source: https://github.com/ckeditor/ckeditor5-engine/issues/532#issuecomment-292892032 and https://github.com/ckeditor/ckeditor5-engine/issues/532#issuecomment-299204504.$text
in$clipboardHolder
but then all attributes which were allowed in$root > $block > $text
are not allowed in$clibpardHolder > $text
. Source: https://github.com/ckeditor/ckeditor5/issues/477. It gets even more serious with https://github.com/ckeditor/ckeditor5/issues/477#issuecomment-310428963 and terrifying with https://github.com/ckeditor/ckeditor5-engine/issues/532#issuecomment-340717901.bold
is allowed on$text
".linkHref
andlinkName
,sub
andsup
(in some implementations).indent
andtype
. If you'd check them one by one (e.g. when trying to uunderstand which element attrs need to be removed after some operations) you would get the false impression that neitherlistItem[indent]
is correct norlistItem[type]
, hence, both should be removed. Source: https://github.com/ckeditor/ckeditor5-engine/issues/532#issuecomment-325328882. Source 2: https://github.com/ckeditor/ckeditor5-engine/issues/532#issuecomment-341501249.Elements
<caption>
being a limit only when in an<image>
): https://github.com/ckeditor/ckeditor5-engine/issues/532#issuecomment-280599349. It think it would be an overkill. One needs to ensure element name uniqness.h1
is only allowed in$root1
, but not in$root2
.h1
is allowed in$root1
even as a deep node:$root > bQ > h1
.Other topics