Introduce the List feature

scofalik commented 8 years ago

Introduce List feature to CKEditor5.

Feature should follow Editor Recommendations: http://ckeditor.github.io/editor-recommendations/features/numbered-list.html http://ckeditor.github.io/editor-recommendations/features/bulleted-list.html

scofalik commented 8 years ago

First decision to make is how list should be represented in model.

1. Mimic HTML implementation.

We would have two elements: list and item. list element would have some attributes like controlling how list is displayed, like type=ordered|unordered, starting number. Indentation would be automatic. This implementation makes lists more encapsulated, it would be easier to write model<->view converters and control settings for given list. However, it could be more difficult to implement some functionalities like merging lists or "unlisting"/changing type of half of list.

2. Item element only.

In this approach, we only introduce item element. The model would be simpler and list behavior, like merging lists or changing properties of only some items would be easier. Unfortunately model->view converters would be a bit tougher to write. Also the model would be heavier because each item would have to keep some duplicated data, like indentation or list type. Another drawback is that we would not be able to create two consecutive list of same type. This should not be a big thing, unless someone wants to create a list, make some space and create another list -- we'll need separating paragraph for that.

3. List element only.

We could just introduce list element and treat paragraph elements as items. When converting to view, list converter will consume paragraphs inside it. This is variation of first approach. We don't need another model item. What we might gain is that there is already an implementation for how to merge paragraphs, so merging list items inside list would be done automatically. The risk is that, in future, there might be some attributes or functionallities that should work on paragraphs, but not on list items.

4. Attributes for paragraphs.

Instead of introducing new elements, we could introduce a few attributes that would be applied straight to paragraphs. What we gain is that many things will work out of the box:

first list item will be merged with previous paragraph
paragraph after first list item will be merged to the last item
items inside list will be correctly merged. This is somewhat similar to second solution, the risk is that we might have features in future that should work on paragraph but not on list items.

Fourth solution sounds most interesting to me, but I think it may be risky. I would go with second solution and see what behaviors that are now true for paragraphs will also work for list items. I suspect there should not be much difference between second and fourth solution and introducing new element may be more safe.

oleq commented 8 years ago

How would nested lists behave in each approach? It was a huge pain to implement them in v4 with user–friendly, predictable in(outd)dentation behaviour, to preserve structure and support edge cases.

scofalik commented 8 years ago

Second and fourth approach are more friendly for nested list. All the nesting would be done by indent attribute. Then converters will handle creating/closing ULs/OLs then. They won't be easy, that's for sure, but operating on model will be, in that case.

In first and third solution nested lists would be done like in HTML - by nesting lists. Nested list would probably end up straight in it's parent list, not item/paragraph:

list
|- item
|- item
|- list
|  |- item
|  |- item
|- item

scofalik commented 8 years ago

Editor Recommendations does not specify how button state should behave or what should happen when list button is clicked while being inside list item, or while having some paragraphs/items selected. However I think we should implement same solutions as in CKE4, these are:

if "non-list" content is selected, button is inactive,
if button is active, clicking it converts all list items to paragraphs,
if button is inactive,, clicking it converts all paragraph-like, non-list elements to list items.

The question is how we decide which elements should be converted to list items. We need to make it flexible, so other, possibly 3rd party, plugins are able to co-operate. We have to enable converting, let's say, a couple of paragraphs and block quotes, to list items, to block quotes, back to paragraphs, etc.

The other problem are elements that cannot be converted to list item, but can have list inside, i.e table cells. We won't have paragraphs in table cells so we have to know how far we can look-up from current position finding a parent that can be converted to list item. In such elements, we should not change list items back to paragraphs. Or maybe we could? I know that we are far from designing table feature but we have to know what we have to think about.

There is also question "what if Paragraph feature is not enabled"... But maybe features like List or Blockquote should require it.

Solutions that I see now:

We decide that we won't have deep and complicated structures in model, and everything that contains a text node is treated as paragraph-like element, with some exceptions (like table cells). AFAICS this is how Enter feature works right now. When splitting elements, it just takes position's parent and splits it.
Paragraph feature introduces a "registry", to which model element names can be registered. All those elements are treated like paragraph-like elements.
We go with extending module tree classes. We could have class ParagraphElement extends Element in model and implement behavior using these.
We "somehow" use Schema to see that some elements are registered as deriving from paragraph element and use this knowledge. "Unfortunately", this means that we have to start by refactoring Schema.
We implement "step by step" solution. So, each feature must know how to turn itself off (change list item to paragraph) and then how to turn itself on (change paragraph to list item). Then features would have to know that feature X has to turn off feature Y to work.

If you have any other, better ideas please share!

Reinmar commented 8 years ago

Regarding the model, <list-item> element (2nd solution) makes the most sense for me. The whole tricky logic will be hidden inside converters and hence nicely encapsulated. All the algorithms operating on the model will be super simple and this is what most people will write. What I mean is that the hardest job (converters) will be done by us and the final features' behaviour will be a piece of cake to implement.

Also the model would be heavier because each item would have to keep some duplicated data, like indentation or list type

This is neglectable.

Another drawback is that we would not be able to create two consecutive list of same type. This should not be a big thing, unless someone wants to create a list, make some space and create another list -- we'll need separating paragraph for that.

I've been thinking about that.

First, in 99% cases you don't want to create separate lists one after another because there's no sense in that, so I'm strongly for forbidding those.

Second, AFAIR Google Docs solved this by adding list ids to list items. If subsequent items have the same id, they are added to the same list.

So this can be our solution if anyone will ever need such a future, which I doubt because CKEditor 4 is merging lists in many cases and I don't remember any report about that. You simply don't see whether two lists are separate. It may only matter if you loaded such a list using setData() but there's an important rule that we always followed – if you cannot create something using the editor, then you shouldn't load it there.

scofalik commented 8 years ago

First, in 99% cases you don't want to create separate lists one after another because there's no sense in that, so I'm strongly for forbidding those.

Yes, this is a conclusion that we came with after discussing the ticket with @pjasiun, so it's good that we all agree. I also noticed that CKE4 already behaves like this. Still, it is a nice to have a backup plan if we need it (list id).

Reinmar commented 8 years ago

BTW, if we go with the 2nd solution, current implementations of the enter and backspace features should already handle most of the cases. Even many cases with nested lists should work, because in the model they will be represented by a flat list of elements (similar to paragraphs) and nicely merged or split.

Reinmar commented 8 years ago

BTW2, regarding encapsulation of the most tricky logic. There's bazillion of cases with indenting one or many items, outdenting one or many items, outdenting by enter, outdenting by backspace, splitting items (enter), creating new items (enter), merging items (delete), removing pieces of tree (delete contents), etc, etc. to implement.

In CKEditor 4 we've got a few kLOC which handle these cases and despite fixing many, many bugs, there are still some glitches in edge cases. The reason is that each of these algorithms has to care about details of how lists should be rendered and their complicated structure. Additionally, if you wished to add or change some behaviour of the list feature (and related features such as backspace and enter) you'd need to alter incomprehensible number of cases.

Encapsulating this complex logic in converters will be extremely complex, I can imagine. But there's a closed set of operation and structures. We can start with flat lists (highly recommended) which will significantly speed up the initial work. Later on we'll have to spend some significant time on adding nested structures support, but we'll do this in one place, once. There will be bugs, I'm sure, but again – in one place :).

Reinmar commented 8 years ago

As for the schema, I'd choose the easiest paths for now, as we'll be working on the schema improvements (https://github.com/ckeditor/ckeditor5-engine/issues/532) and current implementation doesn't have many features.

E.g. – which elements can be converted to list items? Those which inherit from $block. If we can't check that, then "those which can contain text", etc. Something simple that makes some sense in today's narrow scope.

Just make sure to keep that logic in some meaningful methods so we can understand and easily update it in the future, once schema will be improved.

Also, I propose to post all these cases that you'd like the schema to support under https://github.com/ckeditor/ckeditor5-engine/issues/532 so we've got a material to work on in the future.

oleq commented 8 years ago

Option 2. looks good to me too. I only hope it will not become too tricky to implement. I'm also OK with the first implementation covering flat lists only and progressive enhancement in the future 👍

Anyway, as for DEL/BACKSPACE behavior, there's a lot of tests in v4, which could (should) be ported to v5 so, at least, the process figuring out all the possible cases would be done. The same with indent/outdent tests and enter tests in lists, which describe various transformations between nested lists, list items and underlying/neighbouring blocks. So there's a lot of stuff to study 😄

scofalik commented 8 years ago

List item approach with indent attribute may be problematic in collaborative editing. Consider following list:

* A               <listItem indent=0>A</listItem>
  * B             <listItem indent=1>B</listItem>
    * C           <listItem indent=2>C</listItem>
    * D           <listItem indent=2>D</listItem>

Let's say that one person outdents * B and another indents * D. We will have a conflict for * D item because one client will change it to 1 and other to 3. If we want to follow users intentions, correct value would be 2, but 1 is also fine. Unfortunately, 3 is incorrect. So we may get a bad result. I sense that similar problems may occur for undo and clipboard.

scofalik commented 8 years ago

Another small issue is when you have two listItems separated with paragraph. When you remove paragraph, the lists should be merged. I am worried this won't be easy to solve.

scofalik commented 8 years ago

Aaand another issue, this time converting from view to model. When single list items are converted (say, somebody selected and copied one <li> element, without <ol> or <ul>) it will be difficult to set indent and type attributes.

Reinmar commented 8 years ago

Another small issue is when you have two listItems separated with paragraph. When you remove paragraph, the lists should be merged. I am worried this won't be easy to solve.

You mean – upon conversion? Yes, I also think this will be tricky. But that's exactly what makes it a perfect candidate for a code that you'd love to keep in one, safe place, not spread across many algorithms (which would need to merge the lists as well).

Reinmar commented 8 years ago

Aaand another issue, this time converting from view to model. When single list items are converted (say, somebody selected and copied one <li> element, without <ol> or <ul>) it will be difficult to set indent and type attributes.

This should be handled when copying a list item. In the clipboard it should rather be written together with its ul/ol parent. And as for the indentation – note that when you copy a single item its indentation doesn't matter. What matters is the place where you paste it. So after conversion you'll simply get a <listItem> with a specific type (type can also default to some value if it cannot be recognised from the HTML).

Reinmar commented 8 years ago

List item approach with indent attribute may be problematic in collaborative editing. Consider following list:

Hmmmmmmm..... hmm..... This is tricky. I even checked how it works in Google Docs. And we cannot compare with them, because they have an HTML-incompatible lists implementation. E.g you may have this:

<listItem indent=0>a</listItem>
<listItem indent=2>b</listItem>

So when we tested the case that you described, it didn't blow up – indent simply changed one item and outdent another one. None of them affected sublists.

Having HTML/MD as a primary output we can't do stuff like this.

I started wondering whether:

This is really something that is dangerous. It may simply be an incorrect, but acceptable result. To answer that we'd need to know whether it affects undo.
Perhaps we can store the indentation differently – e.g. as a diff with the previous item?

<paragraph>Foo</paragraph>
<listItem>1</listItem>
<listItem indent="1">1.1</listItem>
<listItem>1.2</listItem>
<listItem indent="-1">2</listItem>

If I'm thinking correctly, if you're changing indentation of a single item it will only change one attribute. Never more, unless you created a non-collapsed selection, but even then you should be safer. We'll never get indent values that differs more than one level, because operation changing the attribute will either set 1, -1 or null. You won't get indent=3 after indent=1 as a result of some operations.

Reinmar commented 8 years ago

I know that the idea is crazy and that the model will not be that intuitive, but it may solve other issues.

scofalik commented 8 years ago

You mean – upon conversion? Yes, I also think this will be tricky. But that's exactly what makes it a perfect candidate for a code that you'd love to keep in one, safe place, not spread across many algorithms (which would need to merge the lists as well).

The problem is that I don't know where this code should be put? Maybe in low-priority remove event callback, that would be fired for removing any element. Because I can't put it in listItem converters, cause in this case you are removing paragraph/image/what-have-you.

This should be handled when copying a list item. In the clipboard it should rather be written together with its ul/ol parent. And as for the indentation – note that when you copy a single item its indentation doesn't matter. What matters is the place where you paste it. So after conversion you'll simply get a with a specific type (type can also default to some value if it cannot be recognised from the HTML).

But when you copy from outside source you might get <li> without it's parent. I don't know how browsers handle this, though. Maybe <ul>/<ol> is also in copied content, IDK, but I fear that it would be just lonely <li>.

Of course, when you copy flat set of list items, indentation does not matter. But you cannot paste a listItem without indent to the model. Setting indent will have to be somehow handled upon adding to the model. I am not sure how, now.

I know that the idea is crazy and that the model will not be that intuitive, but it may solve other issues.

This idea is not crazy and it is exactly what I've been thinking about after reporting problems with current solution. It solves not only OT problem but also indent problem (unfortunately not type problem). As I was thinking about it yesterday I came up with another problematic scenario:

<listItem>1</listItem>
<listItem indent=1>1.1</listItem>
<listItem>1.2</listItem>

Consider now user A outdenting item 1.1 and user B outdenting item 1.2. These are both correct operations, but results in 1.2 item being "outdented to level -1". Maybe possible solution would be to store another helper value.., but I am not sure how to solve it. Or -- post-fix and/or just treat all levels below 0 as 0. Post-fixing to 0 seems better because if your level is -1 then indent won't do anything. OTOH, IndentCommand could just fix it when during it's execution. WDYT?

Reinmar commented 8 years ago

The problem is that I don't know where this code should be put? Maybe in low-priority remove event callback, that would be fired for removing any element. Because I can't put it in listItem converters, cause in this case you are removing paragraph/image/what-have-you.

I don't know enough about converters to be sure about this, but late remove listener for all elements, checking whether that removed element had been separating two lists sounds good.

But when you copy from outside source you might get <li> without it's parent. I don't know how browsers handle this, though. Maybe <ul>/<ol> is also in copied content, IDK, but I fear that it would be just lonely <li>.

The converters will focus on <li> elements anyway if we go with <listItem>, so a lonely <li> will get converted to something. It's a question about attributes, but I really don't think that this is a problem. If <li> has a parent, then read it, if not, then default to something.

Besides, if we go with the indent=-1/1 approach, then indent doesn't have to be set. No ident means "follow your predecessor", which is what should actually happen when you paste such an item.

Consider now user A outdenting item 1.1 and user B outdenting item 1.2. These are both correct operations, but results in 1.2 item being "outdented to level -1".

After the OT was resolved we'd get something like:

<listItem>1</listItem>
<listItem>1.1</listItem>
<listItem indent=-1>1.2</listItem>

Funny thing here is that the last item (1.2.) should actually become a paragraph and that's what the postfixer could do and that's what the model is actually saying. Because if you outdent a list item (when you have a flat list e.g.), then it should become a paragraph too. This looks really nice.

OTOH, IndentCommand could just fix it when during its execution. WDYT?

Command should never create an incorrect structure, so it's just a matter of fixing results that become incorrect because of OT. I think that those should be fixed ASAP. E.g. the case that we discuss above with one item at the end which has indent=-1. It should be converted to  but would be processed by the list converters which would need to produce a ... Smells bad.

scofalik commented 8 years ago

It's we who decide what is a correct model. If we decide that indent=-1 is correct, it is correct :trollface: .

Anyway, I don't think that in described case last item should be converted to paragraph. It was neither users intent to do so. Both users after their changes see it as "level 0" item.

Reinmar commented 8 years ago

Anyway, I don't think that in described case last item should be converted to paragraph. It was neither users intent to do so. Both users after their changes see it as "level 0" item.

Maybe you're right. Let's see then – we've got at least two options – ignoring this when converting or postifxing.

scofalik commented 8 years ago

So, we discussed more and came to the conclusion that since we need post-fixing for both solutions, we will try to go with more intuitive, "old" solution, that is that indent attribute will be explicit. Unless, of course, more problems appear.

Now for something different. I've been thinking about cases and solving them for insertion converter. I tried (as I always do) to go with all the cases in the world, but then noticed that some are not possible (with only ListFeature, still they might be possible by copy pasting).

Here is the list, and let's make sure that's complete:

Totally new list. Changed paragraph to listItem. Neither previous nor next element is listItem. Create ul or ol in the view, then create li and put it inside list.
New list after existing list. Changed paragraph to listItem that was after other listItem. This should create "level 0" item (with old approach it means just setting indent attribute to 0).
New list before existing list. Changed paragraph to listItem that was before other listItem. There are no listItems before new listItem. Since new listItem should have indent=0 and first listItem on the list also has to have indent=0, it's enough to just create new li and put it at the first position in existing list.
New list between two lists. This is tricky because above and below lists has to be merged.
New list item inside a list. This happens after clicking enter key in listItem. New listItem is created but all it's attributes are copied. This behavior should be supported by enter key feature. It would be nice if point 4. and 5. were done by same algorithm as this is a kinda similar scenario, but we will see. I don't know how difficult it will be to differentiate between 4. and 5. because model will look very similar in those cases. Anyway this is an easy scenario, we only have to create li and insert it at appropriate position in view.

EDIT: There might be other scenarios for pasting, though.

EDIT2: Keep in mind that we are only discussing creating new listItems and converting insertion. The examples assumes same type attribute. If type attribute is different, then we cannot merge lists and have to create new ols/uls. But it is worth noting that this only happen when creating a new list litem out of paragraph or other block content (not when creating new list item by clicking enter or anywhere inside existing list).

scofalik commented 8 years ago

For scenario number 4., perhaps merging could be done in remove event, that would also deal with removing elements that are between two lists and merging those lists. If that's the case, the scenario is similar to scenario 2.

This will probably work nicely, if rename conversion is insert + move + remove. But it should also work for simplified version of rename conversion which is remove + insert.

For scenario 2, to make it more general, what we can do is:

in model, look for previous listItem that has the same indent,
when found, get corresponding view element using mapper,
create li after found element.

Then move and remove converters will move contents and merge lists in view.

scofalik commented 8 years ago

This solution will nicely cover 5. scenario. Newly created li element will be added at correct position in view and we don't need merging. But since merging is in move/remove converters, there is nothing about it in insert converter. This means that those case are (almost) same but all the difference is in another converter. Which is needed anyway for other scenarios.

I guess I am slowly tearing this apart to small pieces. Hopefully I haven't make any mistake in the process :P

scofalik commented 8 years ago

For pasting, we need to make sure that first listItem in pasted batch has indent=0. This will be done by view to model converter, of course. Pasting will result in inserting one or more listItems into model. This will fire conversion. Since items will be processed from the first one to the last one, most cases will be solved by default algorithm.

We need to handle one special case. In scenarios 2, 4 and 5 we are looking at previous listItem and appending li to existing ul/ol at correct position. Until now we assumed that new listItem has indent same (scenario 5) or lower (more shallow) (scenario 2 and 4) than previous listItem. When pasting list, we will often paste already indented list items. That means that we will have to expand those scenarios to also handle listItem that has bigger (deeper) indent that listItem before. This should not be hard, though.

Another thing to decide is what should be expected behavior when a list is pasted inside a list. Should new list items be pasted starting from indent=0 or should they indents be incremented by the indent of the list item at which we have pasted. Consider that we paste XYZ list at B item:

* A          * X
  * B          * Y
  * C          * Z

Should the result be like on left or like on right:

* A                                  * A
  * B                                  * B
    * X                              * X
      * Y                              * Y
      * Z                              * Z
  * C                                  * C

I think like on left, but this will need some callback that will be fired before/after pasting, because this cannot be handled in coverters. This fixing callback will have to check whether we are pasting inside a list and if so, fix indents of pasted content.

Also three more things:

Pasting inside empty paragraph should remove the paragraph (then default algorithm applies).
Pasting may need to execute lists merging (for example, if list is pasted in empty paragraph between two lists),
Pasting inside empty list item should make pasted content "replace" the empty list item, not be pasted inside it:

* A
  * B
  *
  * C

Consider pasting XYZ list in empty list item above. Expected result is:

* A
  * B
  * X
    * Y
    * Z
  * C

not

* A
  * B
  *
    * X
      * Y
      * Z
  * C

scofalik commented 8 years ago

Hopefully the last issue concerning pasting is dealing with list items of different types that are inserted inside existing list.

This is not a problem for normal ListFeature behavior because you can create list item of different type only before or after whole list. Inside list you can only create list items of same type, using enter key.

This is not true for pasting. You might get one of following cases:

* A                  * A                  * A
  * B                  1. X                  * B
  1. X                 * B                   * C
  * C                  * C                   1. X

In the first case, we have to map * B to view, and find it's parent ul and split it at the position where 1. X was inserted. Then we have two uls and their content split. Between those uls we insert ol and create li inside it. Other cases are simpler as we do not have to split anything.

Good news is that after processing first item in pasted list, all the other will work following already described algorithm.

scofalik commented 8 years ago

I stumbled upon another problem with using just listItem elements. This time it is connected to mapping. Consider following model and view. listItems are mapped to <li> elements. <ul>s are not mapped:

<paragraph>Lorem.</paragraph>
[<listItem>A</listItem>]
{<listItem>B</listItem>}
<paragraph>Ipsum.</paragraph>

<p>Lorem.</p>
[<ul>
  <li>A</li>]
  {<li>B</li>
</ul>}
<p>Ipsum.</p>

Because of how position mapping works, it's supper annoying to have view container elements that are not mapped to anything. If we interpret position [ as "before listItem" it looks wrong. But if we interpret it as "after paragraph" it suddenly makes sense. So in fact we have this disambiguity.

I am sure we will find this problem a big, annoying issue. Right now I have a problem with removing first and last listItem because in view ranges on those listItems are not translated to flat ranges.

scofalik commented 8 years ago

I see four solutions, for move/remove converters.

Use element mapping if range start is before a model element (not inside text). Same for range end (after element). This looks more like a hack, though it has some sense. Since we have a range and range start/end has a meaning, we use this additional info to precisly set view range to be processed.
Enhance move/remove conversion - make pipeline of it:
- default converter instead of calculating view range to be removed should expect data.viewRange,
- add additional "default callback" that will calculate data.viewRange in the way it is calculated now in default remove converter,
- let features add their own calbacks that can modify/overwrite data.viewRange.
Enhance move/remove conversion - so every node is processed alone. Right now, i.e. insertion conversion, walks through inserted range and fires an event for each inserted node. remove/move conversion dispatching is simplified, it just fires one event for the whole range.
For List feature, and any other feature with similar problems, add custom remove/move converter that will:
- walk the removed range and do some custom stuff on recognized items (listItems in this case)
- fire default event on parts of the range that does not have recognized items (listItems in this case)
- stop this event execution (if there was at least one recognized item, otherwise let it go and use default converter).

Reinmar commented 8 years ago

+1 for option 2.

scofalik commented 8 years ago

The bad thing about option 2 is that we would have two workflows for converters and that's why I came up with other solutions. Insertion / attribute changing would work one way and removing/moving would work the other way.

OTOH it already works a bit different now... So IDK. @pjasiun?

scofalik commented 8 years ago

Honestly, I am for option 3 because that would give the most control over converters and would make writing converters similar. I am concerned about efficiency though, that's why I want to ask PJ first.

pjasiun commented 8 years ago

+1 for option 2.

Option 3/4 was my original idea, but it appeared not to be doable. There is one action of moving/removing model elements and converters are fired after that action. The range is messy anyway because we need to calculate view range (which was not moved) based on the moved model range. If every node will be processed separately keeping in mind non-trivial mapping (like in this case) it may appear to be too hard to calculate proper ranges for each item.

pjasiun commented 8 years ago

As I think about mapping there is another problem.

The mapper was designed this way that you always map the position next to one element to the position next to another element. There is a ticket about mapping inner and outer differently (https://github.com/ckeditor/ckeditor5-engine/issues/266), but here we have another case.

If you have in model:

<listItem>1</listItem>
<listItem>2</listItem>
<listItem>3</listItem>

And in view:

<ul>
    <li>1</li>
    <li>2</li>
    <li>3</li>
</ul>

And the listItem <-> li mapping there is no mechanism which will tell you that the possition should be outside or inside <ul>. What is hapaning now is a quirk, because mapper was never designed to deal with such situaltion.

pjasiun commented 8 years ago

What we could do is improving mapper, so it can have fixed mapping for defined position, so the position after the last ListItem should be mapped after the <ul>. This fixed position mapping should be manually updated, but it may be a way to go.

scofalik commented 8 years ago

The mapper was designed this way that you always map the position next to one element to the position next to another element. Was it?

In fact you just map elements from model to elements from view and position is calculated on the fly. Current algorithm guarantees that you always get a position in top-most mapped element - if there is ambiguity.

Why can't we treat it as expected behavior? Let's write tests, write it in documentation and call it a day. Instead of complicate it when it is not really needed by the feature.

What we could do is pass true/false flag to .toViewPosition() and .toViewRange() how mapper should work if there are several possible positions. The flag would then decide whether mapper should choose top-most mapped element (default, behaves like now) or deepest mapped element (so it will be next to <li> in our case).

pjasiun commented 8 years ago

Why can't we treat it as expected behavior? Let's write tests, write it in documentation and call it a day. Instead of complicate it when it is not really needed by the feature.

We can try, but this behavior already changed multiple times, for instance, to put position in the text node if it is possible. But in may be the way to go.

What we could do is pass true/false flag to .toViewPosition() and .toViewRange() how mapper should work if there are several possible positions. The flag would then decide whether mapper should choose top-most mapped element (default, behaves like now) or deepest mapped element (so it will be next to <li> in our case).

I'm against it. First, the mapper is already complicated. Second: it solves nothing because default converter would always behave the same way, anyway.

scofalik commented 8 years ago

We can try, but this behavior already changed multiple times, for instance, to put position in the text node if it is possible. But in may be the way to go.

Those two behaviors do not affect each other that much. The only way it may happen is when text is directly next to element in model (and the element is "special", listItem):

<listItem>foo</listItem>
<listItem>bar</listItem>
^<$text>someText</$text>

<ul>
  <li>foo</li>
  <li>foo</li>
</ul>
someText

In this case position after second listItem will be put in someText view text node. The important thing still is that the position has not been placed after <li> but after <ul>.

I am well aware that this is not a perfect solution and this may change in future. We might introduce fixed positions in mapper, as you suggested. Or come up with something else. But for now we need something that works. This works and there are no real case scenario where it blows up. So for now I'd like to assume that this is expected behavior. It always behaves the same, so there's no fear it will work differently if there is different model structure.

Reinmar commented 8 years ago

Just a small comment, without reading the previous posts – this is a totally legit case which the mapper should handle. Even if we wouldn't decide that lists will be represented in the model in this way, there's a very high chance that there will be other features like blockquote or code blocks or some widgets, in which such a mapping scheme would need to be supported.

pjasiun commented 8 years ago

I only mean that we should keep mapper simple. We may try to revert decision about moving position to the text node and decide that mapper will always return the most outer possible position and it is view writer who ensures that position in text works fine. Current solution: "writer has to return the most outer possible position, unless it is position next to the text node, then the position should be the most inner" may works for now but will be terrible when we will add more features to mapper and want to refactor it.

Reinmar commented 8 years ago

I'm lucky to have v. little knowledge about how it works so I can write "This can't be that hard." :D

scofalik commented 8 years ago

There is an issue with the approach we took with having only listItem elements. Unfortunately this is a big roadblock which may make us change that approach.

As an introduction to the problem I have to say that mapping model to view in this approach is difficult. Mapper assumes that each view element that has mapping to model has in model length 1. This makes a lot of sense. Position offset before view element and after view element differs by 1. If that view element is mapped to model element, positions in model should also differ by 1.

Unfortunately, this is not true in our case, because <li> can have other <li>s but in model, these are separate elements. Here is problematic case: View:

<p>x[xx</p>
<ul>
    <li>
        aa]a
        <ul>
            <li>bbb</li>
            <li>ccc</li>
        </ul>
    </li>
</ul>

Model:

<paragraph>x[xx</paragraph>
{<listItem indent=0>aa]a</listItem>}
<listItem indent=1>bbb</listItem>
<listItem indent=1>ccc</listItem>

First of all, take a look at {} selection. What the heck does it really represent? Okay, you may say it represents that one bullet point. Now, good luck mapping {} selection on the view. listItem is mapped to <li>. In the view, you cannot have position after listItem "aaa" but before listItem "bbb". This is impossible.

This leads to following problem. Look at selection []. Now, if I press delete, what should happen is: xx should be deleted, aa should be deleted and paragraph should get merged with first listItem. Merging involves removing obsolete element, so we need to remove such range:

<paragraph>xa</paragraph>
[<listItem indent=0></listItem>]
<listItem indent=1>bbb</listItem>
<listItem indent=1>ccc</listItem>

However, as we just said, it is impossible to map that range on view. There is another problem: after removing such range we would have incorrect model, because first listItem would have indent=1.

What can be done?

Provide custom deleteContents callback and do something in it. This really smells, as, in fact, deleteContents works correctly: it breaks selection to flat ranges, removes stuff in selection and then tries to merge elements. I could write some custom code that would handle leftovers after characters got removed. OTOH, removing is a bit like outdenting. If we would first outdent the selected listItem we could then easily remove it. Unfortunately this would not work for moving listItems or if they are removed outside of deleteContents.
Change how remove/move converters work. Here I once again propose that remove converter should work like other converters. @pjasiun was saying that it would be problematic because of mapping. I haven't researched it yet, but if you remove in reverse order it is easier because maps for not-yet-removed elements are okay. The described issue would be much simpler to handle if those converters work that way.

What now?

Honestly, I already hacked engine a bit during all the prototyping I've done. These are not big changes but I am worried that I am already trying too much to prove that we can get by with listItems only. I could try changing how remove/move converters work but this is an additional day of coding. It's true that we gain something by having simpler model, but still we have to write a lot of logic and complicated code in converters (which are all about merging anyway). So I even have doubts whether the gain is real.

On the other hand, most stuff works correctly, there are some bugs with outdenting but nothing that big (and it should work on paper so if I haven't missed something this is and implementation bug). There are also small issues with undo but I am not sure if this is undo itself or converters were written without ackonwledging some corner cases that might come up when undoing.

At this moment I am not sure whether it is better to scratch the idea and start from the beginning, or try to fix all the issues. If we start from the beginning, we would probably need more time to finish this feature (honestly, most of my work will go to waste, except maybe that I am now more experienced on what edge case scenarios exist). If we decide to continue with listItems it may appear that more issues will pop up and I will have to hack engine even further.

Reinmar commented 8 years ago

The question I'd ask is whether we can imagine any other feature which requires similar mapping schemes.

E.g. we're pretty sure that widgets will require some tricks. In the view you'll want to have more things than you'll keep in the model. Also, we can imagine some integration scenarios where you may want to render some features differently than we designed it. We may want to implement blockquotes as attributes on headings and paragraphs (to avoid the same problems as with lists).

This is just the beginning that we use the engine. Lists aren't some simple feature, so I can't say "if we can't implement lists this way, then the engine is broken", but this situation should definitely alarm us. Because there will be weird scenarios to solve and if engine requires that the model is mapped 1:1 to view (with the exception of widgets internals, where nested editables should be pretty easy to map), then perhaps the engine needs some improvements. Perhaps even drastic in some places, but this happens if a piece of software was written without a full knowledge about possible use case.

We should also think from a bit different perspective. How we designed lists in the model sounds great from perspectives like operations on them and collaboration. The model is minimal and e.g. there are less invalid (from text editing perspective) positions than in the respective view. If we'll decide to change the model to align it to the view's structure, then we'll save time on engine corrections, but I guarantee that we'll then waste time on implementing features and we may hit serious troubles with OT, because model operations will be very complex for pretty frequent editing operations.

As far as I understand, there are no issues which you consider unsolvable. If that's true, then IMO it's still reasonable to work on this. In fact, it will be totally understandable even if the whole mapper or something needs to be rethought. It will only become unreasonable to dig deeper if we'll understand that the changes in the engine require so much time or will improve complexity so much that time saved on implementing and supporting lists and other features will not exceed that. And I think that this can really be counted in months.

Reinmar commented 8 years ago

This leads to following problem. Look at selection []. Now, if I press delete

Such selection is invalid. So we should only talk about this in terms of "if someone wants to delete such a range". This is of course possible, but shifts priorities.

Provide custom deleteContents callback and do something in it.

The deleteContents and modifySelection algorithms in the composer as well as features like enter key were designed in a way that you can plug your behavior into them. They were "opened" exactly for cases like lists. The default delete contents implementation would need to know about every possible feature in the world, which is neither possible in case of a library, nor allows keeping the code size low. In CKE4 those algorithms are centralised, but this makes them huge (what bloats even super simple editors) and still incomplete.

So don't worry if an existing behaviour doesn't cooperate well with lists. It can and even should be changed.

Last but not least, I can imagine that at conversion level all operations on all possible ranges should at least not throw upon their conversion to the view. However, they don't have to have some really brilliant outcome unless these ranges are valid in editability terms. So e.g. a range that ends between </paragraph> and <listItem> IMO may be handled poorly as such range will not come straight from the UI. It's just a matter of writing tests even for those imperfect case, so we are sure that the behaviour doesn't change with time. People will base on those at some point.

I hope this helps :).

scofalik commented 8 years ago

I understand purpose of deleteContents build on events. I even thought this is the solution while I was in the middle of writing previous post, but then I came to conclusion that it would not work if someone just decide to remove such a range and still we would need some behavior for moving.

Moving is mostly connected to drag&drop and I belive there will be a solution similar to deleteContents there, but still, what if someone just move such a range?

I don't understand why [] is invalid selection? Have you mistaken it with {}?

For now I will try to change conversion dispatching so that remove and move will work like other change types.

Reinmar commented 8 years ago

I understand purpose of deleteContents build on events. I even thought this is the solution while I was in the middle of writing previous post, but then I came to conclusion that it would not work if someone just decide to remove such a range and still we would need some behavior for moving.

I think that you'll use deltas directly only in a very specific cases, where you operate on the content you know and on simple ranges which behaviour you can understand (which gets tricky once the content gets unpredictable). You'll e.g. never delete range of contents that your feature doesn't own. For that you'll use the generic deleteContents algorithm.

In the future we'll have more of those higher order helpers. There will be a set of functions which represent the most basic operations – inserting, deleting and extracting the content.

Moving is mostly connected to drag&drop and I belive there will be a solution similar to deleteContents there, but still, what if someone just move such a range?

D&D will perform extract & insert, because the content will need to go through the whole clipboard pipeline. No one should move a piece of content which isn't own by a controlled feature. E.g. we agreed that move operation's main use cases are other operations like wrap.

Reinmar commented 8 years ago

PS. I may be a bit biased, because in CKE4 insertHtml was 2 months of work, 1000L of super complicated code using dozen other complicated methods and test which took a couple of seconds to run (despite running them in one frame) :D. In CKE5 those algorithms are incomparably simpler. But still, the basic operations that we have do not cover those scenarios, so that's why I say that we'll need a higher-order tools anyway.

Reinmar commented 8 years ago

I don't understand why [] is invalid selection? Have you mistaken it with {}?

I meant the following selection:

<paragraph>xa</paragraph>
[<listItem indent=0></listItem>]
<listItem indent=1>bbb</listItem>
<listItem indent=1>ccc</listItem>

In the DOM you can't have such selection. Regardless of how you'll map it to the view, the selection must stay next to a text in those case. It can never be between blocks.

OTOH, the position before first list item may be valid if e.g. you have an object that preceeds it:

[<image></image>]
<listItem indent=0></listItem>
<listItem indent=1>bbb</listItem>
<listItem indent=1>ccc</listItem>

So the image may be selected, so position between it and its sibling is valid. Deleting such image will need to leave the <ul> untouched of course.

scofalik commented 8 years ago

I know that [] as selection is invalid from the DOM perspective, but it is a very valid range and this is a real case scenario that happens during merging.

Because there will be weird scenarios to solve and if engine requires that the model is mapped 1:1 to view (with the exception of widgets internals, where nested editables should be pretty easy to map), then perhaps the engine needs some improvements. Perhaps even drastic in some places, but this happens if a piece of software was written without a full knowledge about possible use case.

Right now engine requires that top-most element of a structure in view is mapped to model. So widgets are enabled (all widget internals are enclosed in one widget holder which is mapped to a model element).

Unfortunately this listItem thing is a very very complex scenario. Not only model element creates elements in view "above" it -- that is solvable. Problem is that elements, which are siblings in model, are parent-children in view. This is much more complicated than it appears, because mapping becomes a hell.

One other possibility is to wrap listItem contents in some view element, that might be tranparent when converted to HTML: Model:

<listItem indent=0>aaa</listItem>
<listItem indent=1>bbb</listItem>

View:

<ul>
  <li>
    <liContent>aaa</liContent>
    <ul>
      <li>
        <liContent>bbb</liContent>
      </li>
    </ul>
  </li>
</ul>

HTML

<ul>
  <li>
    aaa
    <ul>
      <li>bbb</li>
    </ul>
  </li>
</ul>

This would be much easier to handle in mapping, because listItem could be mapped to liContent (but probably would bring even more complication in converters).

Anyway my point is that when you have elements that are siblings in model but children in view you have mess. Position mapping becomes unpredictible. I am really scared of that and I can imagine a lot of bugs coming out of that hole.

Reinmar commented 8 years ago

TBH, I don't feel that I'm able to help you with the mapper... You can either bother @pjasiun or we'd need a longer talk so I can understand some limitations first.

scofalik commented 8 years ago

For example take this: Model:

<root>
  <listItem indent=0>aaa</listItem>
  <listItem indent=1>bbb</listItem>
  <paragraph>xxx</paragraph>
</root>

We will convert it to view. After listItems are converted, we have to find position for :

<div>
  <ul>
    <li>
      aaa
      <ul>
        <li>bbb</li>
      </ul>
    </li>
  </ul>
</div>

In model, position for  is in root, offset 2. Before my changes, Mapper assumed that each mapped element has length 1. So Mapper checks current view: <ul> is not mapped, it is skipped. <li> is mapped, it has length 1. Then there is closing </ul> tag. We are after all view content and it appears that offset 2 is incorrect.

So I wrote a quick upgrade that let me specify those lengths. Now, <li> elements count all <li>s inside it and return that count+1 as it's length. So the <li> in example above has length 2 and  can be placed correctly after <ul>.

<div>
  <ul>
    <li>
      aaa
      <ul>
        <li>bbb</li>
      </ul>
    </li>
  </ul>
  <p>xxx</p>
</div>

Now, consider mapping such position:

<root>
  <listItem indent=0>aaa</listItem>
 ^<listItem indent=1>bbb</listItem>
  <paragraph>xxx</paragraph>
</root>

This is parent root, offset 1. So once again we traverse view. We spot <li> that has length 2. It means that position we look for is inside it (we look for offset = 1). So we get inside it and we see text aaa. Found position is after first a. Because texts are not mapped, they are mapped by their parents.

Really, we could hack and smack but this looks like an unsolvable issue or an issue which solving takes too much effort. If anything, I'd try with transparent elements.

ckeditor / ckeditor5