ckeditor / ckeditor5

Powerful rich text editor framework with a modular architecture, modern integrations, and features like collaborative editing.
https://ckeditor.com/ckeditor-5

Paste from Word support #1003

Closed: Reinmar closed this issue 6 years ago

Reinmar commented 6 years ago

Since we're soon starting to work on support for paste from Word, we need a ticket :)

The current state of things is that, since content pasted from Word is html-ish-like-something, some content will appear in the editor. However, since it's not really HTML, the structure of the content (its semantics) is lost.

What's the plan to improve this situation? We'll follow the architecture implemented in CKEditor 4:

  1. Normalize the crazy input into valid, correctly structured HTML.
  2. Filter out unnecessary (disallowed) stuff.

The latter is already done by how the conversion mechanism works in CKEditor 5. Basically, things that aren't handled by the loaded features are automatically dropped (since they have no representation in the model). We may need to work, though, on stabilising this mechanism in edge cases.
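To illustrate the filtering step, here is a minimal sketch. It does not use CKEditor 5's real conversion API: the `ViewNode` type, the `supported` set and the `filterNodes()` helper are all hypothetical, and the sketch assumes that an unsupported element is unwrapped (its children are kept) rather than removed together with its text.

```typescript
// Hypothetical minimal node types; these are NOT CKEditor 5's view classes.
type ViewNode = string | ViewElement;

interface ViewElement {
  name: string;
  children: ViewNode[];
}

// Assumed set of elements that the loaded features know how to convert.
const supported = new Set(["p", "strong", "em", "a"]);

// Keep supported elements; unwrap unsupported ones so their text survives.
function filterNodes(nodes: ViewNode[]): ViewNode[] {
  const result: ViewNode[] = [];
  for (const node of nodes) {
    if (typeof node === "string") {
      result.push(node);
      continue;
    }
    const children = filterNodes(node.children);
    if (supported.has(node.name)) {
      result.push({ name: node.name, children });
    } else {
      // No representation in the model: drop the wrapper, keep its content.
      result.push(...children);
    }
  }
  return result;
}

// Serialize the tree back to an HTML-like string for inspection.
function serialize(nodes: ViewNode[]): string {
  return nodes
    .map((node) =>
      typeof node === "string"
        ? node
        : `<${node.name}>${serialize(node.children)}</${node.name}>`
    )
    .join("");
}

// Word-like input: an empty <o:p> and a presentational <span> wrapper.
const pasted: ViewNode[] = [
  {
    name: "p",
    children: [
      { name: "o:p", children: [] },
      { name: "span", children: ["Hello ", { name: "strong", children: ["Word"] }] },
    ],
  },
];

const filteredHtml = serialize(filterNodes(pasted));
console.log(filteredHtml); // <p>Hello <strong>Word</strong></p>
```

The point of the sketch is that filtering needs no Word-specific knowledge: anything without a registered representation simply disappears, which is why most of the effort goes into normalization.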

Most of the job needs to be done in the first step. I'll just quote myself here:

HTML exposed by Microsoft Word does not comply to any imaginable rules. It is a poetry of what can be done wrong.

We need to design a system for processing this input. I'm not sure about this yet, but perhaps it will extend the view layer, as the processing can be done based on this structure (or on the DOM). This may benefit other use cases in which you want to pre-process the view structure before converting it to the model.
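As an illustration of the kind of pre-processing meant here, the sketch below rebuilds list nesting from flat paragraphs. It is not the eventual implementation. It assumes that Word marks list paragraphs with an `mso-list` style declaration such as `mso-list:l0 level1 lfo1`; the `WordParagraph` and `ListItem` types and both helper functions are hypothetical, and telling ordered from unordered lists is left out.

```typescript
// Hypothetical shape of a paragraph after parsing Word's clipboard HTML.
interface WordParagraph {
  text: string;
  style: string; // raw value of the inline style attribute
}

interface ListItem {
  text: string;
  children: ListItem[];
}

// Read the nesting level from an `mso-list` declaration,
// or return null when the paragraph is not a list item.
function listLevel(style: string): number | null {
  const match = /mso-list:\s*l\d+\s+level(\d+)/.exec(style);
  return match ? parseInt(match[1], 10) : null;
}

// Rebuild nesting from flat, level-annotated paragraphs.
function buildList(paragraphs: WordParagraph[]): ListItem[] {
  const root: ListItem[] = [];
  // stack[i] is the children array that receives items of level i + 1.
  const stack: ListItem[][] = [root];
  for (const paragraph of paragraphs) {
    const level = listLevel(paragraph.style);
    if (level === null) {
      continue; // not a list paragraph; ignored in this sketch
    }
    // Clamp the level so that a skipped level (1 -> 3) nests one step only.
    const depth = Math.max(1, Math.min(level, stack.length));
    stack.splice(depth); // forget deeper levels that are now closed
    const item: ListItem = { text: paragraph.text, children: [] };
    stack[depth - 1].push(item);
    stack.push(item.children);
  }
  return root;
}

const pastedParagraphs: WordParagraph[] = [
  { text: "Fruit", style: "mso-list:l0 level1 lfo1" },
  { text: "Apple", style: "mso-list:l0 level2 lfo1" },
  { text: "Pear", style: "mso-list:l0 level2 lfo1" },
  { text: "Vegetables", style: "mso-list:l0 level1 lfo1" },
];

const nested = buildList(pastedParagraphs);
console.log(JSON.stringify(nested, null, 2));
```

Here "Apple" and "Pear" end up as children of "Fruit", while "Vegetables" becomes a second top-level item. Working on a simple intermediate structure like this, rather than on raw strings, is what makes running it as a view-layer (or DOM) pre-processing step attractive.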

pcrozer commented 6 years ago

Hi there - any news/timescale on paste from word? (tables in particular!) Would be a brilliant feature to have.

Reinmar commented 6 years ago

No, not yet. I wouldn't risk guessing now. But we'll certainly be trying to bring this iteratively, to have some MVP ASAP and then improve it over time. Just like we did with tables recently.

f1ames commented 6 years ago

I went through the most common MS Word formatting options that relate to the content and could potentially be used in the editor. We have:

Text formatting

Alignments

Lists

Tables

Other objects

Others

Of course, all these formatting options may occur at once (e.g. a nested list with some colored text inside a table).


One thing to note is that some of the above are not supported in the editor at all at the moment. We would like to work on paste from Word (PFW) iteratively to bring an MVP ASAP, as mentioned by @Reinmar:

But we'll certainly be trying to bring this iteratively, to have some MVP ASAP and then improve it over time. Just like we did with tables recently.

We will focus on supporting basic formatting first and then proceed with more complex input. To prioritise things a little, I suggest the following order for a start:

  1. bold, italics, underline, strikethrough
  2. headings
  3. links
  4. lists
  5. images

Reinmar commented 6 years ago

OK, all that we wanted to do in the current iteration is done. We track issues and feature requests in https://github.com/ckeditor/ckeditor5-paste-from-office.