Open adamerose opened 1 month ago
Hi! This is very unlikely to happen I think, and creating something like this would be a huge task. Everything that the editor parses is represented in an internal model structure. We don't have `em` or `i`; we have an attribute on a text node with `italic`. All of the features operate on this abstract structure, and the output is just a translation of it to the desired format.
> - Run the CKE normalization on both the original source and editor output
> - Perform a diff between them to identify blocks that actually changed
> - Merge those changes back into the original source

This could be one of the solutions, but would make the `getData` operation even heavier than it is today. Creating diffing and merging heuristics would also be challenging for sure.
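
To make the cost and the fragility a bit more concrete, here is a rough sketch of the normalize, diff, and merge idea. It is not CKEditor code: `toyNormalize` is a hypothetical stand-in for the editor's normalization, `splitBlocks` and `mergePreservingSource` are made-up helper names, and a "block" is assumed to be anything separated by blank lines. The matching is a simple greedy scan, not a real diff algorithm.

```typescript
// Hypothetical stand-in for the editor's normalization (not a CKEditor API).
type Normalize = (block: string) => string;

const toyNormalize: Normalize = (block) =>
  block
    .replace(/<!--[\s\S]*?-->/g, "") // comments are dropped
    .replace(/<(\/?)em>/g, "<$1i>") // <em> becomes <i>
    .replace(/^[ \t]*- /gm, "* ") // bullets and indentation are rewritten
    .trim();

// Assumption: a "block" is anything separated by one or more blank lines.
function splitBlocks(source: string): string[] {
  return source.split(/\n{2,}/).filter((b) => b.trim() !== "");
}

function mergePreservingSource(
  original: string,
  editorOutput: string,
  normalize: Normalize
): string {
  // Step 1: normalize every original block, keeping the raw text next to it.
  const originals = splitBlocks(original).map((raw) => ({
    raw,
    normalized: normalize(raw),
  }));
  const merged: string[] = [];
  let cursor = 0; // index of the first original block not yet consumed

  for (const block of splitBlocks(editorOutput)) {
    // Step 2: an output block that equals a normalized original is unchanged.
    const rest = originals.slice(cursor);
    const offset = rest.findIndex((o) => o.normalized === block);
    if (offset === -1) {
      merged.push(block); // changed or new block: take the editor's version
      continue;
    }
    // Step 3: reuse the raw source of the match, and re-emit skipped blocks
    // that normalize to nothing (e.g. comment-only blocks). Other skipped
    // blocks are treated as deleted or replaced.
    rest.slice(0, offset + 1).forEach((o, i) => {
      if (i === offset || o.normalized === "") merged.push(o.raw);
    });
    cursor += offset + 1;
  }
  // Invisible blocks after the last match are kept as well.
  originals.slice(cursor).forEach((o) => {
    if (o.normalized === "") merged.push(o.raw);
  });
  return merged.join("\n\n");
}

const original =
  "<!-- keep me -->\n\n# Title\n\n- frist item\n- second\n\nSome <em>text</em>.";
const edited = "# Title\n\n* first item\n* second\n\nSome <i>text</i>.";
console.log(mergePreservingSource(original, edited, toyNormalize));
```

In this example only the list block comes back in the editor's style, while the comment, the heading, and the `<em>` paragraph keep their original source. Note the limits: the edited block is still normalized, the original blank-line spacing is not preserved, and every original block has to be normalized on each save, which is the extra `getData` cost mentioned above.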
> - All non-renderable content is lost, including `<!-- comment -->`, `<script>`, `<meta>`, and `<style>`

Have you tried features like HTML Comments, or Full Page? I'm not sure how they would behave with the markdown output, TBH.
> - Syntax gets changed, e.g. the tag `<em>` becomes `<i>`. Markdown bullets `-` become `*`
> - All formatting and indentation gets lost

Is it the case of always outputting what was inputted, or the matter of preferences? Both the editor API and the markdown output could be configured in some way.
> Is it the case of always outputting what was inputted, or the matter of preferences?

The former. For example, when using my plugin to just fix a single typo in a `README.md`, the entire file gets modified in unrelated and destructive ways. Changing `-` to `*`, removal of comments, and autoformatting are examples of that.
> Have you tried features like HTML Comments, or Full Page? I'm not sure how would they behave with the markdown output TBH.

Markdown comments still get lost with HTML Comments, and Full Page breaks the rendering, causing the entire source to render as a single paragraph element. The `GeneralHtmlSupport` feature also seems relevant.
> This could be one of the solutions, but would make the `getData` operation even heavier than it is today. Creating diffing and merging heuristics would also be challenging for sure.

After some more thought it seems like solutions would fall into these categories:
And I'm thinking it might make more sense to try improving on the last category instead of the diffing? I had some questions about this...
Could the problem of losing source formatting be solved by checking the leading space in front of each element when parsing input to create the internal model, and then just storing it as a property on the model node, similar to how element properties like `id`, `class`, and `data-*` get saved under `htmlPAttributes` in the Model when the `GeneralHtmlSupport` plugin is enabled?
Could you similarly store details like original tag type as an attribute? So `<em>` and `<i>` would both still become `<paragraph>` with the `italic: true` attribute, but also have another attribute like `tag: "em"`.
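
To illustrate what I mean, here is a toy sketch of the idea, not CKEditor's real model or conversion API. All names here (`TextNode`, `sourceTag`, `parseInline`, `serializeInline`) are hypothetical, and the parser only understands flat `<em>` and `<i>` runs. Both tags map to the same `italic` attribute, and the original tag is kept as an extra attribute that the output step may or may not use.

```typescript
// Toy model node: NOT the CKEditor model, just an illustration of the idea.
interface TextNode {
  text: string;
  italic: boolean;
  sourceTag?: "em" | "i"; // hypothetical attribute remembering the input tag
}

// Parse a flat HTML fragment where italics may be written as <em> or <i>.
function parseInline(html: string): TextNode[] {
  const nodes: TextNode[] = [];
  const pattern = /<(em|i)>(.*?)<\/\1>|([^<]+)/g;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(html)) !== null) {
    if (m[1] !== undefined) {
      // Both tags become the same abstract attribute; the tag name is kept aside.
      nodes.push({
        text: m[2] ?? "",
        italic: true,
        sourceTag: m[1] === "em" ? "em" : "i",
      });
    } else if (m[3] !== undefined) {
      nodes.push({ text: m[3], italic: false });
    }
  }
  return nodes;
}

// Serialize back to HTML. With preserveSource=false every italic becomes <i>,
// which mimics the normalization described in this issue.
function serializeInline(nodes: TextNode[], preserveSource: boolean): string {
  return nodes
    .map((n) => {
      if (!n.italic) return n.text;
      const tag = preserveSource && n.sourceTag ? n.sourceTag : "i";
      return `<${tag}>${n.text}</${tag}>`;
    })
    .join("");
}

const input = "a <em>b</em> and <i>c</i>";
const model = parseInline(input);
console.log(serializeInline(model, false)); // normalized output
console.log(serializeInline(model, true)); // output using the remembered tags
```

Features would keep operating on `italic` only, so nothing downstream needs to know about `sourceTag`. Text typed in the editor has no `sourceTag` and falls back to the default tag.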
How would you approach getting this to work with markdown as well as HTML?
I noticed that the HTML Comments and Full Page plugins encode the contents and positions of comments and non-renderable HTML elements into the root element. I'm curious why that method was chosen instead of just adding invisible `<comment>` or `<meta>` nodes stored in the Model tree alongside other nodes like `<paragraph>`?
I'm looking for advice on getting around the fact that CKEditor5 always normalizes source data.
The docs say that is core behavior that cannot be changed (link), but this is a big problem if the user or developer cares about the source, for example when using CKE5 to edit pre-existing HTML or markdown content. My own use case is a VS Code extension for viewing and editing `.md` files. Is there any guidance for use cases like this?
The problems are...
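- All non-renderable content is lost, including `<!-- comment -->`, `<script>`, `<meta>`, and `<style>`
- Syntax gets changed, e.g. the tag `<em>` becomes `<i>`. Markdown bullets `-` become `*`
- All formatting and indentation gets lost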
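Is there any way CKEditor5 could be modified to optionally normalize only the chunks of source that were modified instead of the entire thing? If I were to solve this without rewriting the CKEditor5 internals, I think I would have to do something like...
- Run the CKE normalization on both the original source and editor output
- Perform a diff between them to identify blocks that actually changed
- Merge those changes back into the original source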
But that seems like a fragile solution, so I'm hoping for feedback.