Closed ile closed 5 years ago
This topic was briefly discussed here: https://github.com/ckeditor/ckeditor5/issues/1366. The short answer is: it probably is possible to implement those transformations using what is already available in autoformat feature but those kinds of transformations weren't on our mind when we created InlineAutoformatEditing
so I can't guarantee you that it will work flawlessly or that the integration will look good.
Ok.
It would be nice to have a ckeditor-typography package, which would do something like this:
https://github.com/pnevyk/tipograph https://github.com/sapegin/richtypo.js
I know I could perhaps create it myself, and maybe I will, but if you did it, it would probably be correctly done from the beginning...
Agree. We actually planned this package and there might've even been a ticket for it (can't find it though). The autoformatting feature is about formatting, so it's a bit different topic. Text transformations would be a separate thing.
I'm moving this ticket to the main repo and confirming.
We were unable to fit this feature in the current iteration, however we added mentions (#998) which included some APIs (most importantly text watcher) that will be helpful while implementing this feature. We're looking to implement it during next iteration.
@mlewand the text watcher will probably be moved to other place. Also it needs some work to make it more usable - right now it is focused on the Mention feature needs. This should be properly exposed in next iteration.
@mlewand Some insights about the feature:
What to ~change~ transform as we can go crazy here:
Name of the feature (thus repo name) Text transformations? Typography?
How/what to configure:
Probably we should go with a set of default, most requested options: dashes, quotes, maybe some ligatures for some common things.
Other notes:
Others implementations (also feature names):
The OpenOffice Writer has an AutoCorrect feature which does this and additional things (out of the scope of this feature):
The Google docs have it under:
Edit: MS Word have those features split to AutoCorrect and Math AutoCorrect :
References:
While implementing a feature some questions camoe out:
Feature configuration namespace option A:
editor.config.set( 'textTransformation.transformations', [
{ from: '\\.\\.\\.', to: '…' },
] );
option B:
editor.config.set( 'typing.textTransformations', [
{ from: '\\.\\.\\.', to: '…' },
] );
I'm divided between those two: option B is when we're sure that the configuration will only consist one key while option A is more future-proof.
How to extend/mix/remove default configuration.
I'm pretty sure that whatever default configuration we figure out it not gonna suite everyone :) We can go with simple replace the config as other features. So defining new config will overwrite default. We could ease developers by exposing parts of default configuration as:
2a. Stings of groups (no import needed, some docs reading required)
ClassicEditor.create( foo, {
textTransformation: {
transformations: [
'symbols',
'dashes',
'quotes', // or `quotes_en_us`, 'quotes_pl`, etc
{ from: 'CKS', to: 'CKSource' }, // custom
]
}
}
2b. (may be additional to a) Name each default transformation and allow it as in 2a, ie, trademark
, quotes_en_us_primary
, emdash
, etc.
2c. Export constants (meh I wouldn't go there - requires import)
2d. Getter / method on TextTransformation
class (would require someone to write a plugin to extend a configuration so it also look like something weird).
Add support for to:
option to be a callback? The RegExp alone is doing fine but one might like to have some special rules. (also might be follow of a base feature).
As we should be able to configure a feature by either String
or RegExp
then I think it is better to enforce appending end of string token ($
) to strings from config.
const transformations = [
{ from: '\\.\\.\\.', to: '…' },
{ from: ' -- ', to: ' – ' },
{ from: ' --- ', to: ' — ' },
// etc
];
looks better then:
const transformations = [
{ from: '\\.\\.\\.$', to: '…' },
{ from: ' -- $', to: ' – ' },
{ from: ' --- $', to: ' — ' },
// etc
];
Rationale behind this is that we want to change typed text (thus at the end of string - before the caret) and not all text in a paragraph. Hope this makes sense. Can be overridden with RegExp.
Finally - what set of default transformations we would like to define?
It might be a follow-up as the feature should not rely on set of transformations. (ps.: to be tested with large set of text watchers - might be separate ticket also for TextWatcher alone).
I like the idea of making it a part of the typing
as it fits it very well. I think that even option B can be extensible while maintaining backward compatibility, so I wouldn't be worried about that.
Also note that typing
pretty much implies text, we don't have to call it textTransformations
but we can simplify it down to text.transformations
.
Future format (if needed) might look like:
editor.config.set( 'typing.transformation', {
transformations: [
{ from: '\\.\\.\\.', to: '…' },
],
getActive: () => myAmazingCms.user.wysiwyg.textTransformationActive
} );
The only thing I don't like about the above is two subsequent transformation
in the config chain.
Alternatively I'm thinking about API like that:
editor.config.set( 'typing.transformation', {
// Override default transformations set.
include: [
'dashes',
'quotes'
],
getActive: () => myAmazingCms.user.wysiwyg.textTransformationActive
} );
editor.config.set( 'typing.transformation', {
// Use the default set +trademark but without symbols.
exclude: [
'symbols'
],
extra: [
'trademark'
],
getActive: () => myAmazingCms.user.wysiwyg.textTransformationActive
} );
WDYT?
Names for both groups and individual transformation looks very convenient. So option 2b sounds like a way to go. We'll need to pay attention to well document these.
I don't see a need to implement this at this point. If this feature will be requested we'll add it in future releases.
Not sure about this one.
Can be overridden with RegExp.
Could you elaborate more on that?
I understand this as if someone wants to match stuff beyond the caret he or she can skip the $
using raw regexp. E.g.
const transformations = [
{ from: /\.\.\./, to: '…' }, // match beyond caret
{ from: ' -- $', to: ' – ' }
];
I'll address this later.
@mlewand:
Ad 4: Instead of pattern you can provide RegExp instance. But matching text inside paragraph will not work as we assume that change is done at the caret so the $
is somewhat obligatory.
The stuff with RegExp would work but only if we allow custom callback for to
option.
{ from /abc/g, ( wholeTextBeforeCaret ) => return doSomethingWith( wholeTextBeforeCaret ) }
but instead of /abc/g
there would be a more complex regex.
Ad1. The include/exclude/extra
is also an option to go. The only thing I don't get is the getActive()
callback.
I'm still not sure about that, I don't feel comfortable with doing magic stuff to provided regexp, in a sense that dev provides:
{ from /abc/g, ( wholeTextBeforeCaret ) => return doSomethingWith( wholeTextBeforeCaret ) }
And we convert it into:
{ from /abc$/g, ( wholeTextBeforeCaret ) => return doSomethingWith( wholeTextBeforeCaret ) }
As long as our documentation shows this in listing I think we're good with requiring tailing $
character in regexps.
For sure we can't input regular expressions as a string, because it will prevent us from adding simple string (literal) transformations, like the one you provided:
{ from: 'CKS', to: 'CKSource' }
regarding 1. Yea, I added this getActive()
option just as a dummy example how can it evolve later on 🙂
- I'm still not sure about that, I don't feel comfortable with doing magic stuff to provided regexp, in a sense that dev provides:
@mlewand no I don't wan't to convert RegExp objects - only the Strings which are regex patterns.
If someone would like to define:
{ from: 'CKS', to: 'CKSource' }
I'm convinced that we need to create a RegExp like this:
new RegExp( 'CKS$' );
so the text watcher (I mean the transformation) will trasnform:
Some CKS here and CKS there is CKS[]
to
Some CKS here and CKS there is CKSource[]
and not to:
Some CKSource here and CKSource there is CKSource[]
This makes sense because one can undo text transformation and thus do not need to worry of the erlier undone transformation to be transformed again.
So after F2F talk with @mlewand it indeed make no sense to mess with config as I described. The reason behind this was to simplify configuration but at the same time simple strings transformation such as cks -> CKSource
& (tm) -> ™
should be done on callbakcks internally and should be simple string.endsWith()
not full-blown RegExps.
We should only mention this in the docs.
Could you sum up, guys, what's the plan for this feature? How will the configuration look like? And how it will be triggered?
Sure. So if you'd like to fiddle with current state of this there's a PR which implements (use ckeditor5
branch for mgit co
as this covers multiple repos):
transformations are configured as:
typing: {
transformation: {
include: [
// configured by feature - set of defaults.
],
exclude: [
// remove transformations from defaults (see below to decide/implement)
]
extra: [
// user defined additional transformations
{ from: 'CKE', to: 'CKEditor' }, // string
{ from: /([a-z]+)@(example.com)$/, to: '$1.at.$2' } // regexp
]
}
}
Still to decide/implement:
exclude
and include
)For named configurations as see it as group name (ie. 'math'
or 'quotes'
) and individual (like 'quotes_primary_en'
or 'trademark'
.
For group stubs checkout currently implemented (very limited just to showcase groups):
const DEFAULT_TRANSFORMATIONS = [
// Common symbols:
{ from: '(c)', to: '©' },
{ from: '(r)', to: '®' },
{ from: '(tm)', to: '™' },
// Mathematical:
{ from: '1/2', to: '½' },
{ from: '<=', to: '≤' },
// Typography:
{ from: '...', to: '…' },
{ from: ' -- ', to: ' – ' },
{ from: ' --- ', to: ' — ' },
// Quotations:
// English primary.
{ from: buildQuotesRegExp( '"' ), to: '$1“$2”' },
// English secondary.
{ from: buildQuotesRegExp( '\'' ), to: '$1‘$2’' }
];
As you can see there are some rules:
' '
in some transformations that should happen only after user type space (mostly dashes).$
character to denote caret placement (to be described in the docs).@Reinmar and a showcase:
This feature recently was merged to the master
branch 🎉
For the usage API see the docs (or in case they're not deployed yet: the discussion above 🙂).
Sample screencast:
Would this be the correct plugin to handle things like:
Or
Or
If this would be the correct plugin, let's treat this issue as a feature request? Thank you.