Open GamerGirlandCo opened 1 year ago
I would also love a rule like this. Is there a way to at least use the regex feature to add a rule that finds and replaces the lowercase letters? The code here seems to do that, but I'm not sure how it could be implemented with the regex replace feature.
It is not possible to use regex find and replace to capitalize the first letter in a sentence without some kind of other piece of logic like in the example code provided. JS does not support uppercasing a character via regex. So if this were to become a feature it would need to be a rule.
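To illustrate the point about needing extra logic, here is a minimal sketch in plain JavaScript. The function name `capitalizeAfterPunctuation` and the simplified pattern are made up for this example and are not the plugin's code. A replacement string has no token for uppercasing, so the uppercasing has to happen in a callback:

```javascript
// Hypothetical helper, for illustration only (not part of the linter).
// The regex finds the position; the callback does the actual uppercasing.
function capitalizeAfterPunctuation(text) {
  return text.replace(
    /(^|[.!?]\s+)([a-z])/g,
    (_match, lead, letter) => lead + letter.toUpperCase()
  );
}

// A plain replacement string cannot change case: "\U" is not a special
// token in JavaScript, so it is copied into the result literally.
const literal = "a".replace(/(a)/, "\\U$1");

console.log(capitalizeAfterPunctuation("one. two! three? four"));
console.log(literal);
```

The callback form is why this would likely have to be a dedicated rule rather than a user-supplied find/replace pair.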
The amount of capture groups present makes me wonder if the regex is performant, but then again I have seen a lot worse looking regex that is decently performant.
My understanding is that the regex means the following, but I am by no means an expert:
The first capture group is made up of the following
It matches a period followed by 0 to 3 asterisks, a whitespace character, and then 1 to 3 asterisks: \.\*{0,3}\s\*{1,3}
Or a double quote or em dash (?) followed by 0 to 3 asterisks [“—]\*{0,3}
Or a double quote or em dash (?) or exclamation mark followed by a whitespace character: [“—!]\s
Or a period followed by 0 to 3 asterisks followed by 0 or 1 whitespace characters: \.\*{0,3}\s?
Or the start of the line followed by 1 to 3 asterisks: ^\*{1,3}
Or the start of the line has 0 to 3 asterisks followed by anywhere from 0 to 2 of either opening parentheses or double quotes followed by 0 to 3 asterisks: ^\*{0,3}[(“]{0,2}\*{0,3}
Or a closing parenthesis followed by a whitespace character: \)\s
Or the start of a line: ^
Group 2 is any uncapitalized character: [a-z]
Group 3 is 0 or 1 number, letter, or underscore
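As a rough sketch of how the pieces listed above fit together, the fragments can be joined into one alternation and applied with a callback. This is assembled only from the fragments quoted in this comment, so it is an approximation and not the original author's exact regex. Group 3 is left out for simplicity, and the names `PREFIX`, `sentenceStart`, and `capitalizeSentences` are invented for the example.

```javascript
// Assumption: alternatives are tried in the order they are listed above.
const PREFIX = [
  String.raw`\.\*{0,3}\s\*{1,3}`,        // period, optional asterisks, space, asterisks
  String.raw`[“—]\*{0,3}`,               // opening quote or em dash, optional asterisks
  String.raw`[“—!]\s`,                   // quote, em dash, or "!" then whitespace
  String.raw`\.\*{0,3}\s?`,              // period, optional asterisks, optional space
  String.raw`^\*{1,3}`,                  // line start followed by asterisks
  String.raw`^\*{0,3}[(“]{0,2}\*{0,3}`,  // line start, optional asterisks/parens/quotes
  String.raw`\)\s`,                      // closing parenthesis then whitespace
  String.raw`^`,                         // bare line start
].join("|");

// "m" makes ^ match at the start of every line; "g" replaces every match.
const sentenceStart = new RegExp(`(${PREFIX})([a-z])`, "gm");

function capitalizeSentences(text) {
  return text.replace(sentenceStart, (_m, prefix, letter) => prefix + letter.toUpperCase());
}

console.log(capitalizeSentences("hello world. this is **bold**. **next** one"));
console.log(capitalizeSentences("see e.g. this"));
```

The second call shows the kind of false positive that makes this error prone: because the whitespace after the period is optional, "e.g." comes out as "e.G.", and a name like "file.txt" would be affected the same way.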
I am assuming this regex was generated for a very particular scenario as it seems to leave off several different kinds of punctuation. It looks like it would be a good start for a rule like this, but I find it hard to see how this would work with the markdown syntax as it is.
Based on how error-prone capitalizing a sentence can be, I am a little hesitant to try doing so in this case. However, if I can better understand how we would avoid the problems that arise from trying to determine sentence capitalization with markdown syntax in the mix, this could be something we move forward with.
Hmm I didn't realize how complicated sentence capitalization was. But honestly you don't have to start with an encapsulating rule that covers everything, conservative progressive development would be good enough IMO. I'll try to look if there's any open implementation of this on the web :D
Group 3 is 0 or 1 number, letter, or underscore
actually, the \W in group 3 means "any non-word character" (i.e. anything besides letters, digits, and underscores _).
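For anyone who wants to check what \W covers in JavaScript, here is a quick sketch. The sample characters are arbitrary and chosen only for illustration:

```javascript
// \W is the complement of \w, and \w is [A-Za-z0-9_] in JavaScript.
const samples = ["a", "Z", "7", "_", "-", " ", "."];

// Keep only the characters that \W matches.
const nonWord = samples.filter((ch) => /\W/.test(ch));

console.log(nonWord);
```

Letters, digits, and the underscore are filtered out, while the dash, the space, and the period remain, so a dash counts as a non-word character.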
I am assuming this regex was generated for a very particular scenario as it seems to leave off several different kinds of punctuation.
i have no problems adding more punctuation, if that's what you'd like! for example, i noticed now that question marks aren't part of the detected punctuation.
It looks like it would be a good start for a rule like this, but I find it hard to see how this would work with the markdown syntax as it is.
that's what the asterisk rules (\*{0,3} et al) are for -- to make it work with markdown syntax.
Is Your Feature Request Related to a Problem? Please Describe.
i want a simpler way to convert a file that's in all lowercase to sentence case. this is because my writing style is very stream-of-consciousness oriented, and as such, i don't want to get bogged down with semantic things like capitalization in the moment.
Describe the Solution You'd Like
A linter step that capitalizes the first letter of every sentence.
Please include an example where applicable:
Describe Alternatives You've Considered
as of now, i use the obsidian regex pipeline plugin with rules generated by the following javascript code:
this outputs:
... and so on for every letter of the alphabet.
Additional Context
i did the work of putting together a snippet of code that does exactly what i want by putting the regexes from the above code into a capture group separated by |, with some slight additions and tweaks: