platers / obsidian-linter

An Obsidian plugin that formats and styles your notes with a focus on configurability and extensibility.
https://platers.github.io/obsidian-linter/
MIT License

Bug: Capitalize Heading "First Letter" Does not Work on Words Starting with Double Quote #619

Open haint126 opened 1 year ago

haint126 commented 1 year ago

Describe the Bug

Setting: Enable Capitalize heading, style "First letter"

Bug: If the first letter is preceded by a double quote ("), Linter capitalizes the second word instead.

How to Reproduce

Setting: Enable Capitalize heading, style "First letter"

Example to reproduce the issue:

Before:

# 1. "when there is a bug"

After:

# 1. "when There is a bug"

Linter config

```json
{
  "ruleConfigs": {
    "Escape YAML Special Characters": { "Escapes colons with a space after them (: ), single quotes ('), and double quotes (\") in YAML.": false, "Try to Escape Single Line Arrays": false },
    "Format Tags in YAML": { "Remove Hashtags from tags in the YAML frontmatter, as they make the tags there invalid.": false },
    "Format Yaml Array": { "Allows for the formatting of regular yaml arrays as either multi-line or single-line and `tags` and `aliases` are allowed to have some Obsidian specific yaml formats. Note that single string to single-line goes from a single string entry to a single-line array if more than 1 entry is present. The same is true for single string to multi-line except it becomes a multi-line array.": false, "Format yaml aliases section": true, "Format yaml tags section": true, "Default yaml array section style": "single-line", "Format yaml array sections": true, "Force key values to be single-line arrays": "", "Force key values to be multi-line arrays": "" },
    "Insert YAML attributes": { "Inserts the given YAML attributes into the YAML frontmatter. Put each attribute on a single line.": false, "Text to insert": "aliases: \ntags: " },
    "Move Tags to Yaml": { "Move all tags to Yaml frontmatter of the document.": false, "Tags to ignore": "", "Body tag operation": "Nothing" },
    "Remove YAML Keys": { "Removes the YAML keys specified": false, "YAML Keys to Remove": "" },
    "YAML Key Sort": { "Sorts the YAML keys based on the order and priority specified. Note: may remove blank lines as well.": false, "YAML Key Priority Sort Order": "", "Priority Keys at Start of YAML": true, "YAML Sort Order for Other Keys": "None" },
    "YAML Timestamp": { "Keep track of the date the file was last edited in the YAML front matter. Gets dates from file metadata.": false, "Date Created": true, "Date Created Key": "created", "Date Modified": true, "Date Modified Key": "modified", "Format": "YYYY/MM/DD HH:mm:ss" },
    "YAML Title": { "Inserts the title of the file into the YAML frontmatter. Gets the title from the first H1 or filename if there is no H1.": false, "Title Key": "title" },
    "YAML Title Alias": { "Inserts the title of the file into the YAML frontmatter's aliases section. Gets the title from the first H1 or filename.": false, "Preserve existing aliases section style": true, "Keep alias that matches the filename": false, "Use the YAML key `linter-yaml-title-alias` to help with filename and heading changes": true },
    "Capitalize Headings": { "Headings should be formatted with capitalization": true, "Style": "First letter", "Ignore Cased Words": true, "Ignore Words": "macOS, iOS, iPhone, iPad, JavaScript, TypeScript, AppleScript", "Lowercase Words": "via, a, an, the, and, or, but, for, nor, so, yet, at, by, in, of, on, to, up, as, is, if, it, for, to, with, without, into, onto, per" },
    "File Name Heading": { "Inserts the file name as a H1 heading if no H1 heading exists.": false },
    "Header Increment": { "Heading levels should only increment by one level at a time": true },
    "Footnote after Punctuation": { "Ensures that footnote references are placed after punctuation, not before.": false },
    "Move Footnotes to the bottom": { "Move all footnotes to the bottom of the document.": true },
    "Re-Index Footnotes": { "Re-indexes footnote keys and footnote, based on the order of occurrence (NOTE: This rule deliberately does *not* preserve the relation between key and footnote, to be able to re-index duplicate keys.)": true },
    "Convert Bullet List Markers": { "Converts common bullet list marker symbols to markdown list markers.": true },
    "Emphasis Style": { "Makes sure the emphasis style is consistent.": true, "Style": "consistent" },
    "No Bare URLs": { "Encloses bare URLs with angle brackets except when enclosed in back ticks, square braces, or single or double quotes.": true },
    "Ordered List Style": { "Makes sure that ordered lists follow the style specified. Note that 2 spaces or 1 tab is considered to be an indentation level.": true, "Number Style": "ascending", "Ordered List Indicator End Style": "." },
    "Proper Ellipsis": { "Replaces three consecutive dots with an ellipsis.": false },
    "Remove Consecutive List Markers": { "Removes consecutive list markers. Useful when copy-pasting list items.": true },
    "Remove Empty List Markers": { "Removes empty list markers, i.e. list items without content.": true },
    "Remove Hyphenated Line Breaks": { "Removes hyphenated line breaks. Useful when pasting text from textbooks.": true },
    "Remove Multiple Spaces": { "Removes two or more consecutive spaces. Ignores spaces at the beginning and ending of the line. ": true },
    "Strong Style": { "Makes sure the strong style is consistent.": true, "Style": "asterisk" },
    "Two Spaces Between Lines with Content": { "Makes sure that two spaces are added to the ends of lines with content continued on the next line for paragraphs, blockquotes, and list items": false },
    "Unordered List Style": { "Makes sure that unordered lists follow the style specified.": true, "List item style": "-" },
    "Compact YAML": { "Removes leading and trailing blank lines in the YAML front matter.": true, "Inner New Lines": true },
    "Consecutive blank lines": { "There should be at most one consecutive blank line.": true },
    "Convert Spaces to Tabs": { "Converts leading spaces to tabs.": true, "Tabsize": "4" },
    "Empty Line Around Blockquotes": { "Ensures that there is an empty line around blockquotes unless they start or end a document. **Note that an empty line is either one less level of nesting for blockquotes or a newline character.**": true },
    "Empty Line Around Code Fences": { "Ensures that there is an empty line around code fences unless they start or end a document.": true },
    "Empty Line Around Tables": { "Ensures that there is an empty line around github flavored tables unless they start or end a document.": true },
    "Heading blank lines": { "All headings have a blank line both before and after (except where the heading is at the beginning or end of the document).": true, "Bottom": false, "Empty Line Between Yaml and Header": false },
    "Line Break at Document End": { "Ensures that there is exactly one line break at the end of a document.": true },
    "Paragraph blank lines": { "All paragraphs should have exactly one blank line both before and after.": false },
    "Remove Empty Lines Between List Markers and Checklists": { "There should not be any empty lines between list markers and checklists.": true },
    "Remove link spacing": { "Removes spacing around link text.": true },
    "Remove Space around Fullwidth Characters": { "Ensures that fullwidth characters are not followed by whitespace (either single spaces or a tab). Note that this may causes issues with markdown format in some cases.": false },
    "Space after list markers": { "There should be a single space after list markers and checkboxes.": true },
    "Trailing spaces": { "Removes extra spaces after every line.": true, "Two Space Linebreak": false },
    "Space between Chinese Japanese or Korean and English or numbers": { "Ensures that Chinese, Japanese, or Korean and English or numbers are separated by a single space. Follows these [guidelines](https://github.com/sparanoid/chinese-copywriting-guidelines)": true },
    "Force YAML Escape": { "Escapes the values for the specified YAML keys.": false, "Force YAML Escape on Keys": "" },
    "Headings Start Line": { "Headings that do not start a line will have their preceding whitespace removed to make sure they get recognized as headers.": false },
    "Remove Trailing Punctuation in Heading": { "Removes the specified punctuation from the end of headings making sure to ignore the semicolon at the end of [HTML entity references](https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references).": false, "Trailing Punctuation": ".,;:!。,;:!" },
    "Empty Line Around Math Blocks": { "Ensures that there is an empty line around math blocks using `Number of Dollar Signs to Indicate a Math Block` to determine how many dollar signs indicates a math block for single-line math.": true },
    "Move Math Block Indicators to Their Own Line": { "Move all starting and ending math block indicators to their own lines using `Number of Dollar Signs to Indicate a Math Block` to determine how many dollar signs indicates a math block for single-line math.": false },
    "Remove Space around Characters": { "Ensures that certain characters are not surrounded by whitespace (either single spaces or a tab). Note that this may causes issues with markdown format in some cases.": false, "Include Fullwidth Forms": true, "Include CJK Symbols and Punctuation": true, "Include Dashes": true, "Other symbols": "" },
    "Add Blockquote Indentation on Paste": { "Adds blockquotes to all but the first line, when the cursor is in a blockquote/callout line during pasting": false },
    "Prevent Double Checklist Indicator on Paste": { "Removes starting checklist indicator from the text to paste if the line the cursor is on in the file has a checklist indicator": false },
    "Prevent Double List Item Indicator on Paste": { "Removes starting list indicator from the text to paste if the line the cursor is on in the file has a list indicator": false },
    "Proper Ellipsis on Paste": { "Replaces three consecutive dots with an ellipsis even if they have a space between them in the text to paste": false },
    "Remove Hyphens on Paste": { "Removes hyphens from the text to paste": true },
    "Remove Leading or Trailing Whitespace on Paste": { "Removes any leading non-tab whitespace and all trailing whitespace for the text to paste": false },
    "Remove Leftover Footnotes from Quote on Paste": { "Removes any leftover footnote references for the text to paste": true },
    "Remove Multiple Blank Lines on Paste": { "Condenses multiple blank lines down into one blank line for the text to paste": true },
    "Auto-correct Common Misspellings": { "Uses a dictionary of common misspellings to automatically convert them to their proper spellings. See [auto-correct map](https://github.com/platers/obsidian-linter/tree/master/src/utils/auto-correct-misspellings.ts) for the full list of auto-corrected words.": false, "Ignore Words": "" }
  },
  "lintOnSave": true,
  "recordLintOnSaveLogs": true,
  "displayChanged": true,
  "foldersToIgnore": [],
  "linterLocale": "system-default",
  "logLevel": 1,
  "lintCommands": [],
  "customRegexes": [],
  "commonStyles": { "aliasArrayStyle": "single-line", "tagArrayStyle": "single-line", "minimumNumberOfDollarSignsToBeAMathBlock": 2, "escapeCharacter": "\"" }
}
```

Linter Logs

```text
Running linter
Running Capitalize Headings
Running Header Increment
Running Move Footnotes to the bottom
Running Re-Index Footnotes
Running Convert Bullet List Markers
Running Emphasis Style
Running No Bare URLs
Running Ordered List Style
Running Remove Consecutive List Markers
Running Remove Empty List Markers
Running Remove Hyphenated Line Breaks
Running Remove Multiple Spaces
Running Strong Style
Running Unordered List Style
Running Compact YAML
Running Consecutive blank lines
Running Convert Spaces to Tabs
Running Empty Line Around Blockquotes
Running Empty Line Around Code Fences
Running Empty Line Around Math Blocks
Running Empty Line Around Tables
Running Heading blank lines
Running Line Break at Document End
Running Remove Empty Lines Between List Markers and Checklists
Running Remove link spacing
Running Space after list markers
Running Space between Chinese Japanese or Korean and English or numbers
Running Trailing spaces
Running Custom Regex
Running Custom Lint Commands
```

Expected Behavior

If the first letter is preceded by a double quote ("), Linter should still capitalize it.

Expected output if applicable:

Before:

# 1. "when there is a bug"

After:

# 1. "When there is a bug"

Device

It happens on a Windows laptop.

Thanks

pjkaufman commented 1 year ago

I am not sure I would consider this a bug since, technically speaking, "When is not a word, so the logic is correctly ignoring it in the word list. I can probably add an exception for single and double quotes if we really want to support this.
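
For readers following along, the reported behavior can be reproduced with a small sketch. This is a hypothetical reconstruction, not the plugin's actual source: it assumes a token only counts as a word when it consists entirely of letters, ASCII apostrophes, and hyphens. The names `WORD` and `capitalizeFirstWordNaive` are invented for illustration.

```typescript
// Hypothetical reconstruction for illustration only; NOT the plugin's source.
// Assumption: a token is a "word" only if it is made up entirely of
// letters, ASCII apostrophes, and hyphens.
const WORD = /^[\p{L}'-]+$/u;

function capitalizeFirstWordNaive(heading: string): string {
  const match = /^(#+\s+)(.*)$/.exec(heading);
  if (!match) return heading; // not an ATX heading, leave untouched
  const [, marker, text] = match;
  const tokens = text.split(" ");
  // Capitalize the first token that passes the strict word test.
  const index = tokens.findIndex((token) => WORD.test(token));
  if (index !== -1) {
    const token = tokens[index];
    tokens[index] = token.charAt(0).toUpperCase() + token.slice(1);
  }
  return marker + tokens.join(" ");
}
```

Under that assumption, `1.` and `"when` both fail the word test, so `there` is the first token treated as a word. A token containing a typographic apostrophe, such as `it’s`, is skipped for the same reason.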

haint126 commented 1 year ago

I would really appreciate it if you could add exceptions for this setting. I suggest creating a list of symbols to ignore, in case someone wants to add other symbols such as (, {, … Thank you very much.

redactedscribe commented 1 year ago

A definable list of characters to ignore before checking for the first word might be the needed approach. You can hardcode an exception for the standard ' and ", for example, but what about other symbols such as the typographical opening single- and double-quote equivalents, ‘ and “, or leading punctuation used by other languages? I'd say the four symbols I've mentioned should be the minimum.
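
A definable ignore list could look roughly like the sketch below. It is only an illustration of the suggestion, not an existing Linter setting: `DEFAULT_IGNORED_PREFIX_CHARS`, `capitalizeFirstLetter`, and the `ignoredPrefixChars` parameter are hypothetical names, and the defaults are the four symbols mentioned above.

```typescript
// Hypothetical setting: characters to skip at the start of a word before
// looking for its first letter. Not an existing Linter option.
const DEFAULT_IGNORED_PREFIX_CHARS = ["'", '"', "‘", "“"];

function capitalizeFirstLetter(
  heading: string,
  ignoredPrefixChars: string[] = DEFAULT_IGNORED_PREFIX_CHARS,
): string {
  const match = /^(#+\s+)(.*)$/.exec(heading);
  if (!match) return heading; // not an ATX heading, leave untouched
  const [, marker, text] = match;
  const ignored = new Set(ignoredPrefixChars);
  const tokens = text.split(" ");
  for (let t = 0; t < tokens.length; t++) {
    const chars = Array.from(tokens[t]); // split by code point
    let i = 0;
    // Skip any leading characters that are on the ignore list.
    while (i < chars.length && ignored.has(chars[i])) i++;
    // The token counts as the first word if a letter follows the ignored prefix.
    if (i < chars.length && /^\p{L}$/u.test(chars[i])) {
      chars[i] = chars[i].toUpperCase();
      tokens[t] = chars.join("");
      break; // only the first word is capitalized
    }
  }
  return marker + tokens.join(" ");
}
```

With the defaults, `# 1. "when there is a bug"` becomes `# 1. "When there is a bug"`. Passing a custom list such as `["("]` would extend the same treatment to other leading symbols.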

sevmonster commented 4 months ago

This is still an issue. This also affects fancy apostrophes in words. Example:

# it’s bad this happens

becomes

# it’s Bad this happens

And words that are already cased properly give this funky output:

# It’s Bad this happens

There should also be an option to ignore inline code blocks, so that lines like this don't get changed:

# `code` doesn’t work

Instead, this happens:

# `code` doesn’t Work

The same is true for links, as:

# [hello](world) foo

becomes

# [hello](world) Foo

If I were a maintainer, I would have a completely separate code path for first-letter detection instead of trying to combine it with the Title Case code as it is now. I would use a regex like this to detect the first letter (2nd capture group) and uppercase it:

/^([ \t]*#+[ \t]+(?:\[[ \t]*)?['‘’‚'"“”„"〝<<‹«»›>(([[{{「「『〈⟨《⟪〔【.…⋯᠁-–—~-ーー~¡¿§¶ \t]*)(\p{L})/

Seems a bit overkill in hindsight... Might be a better way to do this that doesn't involve trying to outsmart the writer. I wanted to account for all typographic marks one might use for quotation, emphasis, question, section demarcation, etc., without being greedy and capitalizing where one shouldn't. Potential unlikely edge cases not handled here are escaped tags (\#tag) and using the hash as a typographical mark, as I did not want to erroneously match #(t)ag as first letter. Might be more.
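
A much shorter variant of the same idea is sketched below, under stated assumptions. Instead of an explicit list of marks, it uses the Unicode general categories `Pi` (initial quote punctuation) and `Ps` (open punctuation) plus the two ASCII quotes, which is a substitution and not the regex proposed above. The names `FIRST_LETTER` and `capitalizeFirstLetterOnly` are hypothetical. Note that `\p{L}` only works as a property escape when the regex carries the `u` flag.

```typescript
// Simplified sketch, not the regex from the comment above.
// Group 1: heading marker plus any leading ASCII quotes, opening quotes (Pi),
//          opening brackets (Ps), spaces, or tabs.
// Group 2: the first letter, which gets uppercased.
// The `u` flag is required for \p{...} property escapes.
const FIRST_LETTER = /^([ \t]*#+[ \t]+['"\p{Pi}\p{Ps} \t]*)(\p{L})/u;

function capitalizeFirstLetterOnly(heading: string): string {
  return heading.replace(
    FIRST_LETTER,
    (_match: string, prefix: string, letter: string) =>
      prefix + letter.toUpperCase(),
  );
}
```

Because backticks, digits, and `#` are not in the prefix class, headings that begin with inline code, a number, or a tag-like `#word` simply do not match and are left unchanged. A heading that starts with a link still has its link text capitalized, since `[` is open punctuation.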

pjkaufman commented 3 months ago

@sevmonster, could you explain a bit more about what you mean by the following?

> There should also be an option to ignore inline code blocks, so that lines like this don't get changed:
>
> # `code` doesn’t work
>
> Instead, this happens:
>
> # `code` doesn’t Work

I am not sure if you are expecting the headers that start with a code block to be skipped or if the intent is to capitalize the first word after the code block.

sevmonster commented 3 months ago

The intent is for anything starting with a code block to be skipped, as the code block is (or would likely be) a variable or other word that is semantically the start of the sentence, so no further capitalization is necessary.

For a live example, I have a header like this:

# `root` vs. `alias`

In this instance, root is an NGINX directive and is intended to be part of the sentence. But neither it (as a piece of case-sensitive code) nor vs. (as a word in the sentence that hasn't started it) needs to be capitalized.
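
The requested skip could be expressed as a guard that runs before any capitalization. The sketch below is an assumption about how such an option might behave, not existing plugin code. `startsWithInlineCode` and `capitalizeUnlessCodeFirst` are hypothetical names, and the capitalization step is deliberately minimal.

```typescript
// Hypothetical guard: true when the heading text begins with an inline code span.
function startsWithInlineCode(heading: string): boolean {
  return /^[ \t]*#+[ \t]+`/.test(heading);
}

function capitalizeUnlessCodeFirst(heading: string): string {
  // The code span is treated as the start of the sentence, so nothing changes.
  if (startsWithInlineCode(heading)) return heading;
  // Minimal stand-in for the real rule: uppercase a letter that directly
  // follows the heading marker.
  return heading.replace(
    /^([ \t]*#+[ \t]+)(\p{L})/u,
    (_match: string, marker: string, letter: string) =>
      marker + letter.toUpperCase(),
  );
}
```

With this guard, a heading that opens with `root` in backticks is returned as-is, while `# hello world` still becomes `# Hello world`.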