textlint-rule / textlint-rule-sentence-length

textlint rule that limits the maximum length of a sentence.
MIT License

Feature request: Exclude URLs even in plaintext #15

Closed hata6502 closed 3 years ago

hata6502 commented 3 years ago

I'd appreciate it if the rule excluded URLs even in plaintext. :pray:

Input: test.txt (plaintext)

https://example.com/longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglong

Current behavior:

  1:1  error  Line 1 sentence length(104) exceeds the maximum sentence length of 100.
Over 4 characters  sentence-length

Requested behavior: URLs are excluded even in plaintext, and no error is reported.
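
As a quick check of the numbers in the report, the following sketch reproduces the length arithmetic. The helper `overBy` is hypothetical and is not the rule's actual code; it only assumes that length is counted in characters against the limit of 100 shown in the error message.

```javascript
// The URL from the report: a 20-character prefix plus "long" repeated 21 times.
const longUrl = "https://example.com/" + "long".repeat(21);

// Hypothetical helper (not the rule's real code): how far a sentence overshoots the limit.
function overBy(sentence, max = 100) {
  return Math.max(0, sentence.length - max);
}

console.log(longUrl.length); // 104
console.log(overBy(longUrl)); // 4
```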

azu commented 3 years ago

In plain text, textlint cannot detect a URL-like string as a Link node. This task belongs in the parser/textlint plugin.

https://github.com/textlint-rule/textlint-rule-sentence-length#options Can you resolve this issue with the exclusionPatterns option?

If you mean adding an "except links" behavior, we would need to add a new option such as skipLink.

hata6502 commented 3 years ago

@azu Thank you! I resolved it with exclusionPatterns.

{
  ruleId: 'sentence-length',
  rule: textlintRuleSentenceLength,
  options: {
    // CC BY-SA 4.0
    exclusionPatterns: ['/https?:\\/\\/(www\\.)?[-a-zA-Z0-9@:%._\\+~#=]{1,256}\\.[a-zA-Z0-9()]{1,6}\\b([-a-zA-Z0-9()@:%_\\+.~#?&//=]*)/'],
  },
}

Regex from Stack Overflow. Asked by bigbob, answered by Daveo.
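
To illustrate what an exclusion pattern achieves, here is a minimal sketch that removes every match before measuring the sentence. `lengthWithoutExcluded` and the sample sentence are made up for illustration; the rule's real handling of exclusionPatterns may differ in detail.

```javascript
// Same pattern as in the config above, passed to RegExp without the surrounding "/.../".
const urlPattern = new RegExp(
  "https?:\\/\\/(www\\.)?[-a-zA-Z0-9@:%._\\+~#=]{1,256}\\.[a-zA-Z0-9()]{1,6}\\b([-a-zA-Z0-9()@:%_\\+.~#?&//=]*)",
  "g"
);

// Hypothetical helper: measure a sentence after dropping every excluded match.
function lengthWithoutExcluded(sentence, patterns) {
  const stripped = patterns.reduce(
    (text, pattern) => text.replace(pattern, ""),
    sentence
  );
  return stripped.length;
}

const sampleUrl = "https://example.com/" + "long".repeat(21);
const sentence = `See ${sampleUrl} for details.`;

console.log(sentence.length); // 121
console.log(lengthWithoutExcluded(sentence, [urlPattern])); // 17
```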


But this regex is a little complex, and excluding URLs is a common use case. Would you implement this feature with url-regex-safe by default? I think skipLink is not needed, because this feature can be enabled by default.

azu commented 3 years ago

It depends on the case.

I want to count the following case.

This is [example](https://example.com/longlonglonglonglonglonglonglong).

I agree with skipping the following case. However, some users want to count it, so it should be an option. (It is actually a long text, so it is reasonable to want to detect it.)

This is <https://example.com/longlonglonglonglonglonglonglonglonglonglong>.

Logics:

If the above logic is implemented, .txt files will still show the same error, because @textlint/textlint-plugin-text does not parse a URL-like string as a Link node.

I disagree with the rule using a URL regexp, because I do not want to implement the same URL regexp in various rules. This task should be implemented in the plugin/parser.
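
The distinction between the two Markdown cases, and why plain text is unaffected, can be sketched with simplified node objects. The shapes below are stand-ins that assume a Link node carries a `url` and Str children; `isUrlStringLink` is a hypothetical predicate, not code from the rule or the parser.

```javascript
// Simplified stand-ins for parsed nodes; real nodes carry more fields (range, loc, raw).
const target = "https://example.com/long";

// [example](https://example.com/long): the visible text differs from the URL.
const labeledLink = { type: "Link", url: target, children: [{ type: "Str", value: "example" }] };

// <https://example.com/long>: the visible text is the URL itself.
const urlStringLink = { type: "Link", url: target, children: [{ type: "Str", value: target }] };

// The same URL in a .txt file: the text plugin yields only a Str node, no Link.
const plainTextUrl = { type: "Str", value: target };

// Hypothetical predicate: true only for a Link whose visible text is its own URL.
function isUrlStringLink(node) {
  if (node.type !== "Link") return false;
  const text = node.children.map((child) => child.value || "").join("");
  return text === node.url;
}

console.log(isUrlStringLink(labeledLink)); // false, so it is still counted
console.log(isUrlStringLink(urlStringLink)); // true, so it can be skipped
console.log(isUrlStringLink(plainTextUrl)); // false, so .txt input is unchanged
```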

azu commented 3 years ago

Implemented https://github.com/textlint-rule/textlint-rule-sentence-length/issues/15#issuecomment-822510521 as skipUrlStringLink. skipUrlStringLink is true by default in https://github.com/textlint-rule/textlint-rule-sentence-length/releases/tag/v3.0.0