lukeleppan / better-word-count

Counts the words of selected text in the editor.
MIT License
257 stars, 40 forks

Feature Request: Counting Citation without Brackets #81

Open Liong1976 opened 1 year ago

Liong1976 commented 1 year ago

First, thanks to @lukeleppan and @chrisgrieser for the latest update, which allows this plugin to count footnotes and citations.

As I mentioned in the Discord, I use citations without brackets, @JOETheEffect1994, instead of [@JOETheEffect1994]. The plugin only shows that I have 1 citation, even though I have many more than one.


The format of the only citation that was counted was like this:

^[@JOETheEffect [p. 314]]

Mostly, I put citations in inline footnotes that look like this:

^[Plura mihi bona sunt, inclinet, amari petere vellent. Ab illo tempore, ab est sed immemorabili. Ullamco laboris nisi ut aliquid ex ea commodi consequat. See @JOETheEffect [p. 314].]

I wonder if this plugin could also count citations without brackets.

I appreciate your help.

lukeleppan commented 1 year ago

All praise should go to @chrisgrieser.

So I have never used footnotes or citations in markdown but I have looked at the possible formats that @chrisgrieser implemented.

Firstly the footnote formats (<> is not part of it):

Secondly the citations formats (<> is not part of it):

So citations don't have to be in brackets. Also, are you sure about those square brackets around the page number? Obsidian seems to treat them as a link, at least in Live Preview. (p. 214) would probably be better, but it works either way.

It looks like you can make your citations work by putting a comma after the citekey, like ^[Plura mihi bona sunt, inclinet, amari petere vellent. Ab illo tempore, ab est sed immemorabili. Ullamco laboris nisi ut aliquid ex ea commodi consequat. See @JOETheEffect, [p. 314].]

Please let me know if I got anything wrong or if you believe this behaviour should change in any way.

Liong1976 commented 1 year ago

Hi @lukeleppan,

Thanks for your response.

First, I don't get your explanation about using <> because I don't use those symbols in my footnotes. Also, for my case, I don't see any issues with the footnote counting.

Second, if I add the comma after the citation key, Pandoc will not render it correctly.

Pandoc is supposed to render it as:

See Joe Donald, The Effect of Consumerism (New York: Penguin, 1994), 314.

With the comma, Pandoc renders it as:

See Joe Donald, The Effect of Consumerism (New York: Penguin, 1994), [p. 314].

chrisgrieser commented 1 year ago

Hey, thanks for the kind words, both of you 😊

As @Liong1976 says, citations without brackets are indeed valid. However, the reason why I didn't include them in the regex is that a lot of plugins use some syntax with @ – for example Natural Language Dates uses things like @today. To avoid counting all those into the citation count, I wrote the regex to only count citations in square brackets.

Like, since @something isn't part of the markdown standard, it's used for many different use cases, meaning there is no perfect solution for this. I think a reasonable approach would be to add a setting and let the user decide for themselves whether they want to count citations without brackets (and potentially having false positives due to other plugins) or only count citations with brackets.
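A minimal TypeScript sketch of how such a setting could behave. This is not the plugin's actual regex or settings code: `CitationCountOptions`, `countUnbracketed`, and `countCitations` are hypothetical names, and the citekey pattern is deliberately simplified (ASCII letters, digits, underscores, hyphens). In strict mode only an `@key` directly after `[` is counted, which mirrors the behaviour reported in this thread; the permissive mode also counts bare keys and will therefore pick up things like `@today`.

```typescript
// Hypothetical option; no such setting is confirmed to exist in the plugin.
interface CitationCountOptions {
  countUnbracketed: boolean;
}

// Strict mode (assumption): "@key" immediately after an opening square bracket.
const BRACKETED_CITATION = /\[@[A-Za-z0-9_][A-Za-z0-9_-]*/g;

// Permissive mode: any "@key" at the start of the text or after a character
// that is not a word character or "@", so "joe@example.com" is skipped.
const ANY_CITATION = /(^|[^A-Za-z0-9_@])@[A-Za-z0-9_][A-Za-z0-9_-]*/g;

function countCitations(text: string, options: CitationCountOptions): number {
  const pattern = options.countUnbracketed ? ANY_CITATION : BRACKETED_CITATION;
  // String.prototype.match with a global regex returns all matches, or null.
  return (text.match(pattern) ?? []).length;
}
```

With `countUnbracketed: true`, a line such as `Due @today, see [@smith2020].` yields 2, which is exactly the false-positive trade-off described above.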

Liong1976 commented 1 year ago

Hi @chrisgrieser,

Thanks for your response.

I understand why you didn't make this plugin include citations without brackets.

However, your proposal of letting the user choose whether citations without brackets are counted looks good to me.

FeralFlora commented 1 year ago

@chrisgrieser In addition to the case of citations without brackets being discussed here, I'd like to bring the issue with dots ([@cite.key]) not being recognized, which was raised by @Gewerd-Strauss in the PR in https://github.com/lukeleppan/better-word-count/pull/79#issuecomment-1492152966, into this discussion.

In my first document, Pandoc Reference List counted 14 citations, and Better Word Count only counted 4. The only problematic cases I found were the lack of brackets already discussed and then many cases of citekeys with dots in them.

chrisgrieser commented 1 year ago

Changing what is considered a citekey is something I can do easily. The problem is rather that, as outlined above, a less strict definition creates a lot of potential incompatibilities with other plugins.

Adding settings to customize what should be regarded as a citekey would require adding some stuff to the settings UI, which I am not familiar with. That's something @lukeleppan (or someone else familiar with Svelte) will have to implement.

In the meantime, it would be really helpful if someone knows of an authoritative source on exactly which characters are valid in a citekey. I have looked for something like that more than once, but haven't really found anything definitive.

FeralFlora commented 1 year ago

> In the meantime, it would be really helpful, if someone knows any authoritative source which characters exactly are valid in a citekey. I have looked for something like that more than once, but haven't really found something definitive.

See the Pandoc docs on Citation syntax. That's as authoritative as it gets: https://pandoc.org/MANUAL.html#citation-syntax

And perhaps also this discussion on the Pandoc repo: https://github.com/jgm/pandoc/issues/6026

chrisgrieser commented 1 year ago

okay, so it seems this is the most accurate information you can get on the topic, and it should be implemented by default:

> Unless a citation key starts with a letter, digit, or _, and contains only alphanumerics and single internal punctuation characters (:.#$%&-+?<>~/), it must be surrounded by curly braces, which are not considered part of the key. In @Foo_bar.baz., the key is Foo_bar.baz because the final period is not internal punctuation, so it is not included in the key. In @{Foo_bar.baz.}, the key is Foo_bar.baz., including the final period. In @Foo_bar--baz, the key is Foo_bar because the repeated internal punctuation characters terminate the key. The curly braces are recommended if you use URLs as keys: [@{https://example.com/bib?name=foobar&date=2000}, p. 33].

so not all examples shared here are actually valid citekeys. Nevertheless, the specification is quite a bit more complicated than I expected, with all those rules concerning punctuation, so implementing it might be a bit tricky (or at least require some scripting in addition to pattern matching).
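For illustration, here is a rough TypeScript approximation of the rule quoted above, written as a single pattern. It is only a sketch: `parseCiteKey` is a hypothetical helper, the pattern is ASCII-only (Pandoc's notion of alphanumerics is broader), underscores are treated like letters, and it may diverge from Pandoc's own parser on edge cases.

```typescript
// Either "@{...}" (braces are not part of the key), or a bare key made of
// word characters with single internal punctuation marks between them.
const CITEKEY =
  /^@(?:\{([^}]+)\}|([A-Za-z0-9_]+(?:[:.#$%&\-+?<>~\/][A-Za-z0-9_]+)*))/;

// Returns the key for a string that starts with "@", or null if none is found.
function parseCiteKey(source: string): string | null {
  const match = CITEKEY.exec(source);
  if (match === null) return null;
  // Group 1 is the braced form, group 2 the bare form.
  return match[1] ?? match[2] ?? null;
}
```

A punctuation mark is only accepted when another word character follows it, which is why a trailing period or a doubled hyphen ends the key, as in the three examples from the manual.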

FeralFlora commented 1 year ago

It might be easier to just require a .bib or .json bibliography file for this feature, and then check the keys against that. Any citekey in the file is recognized regardless of syntax. This would be similar to how Pandoc Reference List works. There could even be some functionality convergence here.
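A sketch of that idea in TypeScript, assuming the bibliography is a CSL JSON export (an array of entries that each carry an `id`). Parsing `.bib` files is left out. `loadKnownKeys` and `countKnownCitations` are hypothetical names, and this is not how Pandoc Reference List is implemented.

```typescript
// Only the field this sketch needs from a CSL JSON entry.
interface CslEntry {
  id: string | number;
}

function loadKnownKeys(cslJson: string): Set<string> {
  const entries = JSON.parse(cslJson) as CslEntry[];
  return new Set(entries.map((entry) => String(entry.id)));
}

function countKnownCitations(text: string, knownKeys: Set<string>): number {
  // Candidate: "@" not preceded by a word character, an optional "{",
  // then a run of characters that cannot end up inside a bare key here.
  const candidate = /(^|[^A-Za-z0-9_])@\{?([^\s\[\]{};,]+)/g;
  let count = 0;
  let match: RegExpExecArray | null;
  while ((match = candidate.exec(text)) !== null) {
    const raw = match[2] ?? "";
    // Try the longest prefix first and shorten it until a known key is found.
    for (let end = raw.length; end > 0; end--) {
      const next = raw.charAt(end);
      // Reject prefixes that stop in the middle of a longer word.
      if (knownKeys.has(raw.slice(0, end)) && !/[A-Za-z0-9_]/.test(next)) {
        count++;
        break;
      }
    }
  }
  return count;
}
```

Because only keys present in the bibliography are counted, something like `@today` is ignored unless an entry with that id exists, and dotted keys need no special handling.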