retorquere / zotero-better-bibtex

Make Zotero effective for us LaTeX holdouts
https://retorque.re/zotero-better-bibtex/
MIT License

[Bug]: fields `ShortTitle` and `BookTitle` cannot be called as functions within citekey formula #2631

Closed spiritually-soup closed 11 months ago

spiritually-soup commented 1 year ago

Debug log ID

36EAXVYH-refs-apse/6.7.114-6

What happened?

Issue

Unable to make use of Zotero's Short Title and Book Title fields within citekey formula (BBT debug log probably not very useful here). When trying to use them like shorttitle(n,m) I get the error message field ShortTitle cannot be called as function.

Reason for wanting to use `Short Title` and `Book Title` fields

- be able to refer to the book title when citing book chapters
- I use the `Short Title` field to shorten the actual `Title` in a way that makes the most sense to my brain; the `shorttitle` function doesn't always produce the most efficient short form. For book chapters, I previously used this field as a shortened chapter title, to distinguish chapter and book titles in the citekey

Work-arounds I've tried or come up with

- use `PublicationTitle(n,m)` ➔ same error message
- `BookTitle.substring(n,m)` and `ShortTitle.substring(n,m)` ➔ this works and BBT allows it, but I would prefer to have full words, hence the `shorttitle(5,5).len('<',30)` syntax in my current formula.
  ➔ Plus, based on your documentation [here on your site](https://retorque.re/zotero-better-bibtex/citing/#configurable-citekey-generator) and [here on github](https://github.com/retorquere/zotero-better-bibtex/wiki/Citation-Keys/d92fd94311230690ed4d40f19623b0c1c568e699), we ought to be able to use them like the `Title` and `shorttitle` functions, right?

Current Citekey Formula

in code block

```javascript
type(book) + (authEtal2(creator='author', sep='-').replace('.etal','-etal').lower || authEtal2(creator='editor', sep='-').replace('.etal','-etal').upper) + year.postfix(_) + (shorttitle(5,5).len('<',30) || shorttitle(4,4).len('<',30) || shorttitle(3,3).len('<',30) || shorttitle(2,2).len('<',30)) ;
type(bookSection) + (authEtal2(creator='author', sep='-').replace('.etal','-etal').lower.len('>',0) + '-' + authEtal2(creator='editor', sep='-').replace('.etal','-etal').upper || authEtal2(creator='editor', sep='-').replace('.etal','-etal').upper) + year.postfix(_) + (shorttitle(5,5).len('<',30) || shorttitle(4,4).len('<',30) || shorttitle(3,3).len('<',30) || shorttitle(2,2).len('<',30)) ;
type(journalArticle) + authEtal2(creator='author',sep='-').replace('.etal','-etal').lower + year.postfix(_) + (shorttitle(5,5).len('<',30) || shorttitle(4,4).len('<',30) || shorttitle(3,3).len('<',30) || shorttitle(2,2).len('<',30)) ;
authEtal2(sep='-').replace('.etal','-etal').lower + year.postfix(_) + (shorttitle(5,5).len('<',30) || shorttitle(4,4).len('<',30) || shorttitle(3,3).len('<',30) || shorttitle(2,2).len('<',30))
```

Formula by colour coded sections, if that helps (sans the `;` between item types).

image

Citekey Outputs using dummy items

| item type | desired citekey output format | current format using `shorttitle` |
|--------|--------|--------|
| book w. no editors | `abbot-adams2018_textfromShortTitle` etc. | `abbot-adams2018_textfromTitle` |
| book w. editors and no authors | `EVANS-ETAL2018_textfromShortTitle` etc. | `EVANS-ETAL2018_textfromTitle` |
| book section w. author | `abbot-adams-EVANS-EDWARDS2018_textfromShortTitle_textfromBookTitle` | `abbot-adams2018_textfromTitle` |
| book section w/o author | `EVANS-ETAL2018_textfromShortTitle_textfromBookTitle` | `EVANS-ETAL2018_textfromTitle` |

^ abbot and adams are authors, evans and edwards are editors

image


p.s. Could I check, are we supposed to be able to use `field1.lower.postfix(-)`? The workaround would of course be `field1.lower + '-' + field2`, but I'm wondering if it's a problem with my syntax.

The documentation is still a little confusing for someone not familiar with Java... btw, do let me know if you ever want some feedback from a layperson! for example, I think the addition of more sample syntax and syntax combinations (could be hidden under a toggle) would be really helpful and prevent unnecessary github tickets.

sorry if this is really long btw i figured more detail is better than less. do let me know if you need more info :)

retorquere commented 1 year ago

I won't be able to look into this until Monday. Tips on better documentation are always welcome, I'm not great at documentation.

retorquere commented 1 year ago

The debug log is most certainly relevant here. It gives me your BBT settings, among which the precise citekey formula, and a zotero item that will reproduce the problem as you experience it. And it comes structured in such a way that I can convert all this into a test case for my test suite, fully automated, without me copy-pasting or retyping anything.

retorquere commented 1 year ago

> are we supposed to be able to use field1.lower.postfix(-)?

No, because a bare `-` is not allowed; it would have to be `field1.lower.postfix('-')`. Strings need quotes; it's just that BBT will infer quotes for strings that happen to be valid javascript identifiers, so `$`, `'$'` and `"$"` mean the same thing.
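The quoting rule can be sketched roughly as follows. This is only an illustration and not BBT's parser: `isBareIdentifier` and `normalizeParam` are hypothetical names, and the pattern only covers ASCII identifiers and ignores reserved words.

```typescript
// Hypothetical helper, not part of BBT: true when `text` looks like an
// ASCII-only JavaScript identifier (reserved words are not checked).
function isBareIdentifier(text: string): boolean {
  return /^[A-Za-z_$][A-Za-z0-9_$]*$/.test(text);
}

// Hypothetical helper: returns the string a parameter stands for.
// Quoted parameters lose their quotes, bare identifiers are taken as-is,
// and anything else (such as a bare `-`) is rejected.
function normalizeParam(raw: string): string {
  const first = raw.charAt(0);
  const last = raw.charAt(raw.length - 1);
  if (raw.length >= 2 && (first === "'" || first === '"') && last === first) {
    return raw.slice(1, -1);
  }
  if (isBareIdentifier(raw)) {
    return raw;
  }
  throw new Error("parameter " + raw + " needs quotes");
}
```

Under these assumptions `normalizeParam("_")` and `normalizeParam("'_'")` give the same string, while `normalizeParam("-")` throws.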

retorquere commented 1 year ago

For the testcases from 36EAXVYH-refs-apse/6.7.114-6 I get

@book{abbot-adams2024_JustWarTheoryTimesIndividual,
@book{abbot-adams2024_ViolenceInternational,
@incollection{abbot-etal-EVANS-EDWARDS2024_DemocracyAccountabilityGlobal,
@article{abbot-etal2024_DemocracyAccountabilityGlobal,
@incollection{EVANS-EDWARDS2024_DemocracyAccountabilityGlobal,

retorquere commented 1 year ago

BBT knows three kinds of "things" to build the citekey from:

  1. what I currently call "functions", these produce text based on the item the key is being constructed from, eg shorttitle. Even though these are largely case insensitive, they must start with a lowercase letter.
  2. what I call "field access", direct text from the zotero item fields; these again are largely case insensitive, but they must start with an uppercase letter
  3. what I call "filters", these are actions that act on the text returned from either functions, field access, or from a subformula like (auth + title || year).lower. these are fully case insensitive, and you can chain these together, each acting on the output of the previous filter.
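As a rough illustration of how filters chain (all names below are hypothetical and this is not BBT's implementation), a function or field access yields text, and each filter then acts on the output of the previous one:

```typescript
// A filter takes the text produced so far and returns new text.
type Filter = (text: string) => string;

// Two illustrative filters. `postfix` only appends when the text is
// non-empty, which is how the docs quoted in this thread describe it.
const lower: Filter = (text) => text.toLowerCase();
const postfix = (suffix: string): Filter => (text) =>
  text === "" ? "" : text + suffix;

// Apply the filters left to right, each acting on the previous output.
function applyFilters(text: string, filters: Filter[]): string {
  return filters.reduce((acc, filter) => filter(acc), text);
}

// Roughly what a chain like `.lower.postfix('-')` would do to a
// made-up field value and to an empty one.
const withValue = applyFilters("Violence", [lower, postfix("-")]);
const withoutValue = applyFilters("", [lower, postfix("-")]);
```

Here `withValue` is `violence-` and `withoutValue` stays empty, because the postfix is skipped for empty text.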

There are 3 ways you can build subformulae:

  1. simple bracketing: (auth + title)
  2. alternates: (auth || title)
  3. ternaries: (auth ? auth : title)

these can be combined into eg (auth || shorttitle || year ? auth + title : year || title), but subformulae cannot appear in parameters, so title.select(auth ? 3 : 4) is not valid.
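A minimal sketch of the three subformula forms, assuming that empty text counts as false and anything else as true. The helper names and the sample values are made up for illustration; this is not how BBT evaluates formulae internally.

```typescript
// Alternates: `(a || b || c)` yields the first text that is not empty.
function alternates(...candidates: string[]): string {
  for (const candidate of candidates) {
    if (candidate !== "") return candidate;
  }
  return "";
}

// Ternary: `(cond ? yes : no)` picks `yes` when `cond` is non-empty text.
function ternary(cond: string, yes: string, no: string): string {
  return cond !== "" ? yes : no;
}

// Made-up outputs of the auth, title and year functions for one item
// that has no creators.
const auth = "";
const title = "Violence";
const year = "2024";

const bracketed = auth + title;                 // (auth + title)
const alternate = alternates(auth, title);      // (auth || title)
const conditional = ternary(auth, auth, title); // (auth ? auth : title)
const fallback = alternates(auth, "", year);    // falls through to year
```

With these values the first three all come out as `Violence`, and `fallback` is `2024`.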

I'm open to better terminology for all of these. I've considered aspects or features for functions, but I'm not sure that's better.

> Unable to make use of Zotero's Short Title and Book Title fields within citekey formula (BBT debug log probably not very useful here). When trying to use them like shorttitle(n,m) I get the error message field ShortTitle cannot be called as function

That means the first letter was capitalized, so something like Shorttitle(n,m), which makes BBT interpret it as field access, and then there can't be parentheses (which signify a function call).
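The first-letter rule can be sketched like this. `classify` and `checkCall` are hypothetical names, not BBT's code, and the error text simply echoes the name as typed.

```typescript
type Kind = "function" | "field";

// Hypothetical classifier: only the case of the first letter decides
// whether a name is a function or field access.
function classify(name: string): Kind {
  const first = name.charAt(0);
  const isUpper = first === first.toUpperCase() && first !== first.toLowerCase();
  return isUpper ? "field" : "function";
}

// Hypothetical check mirroring the reported error: field access followed
// by parentheses is rejected.
function checkCall(name: string, hasParentheses: boolean): void {
  if (classify(name) === "field" && hasParentheses) {
    throw new Error("field " + name + " cannot be called as function");
  }
}
```

So `checkCall("shorttitle", true)` passes, `checkCall("ShortTitle", false)` passes, and `checkCall("ShortTitle", true)` throws.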

> The documentation is still a little confusing for someone not familiar with Java

I hope the above explains it better. The key formula functionality has grown more complex over the years. It started out as a clone of jabref's formula language, but as BBT's formulae got more featureful, that syntax was not feasible for the long term.

There are a few people that are fully comfortable with the formula syntax, which is great, but it's always OK to ask, either by opening an issue as you did for particular cases, or in the discussions section, where I can address more general questions and which doesn't require a debug log

spiritually-soup commented 1 year ago

Your explanation does make things clearer, thank you! I think "functions" works well, it makes sense to me at least.

I'll tweak my citekey to call `ShortTitle` and `BookTitle` with a `.len` or something after, and that should fix things for me.

But just to double check my understanding: because functions require a lowercase start, it's not possible to make Zotero's Short Title field into a function, because that would conflict with the existing shorttitle function.

(I suppose you could change shorttitle into shortentitle but that would probably fck up a lot of people's citekey formulae, and I doubt there's sufficient demand for use of the Short Title field.)

Feedback for documentation:

1. I would actually suggest adding the entire bit you wrote (from `BBT knows three kinds...` to `parameters, so ... is not valid`) into the documentation somewhere near the start! Having them all together (as opposed to just incorporating them into the block of text explanation under each heading) helps make it clearer.

2. Wrote up a 'for dummies' explanation based on what I would personally find helpful. I have no idea if this is accurate though. _Like I said previously, I think that you could always hide additional info/explanation under a toggle, to keep the documentation easily readable but allow people to find the info they need... would be easier than an FAQ imo._

`(title ? title : auth).lower + year`

- This is a ternary operator in the format `condition ? output_if_true : output_if_false`, and you can use it like an if-else statement.
- These can be nested to evaluate multiple conditions like so: `condition1 ? value_if_true : condition2 ? value_if_true : condition3 ? value_if_true : fallback_function_to_use` (_idk if this is a bad idea, if it is you could indicate why?_)
- This formula checks if the `Title` field exists (i.e. is not blank). If it does, it uses the function `title` to produce the output string. Otherwise, it uses the `auth` function. It then converts the output to lowercase and appends the `year` field.

`(title || auth).lower + year`

- This formula uses the OR (`||`) operator. It checks whether the `title` function will produce a valid output string (i.e. not null, undefined, empty etc.). If it does, the `title` function is used; otherwise, the formula will use the `auth` function to evaluate. It then converts the output of `(title || auth)` to lowercase and appends the `year` field.

`title.len + year | auth + year`

- This method would have the formula evaluate whether the `title` field outputs a string of a certain length; if this condition is not met it jumps to the next formula `auth + year`
- e.g. `title.len('>',0) + year | auth + year` checks whether the `title` output is greater than 0 (i.e. not blank)

**note on all ‘author’ functions and their parameters**

- In the table panel, Zotero displays all authors / editors / translators / collaborators etc. under a single ‘Creator’ field. Therefore, we use the `auth` function to reference all types of creators.
- Unlike in jabref, `pureauth` and `edtr` fields are not supported
- To specify which to select, use `auth(creator='kind of creator')`.
- For multiple functions, use `auth(sep='characters')` to add separators between creators, e.g.
  - `authEtal2(creator='author', sep='.').lower` with 2 authors and 1 editor produces the output `firstauthor.secondauthor`
  - and `authEtal2(creator='editor', sep='-').lower` with 3 editors produces output `firsteditor-etal`
- `auth` function
  - The first `n` (default: all) characters of the `m`th (default: first) author's last name, i.e. `auth(n,m)`.
  - _Can consider providing a sample syntax of how to incorporate the_ `initials` _parameter_

**re:** `shorttitle`

- _Could indicate something like_ For how to restrict to a certain number of characters, see `substring` _and then under_ `substring` _provide example_?
  - e.g. `title.substring(1, 10)` will only include the first 10 characters of the `title` function’s output.

**re:** `len`

- _Can provide an example of how to format `len` syntax, e.g._ `auth.len('>',0) | title + year`

**re:** `postfix`

- "`postfix('_')` will add an underscore to the end if, and only if, the value it is supposed to postfix isn't empty"
- _NOTE ➔ documentation currently says_ `.prefix(_)` _etc. so maybe add the_ `‘ ’` _in_

retorquere commented 1 year ago

> But just to double check my understanding: because functions require a lowercase start, it's not possible to make Zotero's Short Title field into a function, because that would conflict with the existing shorttitle function.

I don't understand what you are trying to say here. There are functions, which take various data from the item to produce output, and there is field access, which takes the field you want and just uses what it has. There is a function shorttitle which takes words from the title, and there is field access ShortTitle, which uses the field from the item. They don't conflict because either form is case insensitive except for the first letter; SHORTTITLE or Shorttitle does the same as ShortTitle, and sHORTTITLE and shorttitle do the same as shortTitle.

> I would actually suggest adding the entire bit you wrote

That's already up there at https://retorque.re/zotero-better-bibtex/citing/#configurable-citekey-generator

> These can be nested to evaluate multiple conditions like so: `condition1 ? value_if_true : condition2 ? value_if_true : condition3 ? value_if_true : fallback_function_to_use` idk if this is a bad idea, if it is you could indicate why?

I'd find that hard to read because it's ambiguous if you don't know the precedence rules (it's not ambiguous to BBT though); given

auth ? auth : title ? title : year ? year : 'x'

I could read that as

auth ? auth : (title ? title : year) ? year : 'x'

or (what you meant and how BBT will understand it)

auth ? auth : (title ? title : (year ? year : 'x'))

And when people construct more complex rules from this, I can't currently foresee whether some combination of ternaries, alternates and composition would make the condition or the value_if_false of a ternary grab what was meant to be an adjacent part of the formula.
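TypeScript's own conditional operator groups right to left as well, so the second reading can be checked with ordinary code. This sketch assumes that empty text counts as false; the function names are made up, and it says nothing about how BBT parses formulae internally.

```typescript
// Nested ternary without brackets.
function pick(auth: string, title: string, year: string): string {
  return auth ? auth : title ? title : year ? year : "x";
}

// The same expression with the right-nested grouping written out.
function pickExplicit(auth: string, title: string, year: string): string {
  return auth ? auth : (title ? title : (year ? year : "x"));
}

// Compare both forms over every combination of empty and non-empty inputs.
function sameForAllInputs(): boolean {
  const values = ["", "v"];
  for (const a of values) {
    for (const t of values) {
      for (const y of values) {
        if (pick(a, t, y) !== pickExplicit(a, t, y)) return false;
      }
    }
  }
  return true;
}
```

Both functions agree on all eight combinations, and `pick("", "", "")` falls through to `x`.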

> This formula checks if the Title field exists (i.e. is not blank)

This is not strictly true; `title` is a function that filters out words like `for` or `a` from the item title, so if the item title is `for a`, `title` will be blank (false) where `Title` will not be blank.
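The difference can be sketched as follows. The skip-word list, the helper names and the way the remaining words are joined are all assumptions for illustration, not BBT's actual behaviour.

```typescript
// Illustrative skip words only, taken from the example above.
const SKIP_WORDS = ["for", "a"];

// Stand-in for the `title` function: drops skip words from the item
// title and joins what is left (the joining is an assumption).
function titleFunction(itemTitle: string): string {
  return itemTitle
    .split(/\s+/)
    .filter((word) => word !== "" && SKIP_WORDS.indexOf(word.toLowerCase()) === -1)
    .join("");
}

// Stand-in for `Title` field access: the field text exactly as stored.
function titleField(itemTitle: string): string {
  return itemTitle;
}

const fromFunction = titleFunction("for a"); // every word is skipped
const fromField = titleField("for a");       // the field is untouched
```

Here `fromFunction` is empty, so it would count as false in a ternary or alternate, while `fromField` is still `for a`.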

> It checks whether the title function will produce a valid output string (i.e. not null, undefined, empty etc.)

functions always return text, which may be an empty text, they never produce null or undefined.

> whether the title field outputs a string

that would be the title function, not the field, which would be Title.

> Unlike in jabref, pureauth and edtr fields are not supported

I'm going to be pedantic here just for clarification, as we're trying to get this right for the docs; `pureauth` and `edtr` would be functions here, not fields, and as you mention, BBT does have equivalents for these: `auth(creator=author)` and `auth(creator=editor)`, respectively. BBT no longer targets Jabref-compatibility. I mention Jabref because that's where it started, but I don't want people to think of BBT formulae as "sort of Jabref patterns", because that will raise wrong expectations, so whether Jabref offers `edtr` or not is not how I want people thinking about BBT formulae. The relevant question would be "can I use `auth`, but only select editors".

> For multiple functions, use auth(sep='characters') to add separators between creators

that would be multiple creators, right? Not multiple functions?

> NOTE ➔ documentation currently says `.prefix(_)` etc. so maybe add the ‘ ’ in

I didn't escape the _, fixed that now. Docs are building, should be up soon.

retorquere commented 1 year ago

If you don't mind, it'd be easier for my workflow if you just edited https://github.com/retorquere/zotero-better-bibtex/blob/master/site/content/citing/_index.md and the doc-comments in https://github.com/retorquere/zotero-better-bibtex/blob/master/content/key-manager/formatter.ts#L545. You can edit those online and github has tools to discuss parts of edits. It would also mean that once we have agreement I would click a button and they would be added to the docs automatically.

retorquere commented 1 year ago

Is there anything else I can help with?