plk / biblatex

biblatex is a sophisticated bibliography system for LaTeX users. It has considerably more features than traditional bibtex and supports UTF-8

number vs issue #726

Closed moewew closed 5 years ago

moewew commented 6 years ago

Cf. https://tex.stackexchange.com/q/418590/35864, https://github.com/retorquere/zotero-better-bibtex/issues/925, https://github.com/plk/biblatex-apa/issues/45

Currently the docs are quite strict about number being an integer. For the numbers as they appear in @articles that seems to be a bit too restrictive. There are at least two cases where one would put something other than a plain integer there:

  1. Special issues identified with a letter and integer, e.g. 'S1' (https://github.com/plk/biblatex-apa/issues/45)
  2. Number ranges such as '2-3' (https://github.com/retorquere/zotero-better-bibtex/issues/925)

At the moment the documentation seems to suggest using issue for non-integer input, and this is what Zotero BBT does. I find this unsatisfying since the output in the standard styles is significantly different when using issue as compared to number. It has also been established that for most intents and purposes number is the correct field for the subdivision of a journal volume.

If biblatex and Biber were to accept non-integer values for number the two cases above would easily give the expected output. Cautious style developers could use \ifnumerals, \ifnumeral or \ifinteger if they want to make sure the output does not end up looking stupid if they do anything special to the number field.
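A style that treats number specially could guard on its content roughly as follows. This is a minimal sketch, not something taken from the standard styles: it assumes the style prints the field in the `volume+number+eid` bibmacro, uses `\iffieldnums` (the field-level counterpart of `\ifnumerals`), and the parenthesised output is only an illustrative choice.

```latex
\documentclass{article}
\usepackage[style=authoryear, backend=biber]{biblatex}

% Illustrative only: numerals and ranges are printed as 83(2-3),
% anything else falls back to a plain comma-separated form.
\renewbibmacro*{volume+number+eid}{%
  \printfield{volume}%
  \iffieldnums{number}
    {\printfield[parens]{number}}% numerals and ranges such as 2-3
    {\setunit*{\addcomma\space}%
     \printfield{number}}% anything else, e.g. S1 or Suppl. 1
  \setunit{\addcomma\space}%
  \printfield{eid}}

% Entry data copied from the example in this issue.
\begin{filecontents}{\jobname.bib}
@article{keels2013,
  author       = {Keels, Micere},
  title        = {Getting them enrolled is only half the battle},
  journaltitle = {American Journal of Orthopsychiatry},
  volume       = {83},
  number       = {2-3},
  date         = {2013},
  pages        = {310-322},
}
\end{filecontents}
\addbibresource{\jobname.bib}

\begin{document}
\cite{keels2013}
\printbibliography
\end{document}
```

The point of the guard is that the special formatting only applies to input it was designed for; a value like 'S1' is still printed, just without the extra decoration.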

The only downside to this that I can see is sorting. If number is treated as a string, sorting pure integer-valued number fields might not give the expected output. But no standard sorting schemes sort by number...

There is even precedent for a non-integer number in biblatex-examples.bib

https://github.com/plk/biblatex/blob/3dfd95fb63628c2968d50164cf2efc3e73a76c01/bibtex/bib/biblatex/biblatex-examples.bib#L1494-L1507

and

https://github.com/plk/biblatex/blob/3dfd95fb63628c2968d50164cf2efc3e73a76c01/bibtex/bib/biblatex/biblatex-examples.bib#L1579-L1641

Here are three real-life examples that could benefit from number not being an integer:

\documentclass[british]{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{babel}
\usepackage{csquotes}

\usepackage[style=authoryear, backend=biber]{biblatex}

\usepackage{filecontents}
\begin{filecontents}{\jobname.bib}
@article{finkelstein2013,
  author       = {Amy Finkelstein and Erzo F. P. Luttmer and Matthew J. Notowidigdo},
  title        = {What Good is Wealth Without Health?},
  subtitle     = {The Effect of Health on the Marginal Utility of Consumption},
  journaltitle = {Journal of the European Economic Association},
  volume       = {11},
  number       = {Suppl. 1},% Suppl. 1/Supplement 1 would probably need an adjustment of the output format to look nice
  date         = {2013},
  pages        = {221–258},
  doi          = {10.1111/j.1542-4774.2012.01101.x},
}
@article{keels2013,
  author       = {Keels, Micere},
  title        = {Getting them enrolled is only half the battle},
  subtitle     = {College Success as a Function of Race or Ethnicity, Gender, and Class},
  journaltitle = {American Journal of Orthopsychiatry},
  volume       = {83},
  number       = {2-3},
  date         = {2013},
  pages        = {310-322},
  doi          = {10.1111/ajop.12033},
}
@article{fogliano2011,
  author       = {Fogliano, Vincenzo and Corollaro, Maria Laura and Vitaglione, Paola and Napolitano, Aurora and Ferracane, Rosalia and Travaglia, Fabiano and Arlorio, Marco and Costabile, Adele and Klinder, Annett and Gibson, Glenn},
  title        = {In Vitro Bioaccessibility and Gut Biotransformation of Polyphenols Present in the Water-Insoluble Cocoa Fraction},
  journaltitle = {Molecular Nutrition \& Food Research},
  volume       = {55},
  number       = {S1},
  date         = {2011},
  pages        = {S44-S55},
  doi          = {10.1002/mnfr.201000360},
}
\end{filecontents}

\addbibresource{\jobname.bib}

\begin{document}
\cite{finkelstein2013,keels2013,fogliano2011}
\printbibliography
\end{document}

retorquere commented 6 years ago

Given that people are already putting things like ranges in number, and biblatex already accepts non-numbers there, things wouldn't get worse by just making the docs reflect the implemented behavior, correct?

retorquere commented 6 years ago

(although I'd then have no way to decide in what field to put the zotero data, but that's tangential to this issue)

moewew commented 6 years ago

@plk What do you think about this? The only drawback to not requiring number to be an integer that I can see at the moment is sorting.

plk commented 6 years ago

In fact, the entire sorting system was overhauled in a major way to use a more efficient and typed algorithm precisely to accommodate integer sorting for these fields. The field type for sorting can be modified by the user in the sorting template but I would like to have a numeric field. I can perhaps make some changes to accommodate ranges (sorting on the first number in the range) but for arbitrary non-numeric parts, I think a new field would be better. Numeric sorting is important and having a free-form field again reverts to alpha sorting which impacts performance and is much less algorithmically clean.

retorquere commented 6 years ago

So does that mean the issue field is back in favor?

moewew commented 6 years ago

Mhhh, so this solitary drawback is quite a massive one. At the moment, however, no sorting scheme sorts by number, so practically this should not be too problematic.

I'm not too keen on a new field. volume+number has been an established combination for a very long time, and squeezing a new field in would need changes in many places and probably hamper adoption.

I definitely do not want people to put ranges in the issue field. And I think it would be great if things like S1 would also be considered OK in the number field.

plk commented 6 years ago

I see the point, so it would seem as simple as changing the datatype of number from integer to literal in the default data model? Since nothing sorts on this by default, I see no reason not to change the default and let people who really want number to be numeric restrict it with a custom data model.
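For readers who do want to keep number numeric, one possible approach is sketched below. Note that it swaps in a data model constraint rather than redeclaring the field's datatype, which is a different technique from the one described above. The file name `numberint.dbx` is hypothetical, and the sketch assumes that constraint macros are accepted in a `.dbx` file loaded via the `datamodel` option and that violations are only reported when Biber is run with `--validate-datamodel`.

```latex
% Hypothetical data model file, written out so the example is self-contained.
\begin{filecontents*}{numberint.dbx}
\DeclareDatamodelConstraints[article]{
  \constraint[type=data, datatype=integer]{
    \constraintfield{number}
  }
}
\end{filecontents*}
\documentclass{article}
% datamodel=numberint loads numberint.dbx on top of the default data model
\usepackage[style=authoryear, backend=biber, datamodel=numberint]{biblatex}
\addbibresource{biblatex-examples.bib}

\begin{document}
\nocite{*}
\printbibliography
\end{document}
```

Since biblatex-examples.bib contains non-integer number values, a validating Biber run on this document should be expected to warn about those entries.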

moewew commented 6 years ago

That would be the best option in my book. Of course I would not mind if you looked into integer range sorting, but that would not solve the core problem here (plus I appreciate that you have better things to do ...).

retorquere commented 6 years ago

So what is left for the issue field then?

moewew commented 6 years ago

Good question. I would use issue only for subdivisions of a year, say "summer", "Michaelmas term", ... In that regard it's probably more like the season part of the date (hence the position), whereas number is a subdivision of a volume. The only useful values for number are integers, integer ranges and a few special things like "supplement"/"special issue" plus possibly a number. I don't know if there is a good way to determine algorithmically where to put what.
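One rough heuristic along these lines can be sketched on the biblatex side as a Biber source map. This is only an illustration under stated assumptions: the list of season names is made up and far from complete, the map assumes the entry has no issue field yet, and it is not what any exporter actually does.

```latex
\documentclass{article}
\usepackage[style=authoryear, backend=biber]{biblatex}

% Illustrative heuristic: if number holds nothing but a season name,
% move it to issue; everything else stays in number.
\DeclareSourcemap{
  \maps[datatype=bibtex]{
    \map{
      % stop processing this map unless number is exactly a season name
      \step[fieldsource=number,
            match=\regexp{\A(?i)(spring|summer|autumn|fall|winter)\Z},
            final]
      % copy the matched value to issue ...
      \step[fieldset=issue, origfieldval]
      % ... and remove it from number
      \step[fieldset=number, null]
    }
  }
}

\addbibresource{biblatex-examples.bib}

\begin{document}
\nocite{*}
\printbibliography
\end{document}
```

Values such as "Michaelmas term" show why a fixed list can only ever be a partial answer.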

retorquere commented 6 years ago

Unfortunately for me, determining algorithmically where to put what is exactly what I'd have to do. I'll probably do something like:

moewew commented 6 years ago

Mhhhh... Since I really don't like issue I'd go for

but really your choice is fine as well. There are only a few things that make sense here and everything else will give weird and unpleasant results regardless of what you go for. There may be better choices in specific cases for specific styles, but that is not something you should have to worry about.

retorquere commented 6 years ago

I can't ask the user -- Zotero preps the export and hands me the references to convert, no user interaction possible. It also doesn't have a separate field for seasons, so if the above collapses to "seasons go to issue, all the rest goes to number", I'm still stuck with detecting seasons.

Another option is to dump everything in number and have the users enter data using the cheater syntax if they want something in the issue field, but that would mean that data would not show up anywhere but in the biblatex export, which would mean double work for the user.

If this discussion goes too far off track for this repo, please do let me know.

moewew commented 6 years ago

Let's head back over to https://github.com/retorquere/zotero-better-bibtex/issues/925 to discuss this further.

moewew commented 6 years ago

https://github.com/plk/biblatex/compare/dev...moewew:numberint has hopefully all the necessary changes to turn number back into a literal.

moewew commented 6 years ago

https://github.com/plk/biblatex/pull/730 / https://github.com/plk/biblatex/commit/b2d9097c49722eda5d0582b1e323bbb6cb242935 have officially turned number into a literal field. The documentation no longer implies that only integer values are valid for number.

moewew commented 5 years ago

biblatex 3.12 has been released and is available in TeX Live 2018 and MiKTeX now. That means that the changes discussed here have made it to the release version of biblatex.

ThiloteE commented 2 years ago

If I may ask for your opinion,

Question:

Would it conform to biblatex if JabRef were to somehow fetch the article number, move it to the number field, and move the issue number from the number field into the issue field?

"Short" summary and description of the problem:

According to the biblatex documentation, the issue field is mainly intended for seasons. Publishers, however, mostly provide issue numbers that are integers rather than seasons, so they put the issue number into the number field, which conforms to biblatex, I think. What I cannot work out is where the article number should ideally go. I personally have not yet encountered any dataset that puts both the issue number and the article number into the number field at the same time.

Some publishers simply do not provide the article number in their BibTeX-formatted data. Others have started putting the article number into the pages field. For the latter, I can only speculate that this is because the citation style they prefer renders the issue number and article number together only if there is no page range present.

Additional info:

moewew commented 2 years ago

The whole issue vs number issue is a bit confusing.

Base BibTeX only has number and does not know an issue field. The BibTeX documentation btxdoc explains that

An issue of a journal or magazine is usually identified by its volume and number

and so in BibTeX you unambiguously use volume and number to specify the issue in which an article appeared (BibTeX was developed in the eighties, so this would be a printed issue we're talking about). The BibTeX standard styles print volume and number as

<journal>, <volume>(<number>)

biblatex added issue to the mix. The biblatex documentation had

This field is intended for journals whose individual issues are identified by a designation such as ‘Spring’ or ‘Summer’ rather than the month or a number.

and printed

<journal> <volume>.<number> (<issue> <year>)

Apparently, it was felt that the volume+number scheme alone was insufficiently flexible for all kinds of journal types. I find that the combination "<volume>.<number>" really only looks good when number is a number or a short alphanumeric designator, whereas the BibTeX "<volume>(<number>)" would also look OK-ish with slightly more complex number designators. So maybe that thought played a role. But maybe the new field was simply motivated by journals that traditionally don't use volume designations at all and just go with the publication year (Summer 2021 issue or 3/2021). In almost all real-world examples of @article entries that I have seen so far number was the best choice to represent the issue number.

Roughly speaking number subdivides volume and issue is much closer to subdividing year. I don't think I would want to say that issue is subordinate to number or vice versa. They sort of operate on a similar level.


But this is all from the good old 'we have a printed journal with page numbers' perspective. Once you throw electronic journals - where articles are identified via an article number and not a page range (within a printed issue) - into the mix, things get more interesting, because you get an additional number: the 'article number'.

In my opinion these article numbers should be rendered in pretty much the same position as page numbers, but they obviously should not be prefixed with "p."/"pp." or the like. The biblatex field for article numbers is eid. It is not particularly well known and I cannot guarantee that all contributed styles make sense of it (especially those following style guides, which may make no mention of article numbers). For a long time eid was only supported for @articles, but recently (https://github.com/plk/biblatex/issues/847, https://github.com/plk/biblatex/pull/1000) eid was added for all entry types for which it makes sense.

Base BibTeX has no corresponding field (see also https://tex.stackexchange.com/q/445888/35864), so I can understand that publishers put this into the pages field. But that may not come out nicely in all situations.


To answer your question: Moving the issue number to issue and article number to number would not be my preference, because the issue number is traditionally number and the article number is eid in biblatex.
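The field assignment described here would look roughly like this for an electronic journal article. The entry below is entirely made up for illustration (author, title, journal and all numbers are hypothetical); only the choice of fields reflects the discussion: number for the issue within the volume, eid for the article number, and no pages field.

```latex
\documentclass{article}
\usepackage[style=authoryear, backend=biber]{biblatex}

% Hypothetical entry: volume 12, issue number 3, article number e0456.
\begin{filecontents}{\jobname.bib}
@article{doe2021,
  author       = {Doe, Jane},
  title        = {An Invented Article in an Electronic Journal},
  journaltitle = {Journal of Made-Up Examples},
  volume       = {12},
  number       = {3},
  eid          = {e0456},
  date         = {2021},
}
\end{filecontents}
\addbibresource{\jobname.bib}

\begin{document}
\cite{doe2021}
\printbibliography
\end{document}
```

How the eid is rendered depends on the style in use; contributed styles may ignore it.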

ThiloteE commented 2 years ago

Thank you so much! This made everything a little bit more clear.

I always wondered what eid is, but I never understood the explanation in the documentation and failed to associate it with the article number. I had rarely seen it included in bibliographic data, and since it is not the only type of ID that can be rendered via biblatex (e.g. DOI, ISSN, eprint, ...) I thought it must be something quite exotic.

Good to know!

pauloney commented 2 years ago

It would be nice to see some of this text migrated to the documentation. It will be very useful to a lot of people.

PN


moewew commented 2 years ago

Comments on https://github.com/plk/biblatex/commit/308a69dcde7ee95dde94bd4882c95fb20a479c21 would be appreciated.

ThiloteE commented 2 years ago

With regard to 308a69d

moewew commented 2 years ago

Thanks for the comments.

https://github.com/plk/biblatex/commit/1d01aa1f460ff3fdd5231340c6b751747a024160

ThiloteE commented 2 years ago

Looks good!

Sorry, I must have misread issue-number.

"This field may replace the \bibfield{pages} field for journals deviating from the classic pagination scheme of printed journals by only enumerating articles or papers and not pages."

The above sentence could be replaced with:

This field may replace---or be used along with---the \bibfield{pages} field for journals deviating from the classic pagination scheme of printed journals.

I think this would be more coherent with your fine lines provided in 1094 and 1831, but as long as people understand it, yours is fine too 😁