mlresearch / mlresearch.github.io

Machine Learning Research Homepage

Use of `number` versus `volume` field in (generated) BibTeX output #7

Open Pseudomanifold opened 1 year ago

Pseudomanifold commented 1 year ago

Dear all,

Thanks for maintaining this awesome institution! I have one (maybe naive and nitpicky) question about the generated BibTeX output: it is my understanding that `number` should be used instead of `volume`, since we are dealing with a series. The official BibLaTeX manual on p. 24 describes `number` as follows:

> The number of a journal or the volume/number of a book in a series

Whereas for `volume`, the explanation is:

> The volume of a multi-volume book or a periodical.

My question is now whether it would be possible to change this.

Let's take a recent entry as a running example:

@InProceedings{pmlr-v196-hacker22a,
  title =    {On the Surprising Behaviour of \texttt{node2vec}},
  author =       {Hacker, Celia and Rieck, Bastian},
  booktitle =    {Proceedings of Topological, Algebraic, and Geometric Learning Workshops 2022},
  pages =    {142--151},
  year =     {2022},
  editor =   {Cloninger, Alexander and Doster, Timothy and Emerson, Tegan and Kaul, Manohar and Ktena, Ira and Kvinge, Henry and Miolane, Nina and Rieck, Bastian and Tymochko, Sarah and Wolf, Guy},
  volume =   {196},
  series =   {Proceedings of Machine Learning Research},
  month =    {25 Feb--22 Jul},
  publisher =    {PMLR},
  pdf =      {https://proceedings.mlr.press/v196/hacker22a/hacker22a.pdf},
  url =      {https://proceedings.mlr.press/v196/hacker22a.html},
  abstract =     {Graph embedding techniques are a staple of modern graph learning research. When using embeddings for downstream tasks such as classification, information about their stability and robustness, i.e., their susceptibility to sources of noise, stochastic effects, or specific parameter choices, becomes increasingly important. As one of the most prominent graph embedding schemes, we focus on \texttt{node2vec} and analyse its embedding quality from multiple perspectives. Our findings indicate that embedding quality is unstable with respect to parameter choices, and we propose strategies to remedy this in practice.}
}

Example with volume

If I use volume, I get the following output:

image

Example with number

If I use number, on the other hand, I get the following output:

image

This output seems to be somehow cleaner and less clunky to me, but I admit it's

Example with @article

Notice that this only pertains to the @inproceedings entries; everything looks different of course with journal articles:

image

Summary

Sorry for this weird question—I would love to understand this phenomenon better; probably I am using BibLaTeX wrong... Feel free to close if this question makes no sense, of course!

lawrennd commented 1 year ago

Not weird at all, a great call out, thanks Bastian!

I think it makes sense to change this if that's the underlying definition. But it would be good to try and think of negative effects first ... I'm not sure what they are or might be ...

Pseudomanifold commented 1 year ago

Thanks a lot for the prompt response :-) One slight disadvantage of the `number` field is apparently that it is treated as a literal, so it also accepts values like `number = {S1}`. This could (in the worst case) cause issues with sorting, but since PMLR entries also contain year and date information, I cannot really think of a case where this would be problematic. Moreover, if multiple papers from different PMLR proceedings are cited side by side, I would assume that sorting goes by author information first anyway.

fsaad commented 1 year ago

The output behavior depends on the style file being used. Appendix B.2 of the LaTeX book specifies that, for `proceedings` and `inproceedings` entry types, only one of `volume` or `number` is expected. If both fields are specified, then traditional .bst style files give preference to `volume`.

Since PMLR proceedings are formally released as "volumes" and the BibTeX uses `proceedings` and `inproceedings`, continuing to use `volume` appears to be the correct approach here. One risk of using the `number` field, for example, is that certain style files will render the output as "Proceedings of Machine Learning Research Number 196", which would be an incorrect reference.

Pseudomanifold commented 1 year ago

Thanks, you are raising a good point! I understand that the output depends on the style file and indeed, with BibTeX and the `plain` bibliography style, the words 'volume' and 'number' are printed directly as part of the bibliographical entry. Hence, this problem appears to be largely related to the way BibLaTeX treats these fields. I wonder whether there is a way to find a good middle ground here.
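
One possible middle ground, sketched here under some assumptions, is to leave the generated BibTeX untouched (so classic .bst styles keep seeing `volume`) and let BibLaTeX users remap the field on their side with a Biber source map. This only works with the Biber backend. The file name `references.bib` is a placeholder, and the regular expression assumes the `series` field is spelled exactly as in the entry above.

```latex
\documentclass{article}
\usepackage[backend=biber]{biblatex}
\addbibresource{references.bib} % hypothetical file holding the PMLR entries
% Sketch: for @inproceedings entries in the PMLR series only, move the
% value of `volume` into `number` before BibLaTeX formats the entry.
\DeclareSourcemap{
  \maps[datatype=bibtex]{
    \map{
      \pertype{inproceedings}
      % `final` ends this map for entries whose series does not match.
      \step[fieldsource=series, match=\regexp{Proceedings\s+of\s+Machine\s+Learning\s+Research}, final]
      % Move the field: `volume` is dropped and `number` takes its value.
      \step[fieldsource=volume, fieldtarget=number]
    }
  }
}
\begin{document}
\cite{pmlr-v196-hacker22a}
\printbibliography
\end{document}
```

Since the map does not set `overwrite`, an entry that already has a `number` field should be left as it is. The .bib file on disk is not modified; the remapping happens only while Biber processes the data.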