retorquere / zotero-better-bibtex

Make Zotero effective for us LaTeX holdouts
https://retorque.re/zotero-better-bibtex/
MIT License

System, language and other not exported on software type #907

RicardoGuzmanVelasco closed this issue 3 years ago

RicardoGuzmanVelasco commented 6 years ago

I have the latest Zotero version and it works without problems, but some fields are not being exported. The most critical ones are in the software item type, where system and programming language aren't exported. I'm exporting with BibLaTeX (Better BibLaTeX in fact, but neither of them exports these fields), and I have not added these fields to the export filter in the preferences. Other fields of the software type, such as location, are exported normally, I think.

As for system, I have no idea how to export it in BibLaTeX format.

As for programming language, I read that language is only mapped to genre for computerProgram, but the genre field isn't being exported either.

P.S.: sorry for the bad writing (I'm not a native English speaker).

retorquere commented 6 years ago

Don't worry about the English, it's perfectly clear what you are requesting. Do you have a suggestion for which fields to export these to? @njbart, any opinion on the matter?

RicardoGuzmanVelasco commented 6 years ago

Maybe system could be exported to location (I think location is a useless field for software, or at least I can't figure out what function it could have), but the problem is that Zotero has a location field too, so it would generate conflicts if both of them are non-empty.

What about exporting programming language to language itself? Could it become a source of conflicts for BibLaTeX?

Besides, I think programming language and especially system are list properties more than fields, but unfortunately Zotero maps them as simple fields. That is understandable for programming language, as it is related to the "language" base field, but for system... (which, besides, has no use in bib exporting, so...). It is a common problem in Zotero – see the publisher field, which the .bib format handles as a list.

So it is difficult for me (a noob in the LaTeX world) to figure out which mapping would be a good fit!

retorquere commented 6 years ago

I don't know much more about it either. It's pretty easy to experiment with though in a postscript, something like

if (Translator.BetterBibLaTeX && this.item.itemType === 'computerProgram') {
   // set the 'language' field to the programming language
   this.add({ name: 'language', value: this.item.programmingLanguage })
}

or

if (Translator.BetterBibLaTeX && this.item.itemType === 'computerProgram') {
   // add programminglanguage and system to the keywords
   this.add({ name: 'keywords', replace: true, value: this.item.tags.concat([this.item.programmingLanguage, this.item.system ]) })
}

I have no idea how the respective bib(la)tex processors would react to having the language field set this way (which is why I hailed @njbart; if anyone knows, it's him). You would have to experiment. But before I make field mappings fixed for everyone, I'd want feedback from @njbart.

retorquere commented 6 years ago

I'm not going to default to exporting the system field to location; the biblatex manual says:

The place(s) of publication, i. e., the location of the publisher or institution, depending on the entry type.

This is clearly not what system means, and the place field of Zotero is the obvious counterpart to location. I know fields sometimes get abused just so their contents show up in the bibliography, but what I do in the default mapping affects everyone, so I don't want to introduce clear semantic mismatches. This is precisely what I added the postscript facility for, so these workarounds could be done without affecting everyone.

I'm still interested in adding some kind of mapping for these fields, but it can't be location.

RicardoGuzmanVelasco commented 6 years ago

The first postscript works well. The second one doesn't, in my opinion, because it is difficult to handle keywords that relate specifically to system or language. I'm revisiting the documentation and the fields... I don't know either where the system field would fit “lawfully”.

One question (probably a stupid noob question): why not export it as a “system” field itself?

retorquere commented 6 years ago

It's a legitimate question, but when there is no documented (or in the case of bibtex, commonly used) field, I don't want to make this the default for everyone; it opens the floodgates to make every non-standard field request by the first user to ask for it the default for everyone, and when that inevitably clashes somewhere (one wants this, other wants that), I have to add another preference to steer the behavior... not a road I want to go down (anymore). There'd be a better case for mapping these to customa - customf fields.

RicardoGuzmanVelasco commented 6 years ago

The fact is that I cannot understand what Zotero's system field is meant to be, given that it is not exported to any field. Difficult business here.

njbart commented 6 years ago

So, the question is, what to do with Zotero’s GUI fields “System” and “Language” when encountered in the “Computer Program” item type, right?

I’m afraid there aren’t any particularly attractive matches: “System” → type seems more or less ok, but for “[Programming] Language”, I’m at a loss. In a pinch, titleaddon or addendum might work, sort of. I’d agree that “System” shouldn’t be mapped to location, and neither should “[Programming] Language” be mapped to language (which, I feel, is best reserved for indicating the natural language(s) a work is written in – info that could very well appear independently of any details about programming language(s)).

retorquere commented 6 years ago

It must serve some purpose; the Google Play and PyPI imports set the system field.

retorquere commented 6 years ago

@njbart, yes, correct, that is the question.

retorquere commented 6 years ago

I know Zotero maps the language field to "genre" for CSL, so I suppose that gets used in some CSL styles.

retorquere commented 6 years ago

(but genre gets no mention in the biblatex manual)

retorquere commented 6 years ago

Internally, Zotero computer program references have no language field, only a programmingLanguage field (which is changed to genre if it is handed to citeproc); it's the only reference type in Zotero that doesn't have a language (indeed meant to be natural language) field. That alone should make clear it's not a good idea to put it in the biblatex language field -- it has a very different meaning despite its similar-sounding name.

retorquere commented 6 years ago

@RicardoGuzmanVelasco do you know of any article that cites software including OS and programming language?

RicardoGuzmanVelasco commented 6 years ago

No, I don't.

retorquere commented 6 years ago

I will need to see a style that actually uses these fields. If there are no documented fields for this, and there's no style that uses them, the choice of fields would be fairly arbitrary, and that approach has gotten me in trouble in the past, where down the line I found conflicting needs.

retorquere commented 6 years ago

If there's no actual use-case and there's no documentation I can't really do anything.

microniko commented 4 years ago

Hello, I hope I can help.

Example with APA (copied from Zotero):

Rampin, R., Steeves, V., & DeMott, S. (2019). Taguette (Version 0.9) [Python, GNU/Linux]. https://doi.org/10.5281/zenodo.3551632

But language and system are not in the BibTeX export generated with Better BibTeX:

@software{rampinTaguette2019,
  title = {Taguette},
  author = {Rampin, Remi and Steeves, Vicky and DeMott, Sarah},
  date = {2019-11-23},
  location = {{New York}},
  url = {https://zenodo.org/record/3551632},
  urldate = {2020-04-28},
  abstract = {A spin […].},
  file = {…},
  keywords = {document,highlights,notes,qual,qualitative,qualitative research,research,tagging,tags,text-analysis},
  organization = {{Zenodo}},
  series = {Collection},
  version = {0.9}
}

In the BibTeX documentation, @software seems to be an alias for @misc. Indeed, there are no system and language fields in the BibTeX type @misc. But there is one field which is not used: type. Is it possible to use this field for system or language?

I tried it with pandoc-citeproc (with type field) and it worked.

Thx

retorquere commented 4 years ago

I mean possible, yes, but how pandoc handles such things is an unsure metric for this; if you feed pandoc bibtex, it translates it back to CSL internally before it generates a bibliography, and I don't know what mapping pandoc uses, or on what rationale it was based. How other bibtex-consuming programs would react would still be undetermined.

If there's a bibtex style (so a bst style) that uses this we'd know more.

rdicosmo commented 4 years ago

There is now a biblatex style extension that defines entries specifically for software, see issue #1527. This can serve as a basis for the Zotero import/export.

retorquere commented 4 years ago

I'm not sure how style extensions work, so I don't know whether it's reasonable to expect that most users could have this available.

Can either one of you get me a debug-log by right-clicking one or a few items that you want to see transformed to @software? I need to add these as test cases.

microniko commented 4 years ago

@retorquere I did, ReportID: 2NR2NQLG-euc (I don't know where I can find this in GitHub)

retorquere commented 4 years ago

The ID suffices, thanks -- the debug logs aren't visible on GH.

As per #1527, there are 4 entry types for software now -- which would you expect to see?

microniko commented 4 years ago

The right entry is probably @software.

retorquere commented 4 years ago

Looking at the docs that @rdicosmo points to, I don't see candidates for fields for the programming language or system. Can you put together what you'd prefer @rampinTaguette2019 should look like in biblatex format?

microniko commented 4 years ago

You're right. These fields are in Zotero. I don't know what the developers used to decide which fields would be present in Zotero (probably the CSL specification). Unfortunately, there is no match between BibLaTeX and Zotero :\ (Cc: @rdicosmo) Can you merge these 2 fields and put them in the note field? Perhaps this is the least bad idea.

retorquere commented 4 years ago

I can help you do that with a postscript (which can add it to the note field or any separate fields you could think of), but I'm not baking that into BBT as the default, sorry.
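
A minimal sketch of the note-field workaround mentioned above, assuming the item exposes `programmingLanguage` and `system` as in the earlier postscript snippets. `buildSoftwareNote` is a hypothetical helper, not part of BBT, and the sample item only mimics the Taguette example from this thread.

```javascript
// Hypothetical helper (not part of BBT): merge the two Zotero fields into one note string.
function buildSoftwareNote(item) {
  const parts = [item.programmingLanguage, item.system]
    .map((value) => (value || '').trim())   // treat missing fields as empty
    .filter((value) => value.length > 0);   // drop empty values
  return parts.length > 0 ? parts.join(', ') : null;
}

// Sample item shaped like the Taguette example in this thread.
const taguette = {
  itemType: 'computerProgram',
  programmingLanguage: 'Python',
  system: 'GNU/Linux',
};

console.log(buildSoftwareNote(taguette)); // prints "Python, GNU/Linux"
```

Inside a postscript, the result could presumably be passed to the same call used in the earlier snippets, for example `this.add({ name: 'note', value: note })` when `note` is not null. Whether an existing note should be kept or replaced is a choice left to the user.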

retorquere commented 4 years ago

I must say I'm surprised these fields don't appear in those new @software types. You may want to contact those developers.

microniko commented 4 years ago

I can help you do that with a postscript (which can add it to the note field or any separate fields you could think of), but I'm not baking that into BBT as the default, sorry.

Thanks, but I did it manually ;-)

I must say I'm surprised these fields don't appear in those new @software types. You may want to contact those developers.

I agree. I cannot create an issue on the Inria GitLab. Perhaps @rdicosmo can do it ;-)

retorquere commented 4 years ago

You can, but you need to create an account: https://gitlab-account.inria.fr/

microniko commented 4 years ago

You can, but you need to create an account: https://gitlab-account.inria.fr/

Oops, yes!

rdicosmo commented 4 years ago

Thanks for bringing this issue to my attention, I'll try to provide input here as it may be of general interest.

The Inria/Software Heritage working group that led to biblatex-software has discussed in depth what we needed for software citation, and concluded that we do not need to represent in a BibTeX entry all the possible metadata for software: the idea is to keep at hand everything needed for producing a proper citation in all contexts of interest, and rely on a link to an external source for the rest. As an example, the affiliations of the authors are an important part of the software metadata, like they are for article metadata, but we do not want to show them in a citation, so we do not store them in a BibTeX record. You can find the full position statement at https://gitlab.inria.fr/gt-sw-citation/bibtex-sw-entry/-/blob/master/README.md

Considering the above, it was decided to include the license in the list of fields that will be rendered in the bibliographies produced by biblatex-software, but not other information such as programming language, operating system, dependencies, category etc.

Notice though that the BibTeX format is extensible, so it is definitely possible to add such fields in a BibTeX file, independently of whether biblatex-software renders them or not.

The key issue, as @retorquere rightly points out, is to make sure there is consensus on the names and meaning of these fields, to avoid problems later on.

If there is a clearly motivated need for adding support for rendering some other information, I'll be more than happy to run this through the expert group, and include it in a next version of biblatex-software once a consensus emerges.

retorquere commented 3 years ago

Has such a consensus been reached in the interim?

rdicosmo commented 3 years ago

I did not see any news on this, but I did a quick scan of all .bst and .bbx files included in TeX Live for "programming" and "system", and the only relevant result is the entry "systemreq" used in the (Russian) biblatex-gost package.

Hence it seems there is no existing popular bibtex style taking these entities into account.

If one were to introduce new fields for this use case, one could piggyback on the terms used in the CodeMeta standard: https://codemeta.github.io/terms/, notably:

system --> runtimePlatform
language --> programmingLanguage

retorquere commented 3 years ago

I'm ok with doing this in the interim, but it doesn't look like codemeta targets bib(la)tex really, but I'd output them all-lowercase, as the skipfields feature relies on it.
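
A minimal sketch of the proposal at this point in the thread: CodeMeta terms used as field names, lowercased before output. `CODEMETA_TERMS` and `codemetaFields` are hypothetical names for illustration, and the item shape follows the earlier postscript snippets; this is not how BBT implements anything.

```javascript
// Proposed mapping from this thread: Zotero field -> CodeMeta term.
const CODEMETA_TERMS = {
  system: 'runtimePlatform',
  programmingLanguage: 'programmingLanguage',
};

// Hypothetical helper: return [fieldName, value] pairs with all-lowercase field names,
// skipping Zotero fields that are empty or missing.
function codemetaFields(item) {
  return Object.entries(CODEMETA_TERMS)
    .filter(([zoteroField]) => item[zoteroField])
    .map(([zoteroField, term]) => [term.toLowerCase(), item[zoteroField]]);
}

const sample = { programmingLanguage: 'Python', system: 'GNU/Linux' };
for (const [name, value] of codemetaFields(sample)) {
  console.log(`${name} = {${value}}`);
}
```

Each pair could then be handed to `this.add({ name, value })` in a postscript, in the style of the snippets earlier in the thread.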

rdicosmo commented 3 years ago

but I'd output them all-lowercase, as the skipfields feature relies on it.

Sure!

retorquere commented 3 years ago

I'd need a new support log to get a sample to translate.

rdicosmo commented 3 years ago

I'd need a new support log to get a sample to translate.

@microniko : would you be willing to do that?

microniko commented 3 years ago

I'd need a new support log to get a sample to translate.

@microniko : would you be willing to do that?

I will be happy to help, but what do you mean by support log? How can I do that?

retorquere commented 3 years ago

Assuming you have BBT installed, right-click the items you'd want to see exported differently, and select BBT-send BBT support log from the popup menu.

microniko commented 3 years ago

Sorry for the delay… I had problems with Zotero…

The support log # is: 6HUGQWIY-refs-euc

retorquere commented 3 years ago

@njbart, any opinion on this? Is CodeMeta a better pick than CSL equivalents (if those exist)?

njbart commented 3 years ago

Is CodeMeta a better pick than CSL equivalents (if those exist)?

I for one wouldn’t recommend introducing terms from yet another standard.

If no sensible way of mapping something to a documented bib(la)tex field can be found, I’m fine with the idea of repurposing CSL variable names as nonstandard bib(la)tex field names (possibly with a csl_ prefix), without which users wouldn’t get the chance to ultimately remap these to anything they see fit – but I would draw the line at using names that stem from neither bib(la)tex nor CSL.

retorquere commented 3 years ago

I would agree on that.

Can bib(la)tex entries have comments between the fields? Can dashes legally be in field names?

njbart commented 3 years ago

Can bib(la)tex entries have comments between the fields?

I’m not sure. Comments between entries seem to be ok. (Doesn’t BBT use comments already for its [sort of] health report?)

Can dashes legally be in field names?

Just tried both dashes and underscores (with biblatex only), seems to work ok. (Including using \DeclareSourcemap to remap, e.g. tit_le to title.)
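
A minimal sketch of the naming scheme under discussion: a CSL variable name turned into a nonstandard field name with a `csl` prefix, joined by either an underscore or a dash. `cslFieldName` is a hypothetical helper; `genre` is used because, per the comments above, Zotero hands programmingLanguage to citeproc as genre. The lowercasing is an assumption carried over from the skipfields remark.

```javascript
// Hypothetical helper: build a nonstandard bib(la)tex field name from a CSL variable name.
function cslFieldName(cslVariable, separator = '_') {
  return `csl${separator}${cslVariable}`.toLowerCase();
}

console.log(cslFieldName('genre'));      // prints "csl_genre"
console.log(cslFieldName('genre', '-')); // prints "csl-genre"
```

Keeping the separator as a parameter leaves the underscore-versus-dash question open.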

retorquere commented 3 years ago

Can bib(la)tex entries have comments between the fields?

I’m not sure. Comments between entries seem to be ok. (Doesn’t BBT use comments already for its [sort of] health report?)

Yep; technically it's not even necessary to add the % character at the start of the line as simply anything at all is allowed between entries, but I'm sure there are naive parsers that expect the % signs as comments there. Between the lines I'm not sure; actual parsers may be lenient enough to allow it, but I think my own parser doesn't.

Can dashes legally be in field names?

Just tried both dashes and underscores (with biblatex only), seems to work ok. (Including using \DeclareSourcemap to remap, e.g. tit_le to title.)

So underscores rather than dashes is the preferred choice?

rdicosmo commented 3 years ago

Just tried both dashes and underscores (with biblatex only), seems to work ok. (Including using \DeclareSourcemap to remap, e.g. tit_le to title.)

So underscores rather than dashes is the preferred choice?

I would avoid underscores if possible: in (La)TeX they require math mode ($ or $$), which is an annoyance when writing documentation with examples taken from a bib file.

retorquere commented 3 years ago

I would avoid underscores if possible: in (La)TeX they require math mode ($ or $$), which is an annoyance when writing documentation with examples taken from a bib file.

Those should really be put in a verbatim environment.

njbart commented 3 years ago

Just another thought: Instead of outputting possibly problematic bib(la)tex field names, would it be possible for BBT to generate bib(la)tex output from certain Zotero GUI fields only if users actively request, via a BBT postscript, mapping from whatever name BBT is using for these internally (whether GUI field names or CSL variable names) to whatever users need in the context of the particular bib(la)tex package they are using?

The default would be for BBT not to output anything from such Zotero GUI fields, but it could possibly add something to its health report included in the bib(la)tex output, pointing out that certain Zotero GUI fields were found to be non-empty, but that actual output from these can only be configured by using a BBT postscript.

retorquere commented 3 years ago

Just another thought: Instead of outputting possibly problematic bib(la)tex field names, would it be possible for BBT to generate bib(la)tex output from certain Zotero GUI fields only if users actively request, via a BBT postscript, mapping from whatever name BBT is using for these internally (whether GUI field names or CSL variable names) to whatever users need in the context of the particular bib(la)tex package they are using?

That's mostly the current state of things. The build in #1799 reports Zotero fields in the QR that seem to be unused. A postscript can trivially add them. Let's say the QR outputs

% ? Unused url: https://www.nature.com/articles/sdata201618
% ? Unused rights: 2016 The Author(s)
% ? Unused libraryCatalog: www.nature.com

then a postscript could do

if (Translator.BetterBibTeX) {
  reference.add({name: 'somefunkyfield', value: item.libraryCatalog})
}
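
A minimal sketch of how the "Unused" lines could be collected to see which fields a postscript might add. It assumes the line format shown in the example above (`% ? Unused <field>: <value>`), which may differ in an actual build; `parseUnusedFields` is a hypothetical helper, not a BBT function.

```javascript
// Hypothetical helper: collect "% ? Unused <field>: <value>" comment lines into an object.
function parseUnusedFields(text) {
  const unused = {};
  for (const line of text.split('\n')) {
    const match = /^% \? Unused (\w+): (.*)$/.exec(line.trim());
    if (match) unused[match[1]] = match[2];
  }
  return unused;
}

const report = [
  '% ? Unused url: https://www.nature.com/articles/sdata201618',
  '% ? Unused rights: 2016 The Author(s)',
  '% ? Unused libraryCatalog: www.nature.com',
].join('\n');

console.log(Object.keys(parseUnusedFields(report))); // prints [ 'url', 'rights', 'libraryCatalog' ]
```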