oscar-system / Oscar.jl

A comprehensive open source computer algebra system for computations in algebra, geometry, and number theory.
https://www.oscar-system.org

bibtool and bibliography #750

Open ThomasBreuer opened 2 years ago

ThomasBreuer commented 2 years ago

(I am sorry that apparently I had not commented on the final version of #642.)

It is fine to normalize the references. However, I think it would be better to keep the information from the MathSciNet version of the data in the source file. If we want to delete superfluous fields from the file that is processed by Documenter.jl, we can generate a different file with bibtool. If this solution is not acceptable, then we could at least keep the mrnumber field, because this allows one to recover the MathSciNet information.

Note that forthcoming versions of Documenter.jl might use more information; for example, it would be reasonable to turn the mrnumber value automatically into a link to the review in question (as GAPDoc does for its bibliographies). Besides that, I find .bib files that were created from MathSciNet contents useful for copying individual entries, independently of their use for building the Oscar documentation.

Which variant do you prefer? I can create a pull request for the changes.

thofma commented 2 years ago

The MathSciNet database is not really a good example of consistency, so I usually like tweaking the items. But I have no strong feelings about this. Regarding keeping the mrnumber field, I have no objection.

ThomasBreuer commented 2 years ago

@thofma Concerning consistency, I guess that you mean the LaTeX markup in the titles and in the authors' names; for the latter, I think that switching to Unicode would be possible and would yield a big improvement.

thofma commented 2 years ago

No, I meant things like Breuer, T. vs Breuer, Thomas. Does vanilla LaTeX support Unicode in source files? Recently we had to remove \operatorname, because it is not part of vanilla LaTeX.

ThomasBreuer commented 2 years ago

What I was thinking of is Unicode source data, which can easily be turned into LaTeX-readable data if necessary but which can also be processed by other tools, for example in order to create Markdown or HTML bibliographies. For example, the GAP bibliography has been created from MathSciNet data by some heuristic translations from LaTeX to Unicode, and the HTML version of it has been created from this source. The point is that a LaTeX document can be understood well only by LaTeX; automatically creating other formats from it is usually not easy.
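
To illustrate the direction from Unicode to LaTeX, here is a minimal Julia sketch. The table `UNICODE_TO_LATEX` and the function `unicode_to_latex` are hypothetical names, the table covers only a handful of letters, and the input is assumed to use precomposed (NFC) characters; it is not the translation used for the GAP bibliography.

```julia
# Hypothetical mapping from a few Unicode letters to LaTeX markup.
# A real table would cover many more characters.
const UNICODE_TO_LATEX = Dict(
    'ä' => "{\\\"a}",
    'ö' => "{\\\"o}",
    'ü' => "{\\\"u}",
    'é' => "{\\'e}",
    'ß' => "{\\ss}",
)

# Replace every mapped character by its LaTeX form and keep all others.
# Assumes precomposed (NFC) input, so that 'ü' is a single character.
function unicode_to_latex(s::AbstractString)
    return join(get(UNICODE_TO_LATEX, c, string(c)) for c in s)
end

println(unicode_to_latex("Müller"))
```

The reverse direction needs heuristics because LaTeX admits many spellings of the same accent, which is why keeping Unicode as the source is the easier starting point.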

thofma commented 2 years ago

Sorry, but I do not understand what the variants refer to in the original post and what the workflow is supposed to be for adding new items (which do not necessarily have an mrnumber). Can you clarify?

ThomasBreuer commented 2 years ago

The first variant would be to have a source file that contains the "full" entries, which are copied from MathSciNet if possible (and perhaps corrected if necessary), and which are just entered by hand if there is no MathSciNet entry (yet). Adding new items means editing this file. (I am not proposing to automatically fetch items from MathSciNet.) Then bibtool is used to generate a normalized and stripped version from this source file; the output is used for processing the Oscar documentation.

The second variant would be to let bibtool rewrite the source file after entering new items, as is currently done, but to change the configuration such that available mrnumbers are kept. This way, it is at least possible to automatically recover the "full" entries if needed.
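
The stripping step common to both variants can be pictured with a small Julia sketch. This is not bibtool and not its configuration; `KEPT_FIELDS` and `strip_entry` are hypothetical names, the list of kept fields is an assumption, and the sample entry contains made-up values.

```julia
# Fields that survive the stripping step; this list is an assumption.
# Note that "mrnumber" is kept so the full entry can be recovered later.
const KEPT_FIELDS = ["author", "title", "journal", "year", "doi", "mrnumber"]

# `entry` maps lowercase BibTeX field names to their values.
# Returns a new Dict that contains only the kept fields.
strip_entry(entry::Dict{String,String}) =
    filter(kv -> first(kv) in KEPT_FIELDS, entry)

# Made-up sample entry in the style of a "full" record.
full = Dict(
    "author"   => "Doe, Jane",
    "title"    => "An example title",
    "year"     => "2000",
    "mrclass"  => "00A00",
    "mrnumber" => "1234567",
)

stripped = strip_entry(full)
println(sort(collect(keys(stripped))))
```

In the first variant the stripped result is written to a separate file and `full` stays untouched; in the second variant the stripped result replaces the source.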

thofma commented 2 years ago

OK, thanks for the summary. Both are fine with me. So let's hear what the others think.

fingolfin commented 2 years ago

First off: I think it's worth discussing this, but it's really a low priority, so I don't want to spend more of my time working on this than I already did; others are free to improve upon it, though.

Personally I don't see a benefit of keeping the "original" MathSciNet items, or their formatting. Nor would I spend a lot of time on automating translation of titles -- if we see a problem in a title or elsewhere, we can easily fix it manually. So far this has scaled well, and it worked for me even in surveys with 100+ reference items, so I think we would probably spend more time trying to automate this than we need to just go in there and fix it. E.g. looking at https://oscar-system.github.io/Oscar.jl/dev/references/ I see Chapman \& Hall/CRC which is easy to fix manually, and likewise for {Toric varieties} (which also seems to miss a DOI / DOI URL).

Retaining the mrnumber is fine by me. My one concern with MathSciNet and linking to it is that it is not open and accessible to everyone; in fact, most people cannot access it. If we want to do this, how about linking to https://zbmath.org instead or in addition? That said, MathSciNet seems to have the ability in many cases to link to the actual paper/book, which zbMATH doesn't seem to have (or at least I couldn't find it). But then again, we usually have this ability already via the DOI, and indeed, the title of a bibitem usually is a link to the DOI URL.

So, I'd be OK with retaining the mrnumber, but then I'd also try to retain/include the Zbl (e.g. Zbl = {0922.20003} for Peter Cameron's book on "Permutation Groups"). Then someone can make a feature request or PR for https://github.com/ali-ramadhan/DocumenterCitations.jl to output something like MathSciNet review or Zentralblatt review or both at the end of a given bib item, assuming it has either of these. Or perhaps we can piggyback on that package and insert it ourselves, dunno.

fingolfin commented 2 years ago

For a quick and dirty solution to print the MathSciNet/Zentralblatt/whatever else data in the reference, we could just do the following brute force hack: load DocumenterCitations.jl, then overwrite its Selectors.runner method, which currently looks like this (I think it's clear how and where to insert the code for printing the additional data):

```julia
function Selectors.runner(::Type{BibliographyBlock}, x, page, doc)
    @info "Expanding bibliography."
    raw_bib = "<dl>"
    for (id, entry) in doc.plugins[CitationBibliography].bib
        @info "Expanding bibliography entry: $id."

        # Add anchor that citations can link to from anywhere in the docs.
        Anchors.add!(doc.internal.headers, entry, entry.id, page.build)

        authors = xnames(entry) |> tex2unicode
        link = xlink(entry)
        title = xtitle(entry) |> tex2unicode
        published_in = xin(entry) |> tex2unicode

        raw_bib *= """<dt>$id</dt>
        <dd>
          <div id="$id">$authors, $(linkify(title, link)), $published_in</div>
        </dd>"""
    end
    raw_bib *= "\n</dl>"

    page.mapping[x] = Documents.RawNode(:html, raw_bib)
end
```
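
The additional data could be produced by a small helper along the following lines. This is only a sketch: `review_links` and the `fields` dictionary are hypothetical and not part of DocumenterCitations.jl, the field names `mrnumber` and `zbl` are assumed to be stored in lowercase, and the two URL patterns are assumptions about how MathSciNet and zbMATH address individual reviews.

```julia
# Build HTML links to reviews from optional "mrnumber" and "zbl" fields.
# `fields` is a hypothetical Dict of lowercase BibTeX field names to values;
# how to obtain it from `entry` is not shown here.
function review_links(fields::Dict{String,String})
    links = String[]
    if haskey(fields, "mrnumber")
        # The field may carry extra text around the number; keep the first digit run.
        m = match(r"\d+", fields["mrnumber"])
        if m !== nothing
            # Assumed URL pattern for a MathSciNet review.
            url = "https://mathscinet.ams.org/mathscinet-getitem?mr=" * m.match
            push!(links, "<a href=\"$url\">MathSciNet review</a>")
        end
    end
    if haskey(fields, "zbl")
        # Assumed URL pattern for a zbMATH review.
        url = "https://zbmath.org/?q=an:" * fields["zbl"]
        push!(links, "<a href=\"$url\">Zentralblatt review</a>")
    end
    return join(links, ", ")
end

println(review_links(Dict("zbl" => "0922.20003")))
```

Inside the loop of the overwritten method, the returned string could be appended after `$published_in` whenever it is non-empty.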

In the long run I hope they'll add support for custom bib styles, see https://github.com/ali-ramadhan/DocumenterCitations.jl/issues/22