bastistician closed this issue 4 years ago
Oh I see. I think I misunderstood you earlier regarding the description. Right now the title is pulled from the BibTeX entry in the CRAN citation file. What you are talking about is following the GitHub URL and reading from the DESCRIPTION file found in the package contents. Correct?
So for tidyverse we would be reading https://github.com/tidyverse/tidyverse/blob/master/DESCRIPTION to get "tidyverse: Easily Install and Load the 'Tidyverse'".
The problem is that you cannot reliably pull the name and title of a CRAN package from its citation page. But you can easily pull these metadata from the package's DESCRIPTION file, which is stored on CRAN, for example at https://CRAN.R-project.org/web/packages/forecast/DESCRIPTION for the forecast package. So no need to resort to a GitHub repo here.
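To illustrate the point above, here is a minimal Python sketch of reading the Package and Title fields from DESCRIPTION text. It assumes the file follows the usual "Field: value" layout with indented continuation lines; the function name `parse_description` and the sample text are hypothetical and are not taken from the citeas code base. Fetching the file from CRAN is left out so the example stays self-contained.

```python
import re


def parse_description(text: str) -> dict:
    """Parse 'Field: value' text; indented lines continue the previous field."""
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0] in " \t" and current is not None:
            # Continuation line: append to the field seen most recently.
            fields[current] = (fields[current] + " " + line.strip()).strip()
            continue
        match = re.match(r"^([A-Za-z0-9@/._-]+):\s*(.*)$", line)
        if match:
            current = match.group(1)
            fields[current] = match.group(2).strip()
    return fields


# Illustrative sample only; the line wrapping is an assumption.
sample = """Package: forecast
Title: Forecasting Functions for Time Series and Linear
    Models
"""

fields = parse_description(sample)
print(fields["Package"], "|", fields["Title"])
```

A real client would download the text from the CRAN URL mentioned above and pass it to the same function.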
Hi @bastistician. I made some updates today so the software now pulls the CRAN package name and title from the DESCRIPTION file.
One side effect is authors are being pulled from the DESCRIPTION file as well. It is a bit more complex to pull the authors from a separate source but it can be done. Please take a look and let me know what you think.
Your updates seem to fix the issue reported here. However, as you say, the solution is far from ideal because the generated citation is no longer appropriate in most cases:
If a package on CRAN has a CITATION file, citeas should really use that information to generate the desired citation. It seems that CITATION files are now ignored.
As a final resort, a citation can be generated from the DESCRIPTION file similar to what citation(package, auto = TRUE) returns in R. This should only consider full authors ("aut" role), not contributors ("ctb") or other parties.
BTW, for non-R parsers of R package DESCRIPTION files, it might be easier to extract the author list from the plain-text Author field rather than from the Authors@R field:
If there is an Authors@R field in the DESCRIPTION (true for most R packages nowadays), the additional Author field is an auto-generated, comma-separated list of authors in the form "firstname lastname [comma-separated roles] (optional comment)".
If the package authors did not use the Authors@R mechanism, the Author field contains whatever the package authors have specified there, hopefully a comma-separated list of author names as recommended... For example, the popular spatstat package does not have an Authors@R field in its DESCRIPTION (but note that spatstat has a CITATION file from which citeas should extract the citation).
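A rough Python sketch of the Author-field approach described above, assuming the auto-generated form "firstname lastname [roles] (optional comment)". The helper names and the sample string are hypothetical. Entries without a role bracket are kept, on the assumption that they come from a free-form Author field where everyone listed counts as an author.

```python
import re


def split_top_level(text: str) -> list:
    """Split on commas that are not inside [...] or (...)."""
    parts, buf, depth = [], [], 0
    for ch in text:
        if ch in "[(":
            depth += 1
        elif ch in "])" and depth > 0:
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(buf).strip())
            buf = []
        else:
            buf.append(ch)
    parts.append("".join(buf).strip())
    return [p for p in parts if p]


# name, optional [roles], optional (comment)
ENTRY = re.compile(
    r"^(?P<name>[^\[(]+?)\s*(?:\[(?P<roles>[^\]]*)\])?\s*(?:\(.*\))?$", re.S
)


def full_authors(author_field: str) -> list:
    """Return names with the 'aut' role (or with no role bracket at all)."""
    names = []
    for entry in split_top_level(author_field):
        m = ENTRY.match(entry)
        if not m:
            continue  # unrecognised entry; a real parser might log this
        roles = m.group("roles")
        if roles is None or "aut" in [r.strip() for r in roles.split(",")]:
            names.append(m.group("name").strip())
    return names


# Hypothetical Author field, not taken from any real package.
sample = ("Jane Doe [aut, cre] (Example University, Dept. of Statistics), "
          "John Smith [ctb], Alex Roe [aut]")
authors = full_authors(sample)
print(authors)
```

Tracking bracket depth matters because both the role list and the comment may themselves contain commas.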
Ok I will pull some of the other fields such as authors from the citation file. That's one way the software needs to be improved, in that it is sometimes better to combine sources rather than stop at the first valid source found.
Hello. I want to let you know that I reverted the software to pull primarily from the citation. Mixing the CITATION and DESCRIPTION data is a good idea, but it requires some larger changes to the code base, and there are some smaller things that need to be fixed first. I'll try to revisit this in a couple of months.
The "name" returned by the citeas API depends on the steps taken to find a proper citation. This means the name is sometimes derived from the title of a publication (BibTeX reference), sometimes from the DESCRIPTION file as "Package: Title" (and there are certainly other routes as well). I suggest that the "name" for CRAN packages always be of the latter form, similar to what the CRAN package web page uses as its heading.
Always returning a name of the form "Package: Title" would be much more consistent and also more reliable. For example, the name returned for knitr and forecast is "R" instead of "knitr: A General-Purpose Package for Dynamic Report Generation in R" and "forecast: Forecasting Functions for Time Series and Linear Models", respectively. Such issues could probably be fixed in the BibTeX parser. However, the title of the publication will almost always be different from the title which the authors have chosen for their package as given in the DESCRIPTION, which IMHO is the one to use as the "name".
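The suggested "Package: Title" form could be built along these lines. This is only a sketch; `cran_display_name` is a hypothetical helper, not part of the citeas API, and it assumes the two DESCRIPTION fields have already been extracted.

```python
def cran_display_name(package: str, title: str) -> str:
    """Build 'Package: Title', collapsing whitespace from wrapped Title lines."""
    clean_title = " ".join(title.split())
    return f"{package.strip()}: {clean_title}"


name = cran_display_name(
    "knitr", "A General-Purpose Package for Dynamic Report Generation in R"
)
print(name)
```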