The given initial N is parsed as `FALSE`: bibliography_entries()

smped commented 3 years ago

Hi there,

Thanks for making this package & templates available. Unfortunately, I've found a glitch.

When parsing a bibliography using bibliography_entries(), the function calls bib <- yaml::yaml.load(rmarkdown::pandoc_citeproc_convert(file, "yaml"))$references

The parsing with rmarkdown::pandoc_citeproc_convert() is fine, but if the author's given name is simply the initial N, this is parsed by yaml::yaml.load() as FALSE instead of the correct single character.

An example is given below:

y <- c("---", "nocite: \"[@*]\"", "references:", "- author:", "  - family: McInnes", 
"    given: N", "  - family: Barry", "    given: SC", "  container-title: Oncogene", 
"  id: mcinnes2012foxp3", "  issue: 8", "  issued: 2012", "  page: 1045-1054", 
"  publisher: Nature Publishing Group", "  title: FOXP3 and FOXP3-regulated microRNAs suppress SATB1 in breast", 
"    cancer cells", "  type: article-journal", "  volume: 31", 
"---", "")
yaml::yaml.load(y)$references[[1]]$author

I couldn't find a solution directly using the arguments to yaml.load(), but wondered if a more experienced user could. Alternatively, an error correction step may need to be added for this edge case.
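One avenue that may be worth trying, though it is not the fix adopted later in this thread: yaml.load() accepts a `handlers` argument, and overriding the `bool#yes` and `bool#no` handlers so that they return the raw text should keep `N` as a string. A minimal sketch, assuming the installed yaml version recognises those handler names; `keep_text` and `load_keeping_bool_text` are hypothetical helper names, not part of yaml or vitae:

```r
library(yaml)

# Hypothetical helper: return the scalar text unchanged instead of a logical
keep_text <- function(x) x

# Hypothetical wrapper: parse YAML but leave boolean-looking scalars as text
load_keeping_bool_text <- function(text) {
  yaml.load(
    text,
    handlers = list("bool#yes" = keep_text, "bool#no" = keep_text)
  )
}

# Cut-down version of the author block from the example above
doc <- paste(
  "author:",
  "  - family: McInnes",
  "    given: N",
  "  - family: Barry",
  "    given: SC",
  sep = "\n"
)

authors <- load_keeping_bool_text(doc)$author
given_names <- vapply(authors, function(a) a$given, character(1))
print(given_names)
```

The trade-off is that every boolean-looking scalar (yes, no, true, false and so on) stays as text, which may suit a bibliography but would not suit YAML that carries real logical fields.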

Thanks in advance

mitchelloharawild commented 3 years ago

MRE:

library(vitae)
#> 
#> Attaching package: 'vitae'
#> The following object is masked from 'package:stats':
#> 
#>     filter
y <- c("---", "nocite: \"[@*]\"", "references:", "- author:", "  - family: McInnes", 
       "    given: N", "  - family: Barry", "    given: SC", "  container-title: Oncogene", 
       "  id: mcinnes2012foxp3", "  issue: 8", "  issued: 2012", "  page: 1045-1054", 
       "  publisher: Nature Publishing Group", "  title: FOXP3 and FOXP3-regulated microRNAs suppress SATB1 in breast", 
       "    cancer cells", "  type: article-journal", "  volume: 31", 
       "---", "")
writeLines(y, x <- tempfile())
bibliography_entries(x)

/tmp/Rtmpto8tDY/file148d1473b3f9.yaml

Created on 2021-01-23 by the reprex package (v0.3.0)

mitchelloharawild commented 3 years ago

The solution here is to avoid using yaml.load() to read in the YAML-structured bibliography, as the bibliography's fields do not follow YAML's type-inference rules.

Instead, the bibliography should be read in directly as a list, with types that support the full CSL-JSON schema: https://github.com/citation-style-language/schema/blob/master/schemas/input/csl-data.json

To do this properly is unfortunately a lot of work, but it is something that I have begun working on. The amount of work may warrant splitting into a separate package for CSL-JSON reference management (much like RefManageR).

smped commented 3 years ago

Thanks so much for chasing this up & sorry it's turned out to be a bit of a curly one. I was imagining an lapply through the parsed data looking for any FALSE values & converting them back to N. Of course, that may have unintended consequences too. Parsing correctly will always be a better solution.

For my own CV, I ended up using scholar and writing my own print function for highlighting my name in the author list. Bit clunky, but it got the job done.

mitchelloharawild commented 3 years ago

Looking for logical values and reverting them has some issues. There are multiple YAML values which map to FALSE, such as n, no, false, etc., so the original text can't be recovered from the parsed value. The same can be said for TRUE.
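To illustrate why reverting is lossy, here is a small sketch using only yaml::yaml.load(). It checks three spellings that the parser treats as boolean false; the variable names are just for illustration:

```r
library(yaml)

# Three different spellings that the parser reads as boolean false
spellings <- c("no", "false", "off")

# Parse each spelling on its own; each result is a single logical value
parsed <- vapply(spellings, function(s) yaml.load(s), logical(1))

# Every spelling has collapsed to the same value, so the original text
# cannot be reconstructed from the parsed result alone
print(parsed)
print(all(!parsed))
```

Once parsed, a FALSE that came from an author's initial is indistinguishable from one that came from any of these spellings.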

Glad you found a solution that worked :) I think name highlighting would be a widely appreciated feature for bibliography_entries(), but I'm not sure how this can be done nicely.

mitchelloharawild commented 3 years ago

This should be fixed now. The new code may later be broken out into a separate package, as it achieves a lot of what is needed for https://github.com/ropensci/RefManageR/issues/72.

Thanks for your bug report!

mitchelloharawild / vitae

The given initial N is parsed as `FALSE`: bibliography_entries() #152