Right now, role = "p" is assigned to all abstract authors, but the programmes also list abstract presenters, which might be useful to produce better guesses of who actually attended each panel.
The names of abstract presenters are listed (as comma-separated values) in abstract_presenters, so finding if an abstract author was also listed as an abstract presenter is straightforward:
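# identify authors listed as actual presenters
library(tidyverse)
read_tsv("data/epsa-program.tsv") %>%
  # fixed() matches names literally, so characters like "." are not read as regex
  mutate(abstract_presenter = str_detect(abstract_presenters, fixed(full_name)))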
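One caveat with substring detection is that a short name can match inside a longer one. A possible variant, sketched here on toy data (the rows below are invented and only stand in for data/epsa-program.tsv), splits the comma-separated presenters and compares whole names instead:

```r
library(dplyr)
library(purrr)
library(stringr)
library(tibble)

# toy rows, hypothetical values; column names follow the snippet above
d <- tribble(
  ~full_name, ~abstract_presenters,
  "Ann Lee",  "Ann Lee, Bob Roy",
  "Bob Roy",  "Ann Lee",
  "Ann Lee",  "Jo Ann Lee"
)

d <- d %>%
  mutate(
    # split on commas, then trim the whitespace around each name
    presenters = map(str_split(abstract_presenters, ","), str_trim),
    # whole-name comparison: TRUE only if full_name is one of the presenters
    abstract_presenter = map2_lgl(full_name, presenters, ~ .x %in% .y)
  )
```

In the third row, str_detect would flag "Ann Lee" as a presenter because the string occurs inside "Jo Ann Lee", whereas the exact comparison does not.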
However, the author variable has been lightly edited in several cases, in the conference year repositories, to correct typos and de-duplicate some individuals. The list of fixes can be found on the wiki, and fixes to names are copied below.
The fixes are not numerous, but there are enough of them to derail the str_detect method offered above.
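One way around this might be to apply the same name fixes to the presenter names before comparing. The sketch below assumes the fixes can be written as a raw-to-corrected lookup; name_fixes, fix_name and the toy rows are all hypothetical, and the real fixes are the ones listed on the wiki:

```r
library(dplyr)
library(purrr)
library(stringr)
library(tibble)

# hypothetical fix, for illustration only: name = raw spelling, value = corrected
name_fixes <- c("Jon Smyth" = "John Smith")

# hypothetical helper: swap in the corrected spelling where one exists
fix_name <- function(x, fixes) {
  ifelse(x %in% names(fixes), unname(fixes[x]), x)
}

# toy rows, invented values; full_name is assumed to hold the corrected names
d <- tibble(
  full_name = c("John Smith", "Ann Lee"),
  abstract_presenters = c("Jon Smyth, Ann Lee", "Jon Smyth")
)

d <- d %>%
  mutate(
    # split the comma-separated presenters and trim whitespace
    presenters = map(str_split(abstract_presenters, ","), str_trim),
    # apply the same fixes that were applied to the author names
    presenters = map(presenters, fix_name, fixes = name_fixes),
    # compare whole names after fixing
    abstract_presenter = map2_lgl(full_name, presenters, ~ .x %in% .y)
  )
```

Without the fix step, "John Smith" would not be found among the raw presenter names in the first row.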
The best strategy would be to match the two variables before import, in each of the conference year repositories, but this might be difficult:
In 2019 and in 2023, we collect abstract_authors and abstract_presenters but then apparently ignore the first variable in favour of getting authors from participant lists instead. It might not be obvious how to identify presenters in these cases.
List of initial (pre-import) fixes to names
(Warning: the links below are sensitive to script renaming/renumbering...)
The fixes might affect matching author and abstract_presenters.