briatte / epsaconf

Data from EPSA conferences, 2019-2023
https://netconf-geoscimo.univ-tlse2.fr/project/

Identifying abstract presenters #9

Closed briatte closed 1 year ago

briatte commented 1 year ago

Right now, role = "p" is assigned to all abstract authors, but the programmes also list abstract presenters, which might be useful to produce better guesses of who actually attended each panel.

The names of abstract presenters are listed (as comma-separated values) in abstract_presenters, so finding if an abstract author was also listed as an abstract presenter is straightforward:

# identify authors listed as actual presenters
library(tidyverse)

read_tsv("data/epsa-program.tsv") %>% 
  # fixed() matches names literally rather than as regular expressions
  mutate(abstract_presenter = str_detect(abstract_presenters, fixed(full_name)))
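A minimal sketch of the check on toy data, assuming the column names used above (full_name, abstract_presenters). The rows are made up for illustration and do not come from the EPSA programmes. It also shows one caveat of substring detection: a name that is a prefix of another listed name will match, which an exact comparison against each comma-separated entry avoids.

```r
library(dplyr)
library(stringr)
library(tibble)

# hypothetical toy rows, not taken from data/epsa-program.tsv
program <- tribble(
  ~full_name, ~abstract_presenters,
  "Ann Lee",  "Ann Lee, Bob Roy",
  "Bob Roy",  "Ann Lee, Bob Roy",
  "Cy Dole",  "Ann Lee, Bob Roy",
  "Ann Le",   "Ann Lee"
)

program <- program %>%
  mutate(
    # substring match: "Ann Le" is found inside "Ann Lee"
    substring_match = str_detect(abstract_presenters, fixed(full_name)),
    # exact match: compare against each comma-separated presenter name
    exact_match = mapply(
      function(name, listed) name %in% listed,
      full_name,
      str_split(abstract_presenters, ",\\s*"),
      USE.NAMES = FALSE
    )
  )

print(program)
```

Whether the stricter exact match is worth the extra step depends on how often such prefix collisions occur in the real data, which this sketch does not establish.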

However, the author variable has been lightly edited in several cases, in the conference year repositories, to correct typos and de-duplicate some individuals. The list of fixes can be found on the wiki, and fixes to names are copied below.

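The fixes are not numerous, but there are enough of them to derail the str_detect method offered above: an author whose name was corrected will no longer match the uncorrected spelling left in abstract_presenters.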
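To illustrate how a name fix can break the detection, here is a rough sketch on made-up data. The names and the `fixes` lookup vector are hypothetical; the actual fixes are the ones listed on the wiki. One possible workaround, assumed here rather than taken from the repository, is to apply the same corrections to abstract_presenters before matching.

```r
library(dplyr)
library(stringr)
library(tibble)

# hypothetical fix: the author name was corrected, the presenter list was not
# (names of the vector are the old spellings, values are the corrected ones)
fixes <- c("Jon Smyth" = "John Smith")

# hypothetical toy rows
program <- tribble(
  ~full_name,   ~abstract_presenters,
  "John Smith", "Jon Smyth, Ann Lee",
  "Ann Lee",    "Jon Smyth, Ann Lee"
)

program <- program %>%
  mutate(
    # naive check fails for the corrected author
    naive = str_detect(abstract_presenters, fixed(full_name)),
    # apply the same corrections to the presenter list; the old spellings
    # are treated as regular expressions here, which is fine for plain names
    presenters_fixed = str_replace_all(abstract_presenters, fixes),
    patched = str_detect(presenters_fixed, fixed(full_name))
  )

print(program)
```

This only works if every pre-import fix is recorded in the lookup vector, so it is a stopgap rather than a substitute for matching the two variables before import.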

The best strategy would be to match the two variables before import, in each of the conference year repositories, but this might be difficult:

List of initial (pre-import) fixes to names

(Warning: the links below are sensitive to script renaming/renumbering...)

The fixes might affect matching author and abstract_presenters.

briatte commented 1 year ago

Closing, presenter now holds the required information.