RichardMN closed 3 years ago
Nice @RichardMN - this looks good to me and all seems sensible. For use downstream etc. it might be cool to provide a citation/source method (so `DataClass$citation()` and/or `DataClass$source()`) which, like the base R variant, returns a nicely formatted version of this info ready to be used in a plot, etc. That might be quite a bit more work (mostly thinking), so perhaps park it in another issue if not keen to deal with it here.
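To make the suggestion concrete, here is a minimal sketch of what a `source()` method could return. It is not the package's implementation: a plain list stands in for the data class, `new_data_class_sketch` is a hypothetical name, and the agency name and URL are made-up values. Only the field names `source_text` and `source_url` come from this PR.

```r
# Hypothetical stand-in for a data class: a plain list with a source() method.
# Only the field names source_text and source_url follow this PR; the rest is
# illustrative.
new_data_class_sketch <- function(source_text, source_url = NA_character_) {
  self <- list(source_text = source_text, source_url = source_url)
  # Return a one-line credit suitable for a plot caption.
  self$source <- function() {
    if (is.na(self$source_url) || !nzchar(self$source_url)) {
      return(paste0("Source: ", self$source_text))
    }
    paste0("Source: ", self$source_text, " (", self$source_url, ")")
  }
  self
}

example <- new_data_class_sketch(
  source_text = "Example Health Agency",   # made-up value
  source_url = "https://example.org/data"  # made-up value
)
caption <- example$source()
```

The resulting `caption` string could then be passed to whatever plotting function is in use, for example as a caption argument.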
I've pulled in an interim fix to #389 here as well as completing my first pass of providing source texts and urls for the country regional data classes. I would welcome anyone familiar with these datasets to provide a sanity check (and corrections where necessary) on these.
@Bisaloo - I cannot quite figure out where the French data comes from. We have pointers to CSV files and I cannot line them up with what is available on the data.gouv.fr website. I've not looked that deeply, but you may already know what it would take me half an hour of spelunking to find out.
I plan to try to do the other sources (ECDC, JRC, JHU, WHO) next. The `citation()` and `source()` functions are a good idea but may take a bit more thinking on how to apply them.
Question: should the field names be changed to something like `credit_text` and `credit_url` instead of `source_*`?
Totally agree that the helper functions are for another PR. For the field name I prefer source vs credit but no strong opinion.
I've now added source fields for the remaining data classes.
Question: Should we document that the ECDC data source terminated in late December?
For easier review of the list, below is a table of the current contents of `all_country_data`.
It seems that the workflow actions are no longer triggered or run on this PR? Or, at least, I cannot trigger them myself.
It's passed CMD-check on my system (macOS, R 4.1.0) at home but I can't test the other build environments. Or I could try running it on my branch, but I don't know whether the results would show here.
@seabbs or @joseph-palmer you might be able to get them to run?
Merging and will fix quickly on master.
This is an approach to adding a source text and source url field to each of our regional data sources. (Once I've done all the regional data sources I may go back and do the other sources as well.) This would be a fix for #375.
I'm leaving this as a draft for now because it's incomplete but would welcome advice (@joseph-palmer ?) on this approach even before I've done it for all the datasets.
My goal is that a user could get the `source_text` or `source_url` programmatically, so that if they are using data prepared by someone it is straightforward to give credit (and possibly a link). This could even be done in markdown with something like `paste0("[", data$source_text, "](", data$source_url, ")")` (and some wit to handle the case where there may not be a URL). In general I think we should always be able to provide a `source_text` and usually be able to give a URL. Looking at the first few, it's clear that some of these datasets don't appear to have straightforward "entry points", but we can provide something people can look at (or click on) without having to pick through our code.
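As a rough sketch of the markdown idea, including the case where no URL is available: `credit_markdown` is a hypothetical helper name, not something in the package, and it assumes a missing URL is represented as `NULL`, `NA`, or an empty string.

```r
# Hypothetical helper: build a markdown credit from the two source fields.
# Falls back to plain text when no URL is available (NULL, NA or "").
credit_markdown <- function(source_text, source_url = NA_character_) {
  has_url <- !is.null(source_url) && !is.na(source_url) && nzchar(source_url)
  if (has_url) {
    paste0("[", source_text, "](", source_url, ")")
  } else {
    source_text
  }
}

# Made-up values, for illustration only.
with_link <- credit_markdown("Example Health Agency", "https://example.org/data")
without_link <- credit_markdown("Example Health Agency")
```

Keeping the fallback inside one helper means callers never have to check for a missing URL themselves.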