Closed sangeetabhatia03 closed 1 month ago
Preliminary function which obviously needs some work but might be useful as a starting point for folks
get_SI <- function(df) {
if(length(df$parameter_type) == 0) {
stop("df should be the parameter dataframe outputted from load_epidata()")
} else {
index <- which(df$parameter_type == "Human delay - serial interval")
si <- df[index,]
}
return(si)
}
Working on this now
Notes from LSTHM meeting related to this issue. Potentially we may want to split this into several issues
And the second slide:
Working on this now on the branch feature-filter-data How far did you get with this @sangeetabhatia03 ?
This is related to #59
@ruthmccabe and I are now working on writing a get_serial_interval()
function and other get_*()
functions. This will wrap get_parameter()
and the new get_key_columns()
function that has been added on the get_key_cols
branch.
So these functions will subset both the rows (using get_parameter()
) and the columns (using get_key_columns()
).
The new function we are writing will be pushed to the get_specific_params
branch which branches from the HEAD of get_key_cols
. We will likely make these changes as separate pull requests so it's easier to review the new additions.
@sangeetabhatia03 and @ginacuomo please let us know what you think about these developments, we can tag you in the pull requests once they're open.
Sounds good to me @joshwlambert and @ruthmccabe
Sounds great to me @joshwlambert @ruthmccabe -- thank you both, this has been on my list for ages. happy to contribute however helpful especially once I'm back w/c 17 June
@joshwlambert I've pushed what I've done so far and wondering if you can please take a look at some point?
@ruthmccabe I've had a look and everything looks good. Going back to Sangeeta's point in the first message of this issue about whether these functions should read directly from the package data or whether the user should have to call load_epidata()
first, this might be a good opportunity to write the functions so they can do both. Let me know if this would be beneficial and I can sketch out the most efficient way to implement this.
@sangeetabhatia03 the get_specific_params
branches from get_key_cols
(which is open in PR #97). Therefore, it may be best to merge #97 first and then rebase the get_specific_params
branch. Merge commits may also works, but just wanted to raise this to try and prevent any conflicts when merging branches.
Thanks @joshwlambert! This is a great question and I am happy with either way to be honest. Please let me know your thoughts on this Josh and @sangeetabhatia03 then I am happy to revisit
I don't mind which implementation is chosen and will let @sangeetabhatia03 decide as she opened the issue.
Given these functions are good to go it might be worth opening a PR to merge these new function into main
to make them more available to package users. Then we can consider enhancing them in future with the ability to read directly from the data tables rather than requiring the user to load them manually.
Totally agree - let me raise a PR and then can have this as a new issue. Also because I can add the doi/article_info update that is now in main to the key cols which is also on my list of things to do
we want user-friendly functions to get to any parameter of interest. e.g. serial_intervals('ebola') etc. We could either have
Or,
The second option will use
load_epidata
. Both of these are quite easy to implement technically, needs more thought about how to make it user-friendly. Once we have these in place, forest_plot functions should be modified to use these instead of doing their own filtering.