Open bquilty25 opened 3 months ago
Thanks for raising this @bquilty25. This is indeed a nice workflow to have. I will address this issue in the coming week and point you to how to access the new feature.
Are you sue you need to call as.data.frame()
? The epichains
objects are data frames, too, so I would have thought it's possible to use tidyverse functions directly (I think).
Here's my initial assessment.
You don't need to change it to a <data.frame>
for use with tidyverse but even if you do, the attributes are retained. In that case, we just need a simple as.epichains()
method to to convert back to <epichains>
(this is in the scope of this issue).
You lose the attributes only when you invalidate the class by removing a protected column, hence, you can't print()
currently as it does not transfer to the next method. Here, we need to transfer to the next printing method (See related issues: #278 and #274).
library(epichains)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
set.seed(32)
chains_pois_offspring <- simulate_chains(
n_chains = 10,
statistic = "size",
offspring_dist = rpois,
stat_threshold = 10,
generation_time = function(n) rep(3, n),
lambda = 2
)
# Print fresh object
chains_pois_offspring
#> `<epichains>` object
#>
#> < epichains head (from first known infector) >
#>
#> chain infector infectee generation time
#> 11 1 1 2 2 3
#> 12 1 1 3 2 3
#> 13 2 1 2 2 3
#> 14 2 1 3 2 3
#> 15 3 1 2 2 3
#> 16 3 1 3 2 3
#>
#>
#> Number of chains: 10
#> Number of infectors (known): 9
#> Number of generations: 4
#> Use `as.data.frame(<object_name>)` to view the full output in the console.
# attributes of the fresh object
attributes(chains_pois_offspring)
#> $names
#> [1] "chain" "infector" "infectee" "generation" "time"
#>
#> $class
#> [1] "epichains" "data.frame"
#>
#> $row.names
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
#> [19] 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
#> [37] 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
#> [55] 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
#> [73] 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
#> [91] 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108
#> [109] 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126
#> [127] 127 128 129
#>
#> $n_chains
#> [1] 10
#>
#> $statistic
#> [1] "size"
#>
#> $offspring_dist
#> function (n, lambda)
#> .Call(C_rpois, n, lambda)
#> <bytecode: 0x1113f18e8>
#> <environment: namespace:stats>
#>
#> $stat_threshold
#> [1] 10
#>
#> $track_pop
#> [1] FALSE
# change it to a data.frame and inspect the attributes
chains_df <- as.data.frame(chains_pois_offspring)
attributes(chains_df)
#> $names
#> [1] "chain" "infector" "infectee" "generation" "time"
#>
#> $class
#> [1] "data.frame"
#>
#> $row.names
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
#> [19] 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
#> [37] 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
#> [55] 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
#> [73] 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
#> [91] 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108
#> [109] 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126
#> [127] 127 128 129
#>
#> $n_chains
#> [1] 10
#>
#> $statistic
#> [1] "size"
#>
#> $offspring_dist
#> function (n, lambda)
#> .Call(C_rpois, n, lambda)
#> <bytecode: 0x1113f18e8>
#> <environment: namespace:stats>
#>
#> $stat_threshold
#> [1] 10
#>
#> $track_pop
#> [1] FALSE
# which attributes are lost?
setdiff(names(attributes(chains_pois_offspring)), names(attributes(chains_df)))
#> character(0)
## Use original epichains object in a tidyverse pipeline and remove a protected column
obj_modified <- chains_pois_offspring |>
mutate(isolated = ifelse(generation > 2, TRUE, FALSE)) |>
select(-c(generation, infector))
# what is the class?
class(obj_modified)
#> [1] "epichains" "data.frame"
# Which attributes are lost?
setdiff(names(attributes(chains_pois_offspring)), names(attributes(obj_modified)))
#> [1] "n_chains" "statistic" "offspring_dist" "stat_threshold"
#> [5] "track_pop"
# Can we print the modified object?
try(obj_modified)
#> Error in .validate_epichains(x): object does not contain the correct columns
Created on 2024-09-03 with reprex v2.1.1
I'm doing an analysis where I use
simulate_chains()
to generate an epidemic, which is then "pruned" afterwards to simulate the effect of testing and isolation.This involves using
as.data.frame()
on the resultingepichains
object, applying a set of functions, and then summarising the results using the tidyverse. However it would be nice if there were the ability to convert thedata.frame
back into anepichains
object, so that it can be summarised using functions within the package as well as be interoperable withepicontacts
for plotting, if possible.