Closed bodkan closed 1 year ago
Merging #123 (aa097e7) into main (e73fd18) will increase coverage by 0.31%. The diff coverage is 98.43%.
@@ Coverage Diff @@
## main #123 +/- ##
==========================================
+ Coverage 83.37% 83.69% +0.31%
==========================================
Files 6 6
Lines 2996 3060 +64
==========================================
+ Hits 2498 2561 +63
- Misses 498 499 +1
Impacted Files | Coverage Δ | |
---|---|---|
R/tree-sequences.R | 87.83% <98.43%> (+0.64%) | :arrow_up: |
IBD tracts collected from spatial tree sequences are now annotated with spatial coordinates of nodes and returned as spatial sf objects by default.
As an aside, note that although `ts_ibd()` returns IBD data in a tabular format, as mentioned in the first post, and doesn't support iteration (and never will), users who need to iterate over massive amounts of IBD can always use reticulate-d iteration in R, just as shown in the tskit docs for Python. (Honestly though, at that point it's probably better to use Python.)
This adds a new function: slendr's interface to `TreeSequence.ibd_segments()`.

As explained in the `?ts_ibd` manpage, this is not a real wrapper. R handles heavy iteration extremely poorly, so the documented use cases wouldn't really work here, certainly not for large tree sequences.

Instead, `ts_ibd()` collects all requested IBD data (either all individual IBD segments when `coordinates = TRUE`, or counts and total pairwise IBD amount when `coordinates = FALSE`, which is the default) and returns the results as a plain data frame (EDIT: for spatial tree sequences the returned IBD table is now fully spatially annotated and is of the sf data type).

To keep things manageable, pruning the returned IBD segments is still supported, either by setting the minimum length of an IBD segment to be considered or by setting the maximum age of an ancestor of an IBD pair. In fact, given how easy it is to choke on too much IBD, `ts_ibd()` emits a warning if the user requests all possible IBD segments (most likely an oversight during normal data analysis).

Similarly, the `within =` and `between =` arguments are also supported. In line with the rest of the slendr `ts_*()` library, these arguments accept symbolic names of individuals, not just integer IDs of nodes.
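
To make the two output shapes concrete, here is a minimal pure-Python sketch of the idea. It is not slendr or tskit code: the segment tuples, the function `collect_ibd()`, and the output field names are all hypothetical, chosen only to illustrate the difference between a per-segment table, a per-pair summary (count and total length), a minimum-length filter, and a warning when no filter is given.

```python
import warnings
from collections import defaultdict

# Hypothetical IBD segments: (individual_a, individual_b, left, right).
SEGMENTS = [
    ("ind_1", "ind_2", 0, 500),
    ("ind_1", "ind_2", 900, 1000),
    ("ind_1", "ind_3", 200, 260),
    ("ind_2", "ind_3", 0, 1000),
]


def collect_ibd(segments, coordinates=False, minimum_length=None):
    """Return per-segment rows or per-pair summaries as a list of dicts."""
    if minimum_length is None:
        # Requesting everything is usually an oversight, so warn about it.
        warnings.warn("No length filter given: collecting all IBD segments.")
        minimum_length = 0

    # Keep only segments at least as long as the requested minimum.
    kept = [s for s in segments if s[3] - s[2] >= minimum_length]

    if coordinates:
        # One row per individual IBD segment.
        return [
            {"pair": (a, b), "left": left, "right": right, "length": right - left}
            for a, b, left, right in kept
        ]

    # One row per pair: number of segments and their total length.
    totals = defaultdict(lambda: [0, 0])
    for a, b, left, right in kept:
        totals[(a, b)][0] += 1
        totals[(a, b)][1] += right - left
    return [
        {"pair": pair, "count": count, "total": total}
        for pair, (count, total) in sorted(totals.items())
    ]


summary = collect_ibd(SEGMENTS, minimum_length=100)
rows = collect_ibd(SEGMENTS, coordinates=True, minimum_length=100)
```

With the filter set to 100, the short `ind_1`/`ind_3` segment is dropped from both outputs; the summary form stays small even when the per-segment table would be very large, which is presumably why it is the default.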