UW-GAC / GENESIS

GENetic EStimation and Inference in Structured samples (GENESIS): Statistical methods for analyzing genetic data from samples with population structure and/or relatedness
https://bioconductor.org/packages/GENESIS

assocTestSingle exhausts iterators #69

Closed: kaskarn closed this issue 3 years ago

kaskarn commented 3 years ago

Block iterator methods in SeqVarTools and elsewhere generally expect a user-written loop that steps through the data one subset at a time. assocTestSingle departs from this expectation: its inner loop does not exit until the SeqVarIterator method returns FALSE, so a single call fully consumes the iterator.
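For context, the user-written loop pattern can be sketched roughly as follows. This is only an illustration, assuming the example GDS file shipped with SeqArray and an arbitrary block size of 500; the counter n_blocks is a hypothetical name added for the example.

```r
library(SeqArray)
library(SeqVarTools)

# Open the example GDS file bundled with SeqArray (an assumption for this sketch)
gds <- seqOpen(seqExampleFileName("gds"))
seqData <- SeqVarData(gds)

# Block size of 500 is arbitrary
iterator <- SeqVarBlockIterator(seqData, variantBlock = 500, verbose = FALSE)

n_blocks <- 0
repeat {
  # Work on the current block only; here we just read its variant IDs
  ids <- seqGetData(iterator, "variant.id")
  n_blocks <- n_blocks + 1

  # iterateFilter() advances to the next block and returns FALSE when none remain
  if (!iterateFilter(iterator, verbose = FALSE)) break
}

seqClose(seqData)
```

Here the caller decides what happens between blocks, which is the control that a function consuming the whole iterator internally does not offer.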

While this behavior is efficient and desirable in many settings, it could be worth adding an option to escape the while loop, so that users can interact with individual iterations and prevent side effects. This could be as simple as an optional argument outfun=force that is called on res[[i]] after each iteration (for example outfun=return, some fwrite closure, and so on).
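To make the suggestion concrete, here is a minimal base R sketch of the callback idea. It is not GENESIS code: make_block_iterator, run_blocks, and test_fun are hypothetical names invented for illustration, and the toy iterator stands in for a SeqVarIterator.

```r
# Hypothetical toy iterator: a closure that yields successive blocks of indices
# and returns NULL once exhausted. Not part of GENESIS or SeqVarTools.
make_block_iterator <- function(n, block_size) {
  start <- 1
  function() {
    if (start > n) return(NULL)
    idx <- seq(start, min(start + block_size - 1, n))
    start <<- idx[length(idx)] + 1
    idx
  }
}

# Hypothetical driver loop with an `outfun` hook applied to each block's result.
# The default, `force`, returns its argument unchanged.
run_blocks <- function(next_block, test_fun, outfun = force) {
  res <- list()
  i <- 1
  while (!is.null(idx <- next_block())) {
    res[[i]] <- outfun(test_fun(idx))
    i <- i + 1
  }
  res
}

# Default behavior: collect every per-block result
it <- make_block_iterator(n = 10, block_size = 4)
block_sums <- run_blocks(it, test_fun = sum)

# Custom hook: report each block's result as it is produced
it2 <- make_block_iterator(n = 10, block_size = 4)
logged <- run_blocks(it2, test_fun = sum,
                     outfun = function(x) { message("block result: ", x); x })
```

With the default hook the loop behaves as it does today, so existing callers would be unaffected; a hook that writes each result to disk could replace the in-memory list entirely.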

Feel free to close if this contradicts the package's intended design!

smgogarten commented 3 years ago

The implementation of iterators changed in GENESIS v2.23.5, with iteration now being handled by the BiocParallel package rather than a while loop.

If you want to run assocTestSingle on a specific subset of variants, I recommend setting a filter (with seqSetFilter) on the SeqVarData object before creating the iterator. You can also find the exact indices of the variants in each iteration of a SeqVarIterator object with variantFilter, which returns a list of vectors. For example, variantFilter(x)[[3]] will give you the indices of all variants in the 3rd iteration.
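A rough sketch of that workflow, assuming the example GDS file from SeqArray; the choice of the first 500 variants and a block size of 100 is arbitrary, and the null model needed for the association test itself is omitted:

```r
library(SeqArray)
library(SeqVarTools)

# Open the example GDS file bundled with SeqArray (an assumption for this sketch)
gds <- seqOpen(seqExampleFileName("gds"))
seqData <- SeqVarData(gds)

# Restrict to a subset of variants BEFORE creating the iterator
# (the first 500 variants, chosen arbitrarily)
seqSetFilter(seqData, variant.sel = 1:500, verbose = FALSE)

# The iterator is then built over the filtered variants only
iterator <- SeqVarBlockIterator(seqData, variantBlock = 100, verbose = FALSE)

# List of index vectors, one per iteration
blocks <- variantFilter(iterator)
length(blocks)   # number of iterations
blocks[[3]]      # variant indices visited in the 3rd iteration

# `iterator` could now be passed to assocTestSingle together with a fitted null model

seqClose(seqData)
```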