EuracBiomedicalResearch / FamAgg

This is the development version of the FamAgg Bioconductor package.
https://EuracBiomedicalResearch.github.io/FamAgg
MIT License

Get pairs of individuals with a kinship larger or smaller than a certain cut-off #24

Closed jorainer closed 3 years ago

jorainer commented 3 years ago

For a familial resemblance analysis as defined in chapter 6 of this book we would need pairs of individuals whose kinship is above, and pairs whose kinship is below, a certain threshold.

I would thus suggest a function kinshipPairs defined as follows:

    kinshipPairs <- function(x, condition = function(x) x > 0.25)

This would allow us to calculate correlation coefficients between the members of each pair and then to evaluate whether these correlations are higher among relatives than among unrelated individuals.
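A minimal sketch of that comparison, assuming the pairs come back as a data frame with one row per pair. The helper `pair_correlation`, the column names `id1` and `id2`, and the toy trait values are all hypothetical and only illustrate the idea; none of them is part of FamAgg.

```r
# Hypothetical helper (not part of FamAgg): correlate a trait between the
# two members of each pair. `trait` is a numeric vector named by individual
# ID; `pairs` has the (assumed) columns `id1` and `id2`.
pair_correlation <- function(pairs, trait) {
    x <- trait[as.character(pairs$id1)]
    y <- trait[as.character(pairs$id2)]
    cor(x, y, use = "complete.obs")
}

# Toy trait values, invented for illustration only.
trait <- c(A = 1.2, B = 0.8, C = 1.1, D = 2.0, E = 1.9)

# Toy pair lists: one for related pairs, one for unrelated pairs.
related <- data.frame(id1 = c("A", "B", "D"), id2 = c("C", "C", "E"),
                      stringsAsFactors = FALSE)
unrelated <- data.frame(id1 = c("A", "A", "B"), id2 = c("D", "E", "E"),
                        stringsAsFactors = FALSE)

r_related <- pair_correlation(related, trait)
r_unrelated <- pair_correlation(unrelated, trait)
```

Comparing `r_related` with `r_unrelated` is then the familial resemblance check described above.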

jorainer commented 3 years ago

Happy for feedback on that, @the-x-at.

the-x-at commented 3 years ago

Sounds pretty straightforward. One has to check the upper triangle of the kinship matrix against condition, which avoids reporting duplicates. Maybe add an optional logical argument diag = FALSE to exclude/include the diagonal, which trivially is the kinship of each individual with itself. I think R is also pretty good at float comparisons, such that even x >= 0.25 should work, no?

Would it be desirable to allow subsetting the kinship matrix by IDs through an optional argument id = NULL, since many functions in the interface support this? The same for one or even more families, i.e. family = NULL?
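The upper-triangle check can be sketched in base R as follows. The function name `pairs_from_kinship`, the output columns, and the toy kinship matrix are assumptions made for illustration, not the implementation that ended up in FamAgg.

```r
# Hypothetical helper (not part of FamAgg): list pairs whose kinship
# satisfies `condition`, looking only at the upper triangle so that each
# pair is reported once.
pairs_from_kinship <- function(kin, condition = function(x) x > 0.25,
                               diag = FALSE) {
    stopifnot(is.matrix(kin), nrow(kin) == ncol(kin))
    # Logical matrix: TRUE where the cell is in the upper triangle
    # (optionally including the diagonal) and passes the condition.
    keep <- upper.tri(kin, diag = diag) & condition(kin)
    idx <- which(keep, arr.ind = TRUE)
    data.frame(id1 = rownames(kin)[idx[, 1]],
               id2 = colnames(kin)[idx[, 2]],
               kinship = kin[idx],
               stringsAsFactors = FALSE)
}

# Toy kinship matrix: two unrelated parents (A, B) and their child (C).
ids <- c("A", "B", "C")
kin <- matrix(c(0.50, 0.00, 0.25,
                0.00, 0.50, 0.25,
                0.25, 0.25, 0.50),
              nrow = 3, byrow = TRUE, dimnames = list(ids, ids))

res <- pairs_from_kinship(kin, condition = function(x) x >= 0.25)
res_diag <- pairs_from_kinship(kin, condition = function(x) x >= 0.25,
                               diag = TRUE)
```

With the toy matrix, `res` holds the two parent-child pairs, and `res_diag` additionally holds each individual paired with itself.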

So the function would be a member of FAData and look like this:

    kinshipPairs <- function(condition, id = NULL, family = NULL, diag = FALSE)

I would define condition as a mandatory argument, underlining its importance. I am not sure what the best way to define the signature of this function is, as it operates on kinship values, i.e. a numeric data type, and reports a data frame.

jorainer commented 3 years ago

upper.tri is actually a very good idea! That way we also avoid the diagonal and hence pairs of an individual with itself.

jorainer commented 3 years ago

For the id, do you think that is really necessary?

the-x-at commented 3 years ago

If we provide family, we also should provide id. Have a look at the other kinship-based functions. They have exactly this type of interface.

jorainer commented 3 years ago

So, the family and id parameters would allow restricting the calculation of the pairs to certain individuals, right?

the-x-at commented 3 years ago

Exactly. I see them as mutually exclusive. No reason to nail down a family and then restrict it by some IDs.
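One way the mutually exclusive id and family arguments could be handled is sketched below. The helper `select_ids` and its arguments `all_ids` and `families` are hypothetical; how FamAgg resolves these arguments internally is not shown in this thread.

```r
# Hypothetical helper (not part of FamAgg): resolve which individuals to
# consider, given either `id` or `family` but not both.
# `all_ids` holds the individual IDs, `families` the family ID of each.
select_ids <- function(all_ids, families, id = NULL, family = NULL) {
    if (!is.null(id) && !is.null(family))
        stop("'id' and 'family' are mutually exclusive; provide only one")
    if (!is.null(family))
        return(all_ids[families %in% family])
    if (!is.null(id))
        return(intersect(all_ids, id))
    all_ids  # no restriction requested
}

# Toy pedigree annotation, invented for illustration.
all_ids <- c("A", "B", "C", "D", "E")
families <- c("f1", "f1", "f1", "f2", "f2")

by_family <- select_ids(all_ids, families, family = "f2")
by_id <- select_ids(all_ids, families, id = c("A", "D"))
both <- tryCatch(select_ids(all_ids, families, id = "A", family = "f1"),
                 error = function(e) "error")
```

Failing early when both arguments are given keeps the behaviour unambiguous, which matches the view that there is no reason to pick a family and then restrict it by IDs.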