Closed daikitag closed 11 months ago
All modified and coverable lines are covered by tests :white_check_mark:
Comparison is base (
0e49d80
) 100.00% compared to head (61ec820
) 100.00%.:exclamation: Current head 61ec820 differs from pull request most recent head 7d1929a. Consider uploading reports for the commit 7d1929a to get more accurate results
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
I was very much overestimating the computational time that it takes to examine the allele frequency. This shows the result for a tree sequence with 1 million samples, 1Mb, 1000 causal sites.
Allele frequency:
Non allele frequency (alpha=0):
It seems that obtaining the genotype was the bottle neck of the sim_phenotype
algorithm, and not obtaining the allele frequency
Since the computational time is very fast, I'm thinking about adding allele frequency column to sim_trait
function, as it is widly used in simulation-based studies, and obtaining it from tskit
is not very clear. Do you have any suggestions on this @jeromekelleher ?
Added allele frequency output here (https://github.com/daikitag/tstrait/pull/1). I find it to be useful to have the allele frequency column, as it will make the explanation of the frequency dependence model in the documentation easy to understand, and allele frequency is typically part of the standard packages to simulate quantitative traits.
Draft of the documentation of sim_trait
is added here: https://github.com/daikitag/tstrait/pull/2
Since the computational time is very fast, I'm thinking about adding allele frequency column to sim_trait function, as it is widly used in simulation-based studies, and obtaining it from tskit is not very clear. Do you have any suggestions on this @jeromekelleher ?
What's it used for other than frequency dependence via alpha @daikitag? I feel like we can add this to the data frame later if there's a feature request, but it's not something we have anyone actually asking for?
Since the computational time is very fast, I'm thinking about adding allele frequency column to sim_trait function, as it is widly used in simulation-based studies, and obtaining it from tskit is not very clear. Do you have any suggestions on this @jeromekelleher ?
What's it used for other than frequency dependence via alpha @daikitag? I feel like we can add this to the data frame later if there's a feature request, but it's not something we have anyone actually asking for?
I've seen some papers that simulate genetic values by using all causal sites and then see the influence of causal mutations depending on allele frequencies, but I agree with you that we can add the feature later. I would like to add the plot of (allele frequency vs effect size) to show the frequency dependence model in the documentation, but do you have any suggestions on how I can compute the allele frequency of causal mutations there?
OK, let's keep the allele_frequency
if you already have a use for it.
Thanks @jeromekelleher . Would it be possible for you to merge this pull request? I would like to make a new PR for allele frequency and I would also like to use this function in unit tests for genetic_value
.
https://github.com/tskit-dev/tstrait/issues/113 updated here
@jeromekelleher
I created a new pull request for the documentation of the new sim_trait
function here https://github.com/tskit-dev/tstrait/pull/115. I think I'm finished with making all the modifications to the sim_trait
function, and would it be possible for you to reivew it whenever you have some time?
Update the sim_trait function.