huangl-CAU closed this issue 2 years ago
Hi Huang,
Thank you for your interest in dNdScv and please accept my sincere apologies for the very late response.
dNdScv was not designed for the analysis of germline mutations from a highly recombining population, so you will need to be careful and critical with the results and their interpretation. In theory, some of the assumptions in dN/dS can be violated when working with polymorphism data (see this paper). However, I believe that the loss of monotonicity between dN/dS ratios and selection coefficients in that paper is expected only under extreme (biologically implausible) levels of recombination, where adjacent synonymous and non-synonymous sites segregate independently (free recombination). Although I think that monotonicity will not be a problem in your data (i.e. dN/dS<<1 should be the result of negative selection and dN/dS>>1 the result of positive selection), you need to be careful not to assume that those dN/dS ratios directly enable you to estimate selection coefficients (they don't).
For the analyses that you suggest, I think that dNdScv could work reasonably well. You need to be careful to input unique mutations into dNdScv and avoid counting the same SNP multiple times as independent mutations. To address your other questions:
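As a minimal sketch of the "unique mutations" point, the base R snippet below collapses a mutation table so that each distinct SNP appears only once. The toy table and its values are made up for illustration; the five column names follow the sampleID/chr/pos/ref/mut layout that dNdScv takes as input. Keeping the first sample that carries each variant is an arbitrary assumption made here, not something specified in this thread.

```r
# Toy mutation table in the five-column layout used as dNdScv input
# (sampleID, chr, pos, ref, mut). All values are hypothetical.
mutations <- data.frame(
  sampleID = c("line1", "line2", "line3", "line1", "line2"),
  chr      = c("1", "1", "1", "2", "2"),
  pos      = c(100L, 100L, 100L, 250L, 300L),
  ref      = c("A", "A", "A", "G", "C"),
  mut      = c("T", "T", "T", "A", "T"),
  stringsAsFactors = FALSE
)

# A variant is identified by chromosome, position, reference and alternate
# allele. Keep only the first row for each variant, so a SNP shared by many
# lines is counted once (assumption: the first carrier's sampleID is kept).
variant_cols <- c("chr", "pos", "ref", "mut")
unique_mutations <- mutations[!duplicated(mutations[, variant_cols]), ]
rownames(unique_mutations) <- NULL

print(unique_mutations)
```

The deduplicated table could then be passed as the `mutations` argument in place of the per-sample table.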
I hope this helps.
Best, Inigo
Dear authors, when applying dNdScv to a non-cancer population with high recombination, I have run into several questions, and I wonder if you could give me some suggestions.
Purpose & question
I am re-sequencing a plant inbred-line population with thousands of samples at a sequencing depth of about 35x, and I am trying to
My questions are these
Pretreatment
I tried this analysis using a pipeline like this
Dndscv command & result
Then I ran dndscv with the following command:
dndsout <- dndscv(mutations, refdb = mydb, max_muts_per_gene_per_sample = Inf, max_coding_muts_per_sample = Inf, outmats = TRUE, cv = NULL)
Most genes (33000/37000) appear to be under purifying selection, filtered by "qallsubs_cv<0.05 & wmis_cv<1".
The proportion of genes under positive selection is 2500/37000, filtered by "qallsubs_cv<0.05 & wmis_cv>1".
The proportion of genes under neutral selection is 2500/37000, filtered by "qallsubs_cv>0.05".
The nbreg$theta value is 0.688948905558218.
The global dn/ds is
The genemuts table looks like this
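For reference, the three filters quoted above could be applied to the per-gene results table roughly as sketched below. The data frame is a small made-up stand-in for `dndsout$sel_cv`, and the `class` column and its labels are hypothetical names introduced only for this illustration. Note that failing the q-value threshold is labelled "not_significant" rather than "neutral", since a non-significant test is not by itself evidence of neutrality.

```r
# Hypothetical stand-in for dndsout$sel_cv with only the columns used by the
# filters; gene names and values are made up.
sel_cv <- data.frame(
  gene_name   = c("geneA", "geneB", "geneC", "geneD"),
  wmis_cv     = c(0.20, 3.10, 0.95, 1.40),
  qallsubs_cv = c(0.001, 0.010, 0.400, 0.700),
  stringsAsFactors = FALSE
)

# Same thresholds as the filters quoted in the post.
significant <- sel_cv$qallsubs_cv < 0.05

# Start from "not_significant", then relabel the significant genes by the
# direction of the missense dN/dS estimate. A significant gene with
# wmis_cv exactly equal to 1 would stay NA, since neither filter covers it.
sel_cv$class <- "not_significant"
sel_cv$class[significant] <- NA_character_
sel_cv$class[significant & sel_cv$wmis_cv < 1] <- "purifying"
sel_cv$class[significant & sel_cv$wmis_cv > 1] <- "positive"

# Number of genes in each group.
print(table(sel_cv$class, useNA = "ifany"))
```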
Other small questions
1: The correlation between the observed number of synonymous SNPs and exp-syn is modest, even for the sample used in the tutorial.
2: Can wmis_cv be used as a measure of the degree of selective pressure?
3: Should I calculate one-sided q-values for negative selection and positive selection independently?
Thanks a lot! HuangL