dwinter / mmod

Differentiation statistics in R

dist.codom error #8

Open LinaValencia85 opened 6 years ago

LinaValencia85 commented 6 years ago

I am using the dist.codom function, but I am getting the following error:

Error in dimnames(x) <- dn : length of 'dimnames' [1] not equal to array extent

I was able to successfully load my SNP dataset with read.structure() and have successfully run other functions in the package, but not dist.codom. Any ideas as to what is happening?

dwinter commented 6 years ago

Hi @LinaValencia85 ,

You are getting this error because every individual in your dataset has at least some missing data.

It's not clear to me how missing data should be handled when calculating a distance matrix. MMOD was written way back when 30 microsats was a big dataset and missing data was uncommon, so I just removed all individuals that had any missing data.
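A minimal base-R sketch of why that filter can leave nothing behind. This is not mmod's actual code; the toy matrix `x` is a hypothetical stand-in for the allele-count table that `tab(d)` would return, with `NA` marking missing genotypes.

```r
# Toy allele-count table (hypothetical data): rows are individuals,
# columns are alleles, NA marks a missing genotype.
x <- matrix(
  c( 2,  0, NA, NA,
    NA, NA,  1,  1,
     1,  1, NA, NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ind1", "ind2", "ind3"),
                  c("L1.A", "L1.B", "L2.A", "L2.B"))
)

# Number of missing cells per individual
n_missing <- rowSums(is.na(x))

# Keep only individuals with no missing data at all
complete <- x[n_missing == 0, , drop = FALSE]

# If this is 0, no individuals are left to compute distances between
nrow(complete)
```

Running the same `rowSums(is.na(...))` check on a real dataset shows how many individuals would survive the filter before calling dist.codom.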

You can get a matrix similar to the one returned by dist.codom easily enough, but doing so illustrates the problem.

M <- dist(tab(d)/ploidy(d), "manhattan")
plot(hclust(M))

When you look at the results you will see they are clustered by missingness much more than by shared alleles. If this is a RADseq or similar dataset, so the missingness is an unavoidable part of the data, I would look into genotype likelihood approaches or other methods designed to deal with these sorts of data in particular.
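One way missing data can distort such a matrix is the way base R's `dist()` treats `NA`: it drops the affected columns for each pair and scales the sum up by the fraction of columns used. The sketch below uses a hypothetical toy matrix in place of `tab(d)/ploidy(d)`; it only illustrates the `NA` handling and is not a reproduction of the clustering pattern described above.

```r
# Toy table (hypothetical data) standing in for tab(d)/ploidy(d)
x <- matrix(
  c( 2,  0, NA, NA,
    NA, NA,  1,  1,
     1,  1, NA, NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ind1", "ind2", "ind3"),
                  c("L1.A", "L1.B", "L2.A", "L2.B"))
)

M <- as.matrix(dist(x, method = "manhattan"))

# ind1 and ind3 share two typed columns: |2 - 1| + |0 - 1| = 2,
# which dist() then scales up by 4 / 2 because only 2 of 4 columns were used
M["ind1", "ind3"]

# ind1 and ind2 share no typed columns, so their distance is NA
M["ind1", "ind2"]
```

Each pairwise value therefore depends on which columns happen to be typed in both individuals, not only on the alleles they carry.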

Sorry I can't be much more help.

LinaValencia85 commented 6 years ago

Thanks a lot for your help!

I found the R package PopGenReport, and it seems I am able to calculate it with gd.kosman().

Best,

Lina

Lina M Valencia
PhD candidate, Anthropology, University of Texas at Austin
Primate Molecular and Evolution Lab (http://www.utexas.edu/cola/depts/anthropology/faculty/ad26693#primate-molecular-ecology-and-evolution-lab)
http://conservaciontitigris.org/
http://www.utexas.edu/cola/anthropology/graduate/profile.php?id=lmv498


zkamvar commented 6 years ago

Hi @LinaValencia85,

You might want to check that you don't have incomparable samples in your data set (e.g. samples that have missing data at opposing loci). My package (poppr) has such a function: http://grunwaldlab.github.io/poppr/reference/incomp.html
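To make "incomparable" concrete, here is a base-R sketch that counts how many typed columns each pair of samples shares. It is only an illustration of the idea on a hypothetical toy matrix, not poppr's implementation, and it counts allele columns rather than loci.

```r
# Toy allele-count table (hypothetical data); NA marks a missing genotype
x <- matrix(
  c( 2,  0, NA, NA,
    NA, NA,  1,  1,
     1,  1, NA, NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ind1", "ind2", "ind3"),
                  c("L1.A", "L1.B", "L2.A", "L2.B"))
)

# 1 where a cell is typed, 0 where it is missing
typed <- (!is.na(x)) + 0

# shared[i, j] = number of columns typed in both sample i and sample j
shared <- typed %*% t(typed)

# Pairs with zero shared columns cannot be compared at all
incomparable <- which(shared == 0, arr.ind = TRUE)
incomparable
```

Here ind2 shares no typed columns with ind1 or ind3, so any distance between those pairs is undefined.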

Additionally, poppr also has this distance measure 😁: (http://grunwaldlab.github.io/poppr/reference/diss.dist.html)

(Sorry @dwinter for spamming your issues)

LinaValencia85 commented 6 years ago

Thanks for that! I have also used poppr individual distances to compare.
