JiscaH / sequoia

R package for pedigree inference based on SNP data

DuplicateCheck #22

Closed kiran-lee closed 3 weeks ago

kiran-lee commented 3 weeks ago

Hi,

I am trying to use the DuplicateCheck function to identify duplicates among 1880 bird samples genotyped at 617 SNPs, because sequoia() reports that 30 likely duplicates were found during its duplicate check. ?DuplicateCheck returns the documentation for the function, but when I call the function I receive the following: Error in DuplicateCheck(GenoM = sw_GenoM_family, FortPARAM.dup) : could not find function "DuplicateCheck"

DuplicateCheck did not work for either of the following combinations:

- R version 4.2.1 and Sequoia version 2.5.6
- R version 4.0.0 and Sequoia version 2.11.2

Kiran

JiscaH commented 3 weeks ago

Hello Kiran,

Thank you for your message! DuplicateCheck() is an internal function, which sequoia() calls after ensuring that the input is in the correct format, etc. Calling it directly would therefore not return anything different from running sequoia().

When sequoia() gives this message, the output list will contain an element called 'DupGenotype', which contains the 30 possible duplicates. Depending on your parameter settings, these may include false positives. With some luck, you will see a clear split between pairs that mismatch at none or a few SNPs (duplicates) and pairs that mismatch at many more SNPs (close relatives). If not, information from the lab and field is needed to decide which pairs are most likely duplicates and which are not.
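As a rough illustration of looking for such a split, here is a minimal sketch in R. It does not call sequoia at all: `dup` is a made-up stand-in for the 'DupGenotype' element, and the column names (`ID1`, `ID2`, `Mismatch`), the values, and the largest-gap rule are all assumptions for illustration, not something the package does for you. With real output you would start from something like `SeqOUT$DupGenotype`, where `SeqOUT` is a hypothetical name for the list returned by sequoia(), and check its actual column names first.

```r
# Hypothetical stand-in for the 'DupGenotype' element of the sequoia() output.
# Column names and values are assumptions, made up for illustration only.
dup <- data.frame(
  ID1 = c("A01", "A02", "A03", "A04", "A05"),
  ID2 = c("B01", "B02", "B03", "B04", "B05"),
  Mismatch = c(0, 1, 14, 2, 17),
  stringsAsFactors = FALSE
)

# Sort the candidate pairs by the number of mismatching SNPs.
dup <- dup[order(dup$Mismatch), ]

# Locate the largest jump between consecutive sorted mismatch counts
# (a simple heuristic chosen for this sketch, not part of sequoia).
gaps <- diff(dup$Mismatch)
split_at <- which.max(gaps)

# Put a provisional threshold halfway across that jump.
threshold <- mean(dup$Mismatch[split_at + 0:1])

# Flag the pairs below the threshold as likely duplicates.
dup$LikelyDuplicate <- dup$Mismatch < threshold
print(dup)
```

If the sorted mismatch counts show no obvious jump, a threshold chosen this way is not meaningful, and lab and field information is the better guide.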

Unfortunately, CalcPairLL() is not very helpful in this case, as it does not (yet) include the possibility that the samples are duplicates, only that they are different kinds of relatives.

To avoid any further confusion of this kind: where did you come across this function name? Was it in the help file of CheckLifeHist(), or elsewhere?

Best of luck, and please let me know if you have any more questions!

kiran-lee commented 3 weeks ago

Hi Jisca,

Thank you for your detailed response! I found the 'DupGenotype' element.

I encountered the function here: https://rdrr.io/cran/sequoia/man/DuplicateCheck.html

Kiran