magnusdv / pedtools

Tools for working with pedigrees in R
GNU General Public License v3.0

`transferMarkers()` does not work as expected if `from` is a list #51

Closed bragef closed 6 months ago

bragef commented 6 months ago

If the `from` argument given to `transferMarkers()` is a list, the genotypes copied to the output do not match the corresponding genotypes in the input.

In the example below, "B" and "C" are both given the genotype from individual "A":

library(pedtools)

x1 <- singleton("A") |> addMarker("1/2", name = "M1")
x2 <- singleton("B") |> addMarker("3/4", name = "M1")
x3 <- singleton("C") |> addMarker("5/6", name = "M1")

l1 <- list(x1, x2, x3)
print(l1)
#> [[1]]
#>  id fid mid sex  M1
#>   A   *   *   1 1/2
#> 
#> [[2]]
#>  id fid mid sex  M1
#>   B   *   *   1 3/4
#> 
#> [[3]]
#>  id fid mid sex  M1
#>   C   *   *   1 5/6

x4 <- nuclearPed(father = "B", mother = "C", children = "D") 

print(transferMarkers(from=l1, to=x4))
#>  id fid mid sex  M1
#>   B   *   *   1 1/2
#>   C   *   *   2 1/2
#>   D   B   C   1 -/-

Created on 2024-02-29 with reprex v2.1.0

bragef commented 6 months ago

Looking at the toy example, I realise now that the problem is that I did not specify the alleles in the addMarker() calls.
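The effect of leaving out `alleles` can be inspected per component. The sketch below assumes that `getFreqDatabase()` returns a list of named frequency vectors, one per marker; when `alleles` is not given, each marker only records the alleles observed in its own component, so the allele sets differ between singletons.

```r
library(pedtools)

# Same marker name, but no shared allele set is declared
x1 <- singleton("A") |> addMarker(A = "1/2", name = "M1")
x2 <- singleton("B") |> addMarker(B = "3/4", name = "M1")

# Allele labels known to each component for marker M1
a1 <- names(getFreqDatabase(x1)[["M1"]])
a2 <- names(getFreqDatabase(x2)[["M1"]])

print(a1)
print(a2)

# The components disagree about which alleles M1 has
identical(a1, a2)
```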

It was rather less obvious in my real use case. I had a pedigree list created by a single call to as.ped() on a data frame, which I later used as the source when copying genotypes into the pedigree objects. It now works as expected after I applied setFreqDatabase() to the list before copying.
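A minimal sketch of that workaround applied to the toy example. The database `db` is hypothetical (uniform frequencies over alleles 1 to 6, for illustration only), and the sketch assumes that `setFreqDatabase()` accepts a list of pedigrees together with a database in list format.

```r
library(pedtools)

x1 <- singleton("A") |> addMarker(A = "1/2", name = "M1")
x2 <- singleton("B") |> addMarker(B = "3/4", name = "M1")
x3 <- singleton("C") |> addMarker(C = "5/6", name = "M1")
l1 <- list(x1, x2, x3)

# Hypothetical frequency database: one named vector per marker
db <- list(M1 = setNames(rep(1/6, 6), 1:6))

# Give every component the same allele set before copying
l2 <- setFreqDatabase(l1, database = db)

x4 <- nuclearPed(father = "B", mother = "C", children = "D")
y <- transferMarkers(from = l2, to = x4)

# Genotypes of the founders after the transfer
g <- as.vector(getGenotypes(y, ids = c("B", "C"), markers = "M1"))
print(g)
```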

magnusdv commented 6 months ago

Thanks. Problems like this may indeed occur when pedigree components are defined or modified separately.

A simpler and safer way to create the toy example is:

library(pedtools)

singletons(c("A", "B", "C")) |> 
  addMarker(A = "1/2", B = "3/4", C = "5/6", alleles = 1:6, name = "M1")
#> [[1]]
#>  id fid mid sex  M1
#>   A   *   *   1 1/2
#> 
#> [[2]]
#>  id fid mid sex  M1
#>   B   *   *   1 3/4
#> 
#> [[3]]
#>  id fid mid sex  M1
#>   C   *   *   1 5/6

Anyway, I do agree that your example shows unexpected behaviour. I have added some checks to catch differences between components, so as of db072d5 your original example now gives an error by default:

library(pedtools)

x1 <- singleton("A") |> addMarker("1/2", name = "M1")
x2 <- singleton("B") |> addMarker("3/4", name = "M1")
x3 <- singleton("C") |> addMarker("5/6", name = "M1")
l1 <- list(x1, x2, x3)

x4 <- nuclearPed(father = "B", mother = "C", children = "D") 

transferMarkers(from = l1, to = x4)
#> Error: Marker attributes differ between pedigree components

Created on 2024-03-01 with reprex v2.1.0
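If the components do need to be built separately, one way to keep them consistent is to pass the same `alleles` vector in every `addMarker()` call. This is a sketch under the assumption that identical allele sets (with the default uniform frequencies) are enough to satisfy the new check; the helper variable `als` is introduced here for illustration.

```r
library(pedtools)

# Shared allele set, declared once and reused for every component
als <- 1:6

x1 <- singleton("A") |> addMarker(A = "1/2", alleles = als, name = "M1")
x2 <- singleton("B") |> addMarker(B = "3/4", alleles = als, name = "M1")
x3 <- singleton("C") |> addMarker(C = "5/6", alleles = als, name = "M1")
l1 <- list(x1, x2, x3)

x4 <- nuclearPed(father = "B", mother = "C", children = "D")

# Only B and C occur in both `from` and `to`, so only they receive genotypes
res <- transferMarkers(from = l1, to = x4)
geno <- as.vector(getGenotypes(res, ids = c("B", "C"), markers = "M1"))
print(geno)
```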