gaynorr / AlphaSimR

R package for breeding program simulations
https://gaynorr.github.io/AlphaSimR/
Other
43 stars 18 forks source link

Change "allele" parameter in editGenome() to "genotype"? #65

Closed janaobsteter closed 2 years ago

janaobsteter commented 2 years ago

While developing the extension for bees, we wanted to change the genotype at a specific locus, but to a particular heterozygous state. For this, we have written our own function, but it would be much better if this was allowed in AlphaSimR. This would be solved if the editGenome() would take genotype instead of allele argument. However, it would also be good if the genotype argument could be a list - one genotype for each individual in the population.

gregorgorjanc commented 2 years ago

@janaobsteter can you initiate a PR for this against the most recent AlphaSimR?

janaobsteter commented 2 years ago

Do I put our function in it?

gregorgorjanc commented 2 years ago

Yes! Do make sure it’s general;)))

gaynorr commented 2 years ago

I'd be glad to include your code in the next version. A small word of warning, you'll have to consider phase and not just genotype if you want something complete. You also need to be able to do this for autopolyploids. I've found these details make it tricky to write something that is intuitive.

If you are only interested in modifying the population at the beginning of the simulation, it might be best to unpack a founder population, make changes, and then package it back up. The best way to do this currently is shown below. I could make this simpler and more intuitive by using the importHaplo function instead. This function uses data.frames instead of lists. However, I need to add a function to convert the genetic map in list format to a genetic map in data.frame format. This functionality currently exists for SNPs and QTL, but not the whole genome. Let me know if you think this would be helpful.

library(AlphaSimR)

# Create a new founder population
founderPop = quickHaplo(nInd=5, nChr=3, segSites=c(2,3,5))

# Extract haplotype data to a list
haplotypes = vector("list", founderPop@nChr)
for(i in 1:founderPop@nChr){
  haplotypes[[i]] = pullSegSiteHaplo(founderPop, 
                                     chr=i, 
                                     asRaw=TRUE) # Not required, only reduces memory usage
}

## Make your changes to the haplotype data ##

# Extract genetic map (in list format)
genMap = founderPop@genMap

# Manually create MapPop
founderPop2 = newMapPop(genMap=genMap, haplotypes=haplotypes)

# Check the haplotypes are equal (assuming no changes)
all.equal(pullSegSiteHaplo(founderPop), pullSegSiteHaplo(founderPop2))
gregorgorjanc commented 2 years ago

Chris,

Thanks!

For completeness reasons I do think it would be good to have the importHaplo work with the whole genome so that one could bring in whole-genome sequence haplotypes from any other source that quickHaplo or runMacs.

Let us discuss how we want to proceed with our csd "editing" plans. However, expanding the edit function to work with the genotype and not just allele at a single locus is straightforward and would make the function more relevant.

@Jana can you create a PR with your code change and let's see where this takes us.

With regards, Gregor

On Mon, Jul 25, 2022 at 23:15:12, Chris Gaynor @.***> wrote:

I'd be glad to include your code in the next version. A small word of warning, you'll have to consider phase and not just genotype if you want something complete. You also need to be able to do this for autopolyploids. I've found these details make it tricky to write something that is intuitive.

If you are only interested in modifying the population at the beginning of the simulation, it might be best to unpack a founder population, make changes, and then package it back up. The best way to do this currently is shown below. I could make this simpler and more intuitive by using the importHaplo function instead. This function uses data.frames instead of lists. However, I need to add a function to convert the genetic map in list format to a genetic map in data.frame format. This functionality currently exists for SNPs and QTL, but not the whole genome. Let me know if you think this would be helpful.

library(AlphaSimR)

Create a new founder population

founderPop = quickHaplo(nInd=5, nChr=3, segSites=c(2,3,5))

Extract haplotype data to a list

haplotypes = vector("list", @.) for(i in @.){ haplotypes[[i]] = pullSegSiteHaplo(founderPop, chr=i, asRaw=TRUE) # Not required, only reduces memory usage }

Make your changes to the haplotype data

Extract genetic map (in list format)

genMap = @.***

Manually create MapPop

founderPop2 = newMapPop(genMap=genMap, haplotypes=haplotypes)

Check the haplotypes are equal (assuming no changes)

all.equal(pullSegSiteHaplo(founderPop), pullSegSiteHaplo(founderPop2))

— Reply to this email directly, view it on GitHub https://github.com/gaynorr/AlphaSimR/issues/65#issuecomment-1194699154, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAFRGSXCPAHLK34MS6Z7LXLVV4GXBANCNFSM54RFT5YA . You are receiving this because you commented.Message ID: @.***>

janaobsteter commented 2 years ago

Chris, thanks for this! I don't have a problem with having lists, but I can see having data frame might be easier for other users (especially if they are piping the haplotypes in from somewhere else). What I wrote for our needs was a function that edits the genome at a single diploid site in a population, so does not cater for polyploidy. I can create a PR and we can upgrade it together to a more general function.