knifecake / leapdna-website

The website and API for the leapdna project.
https://leapdna.org

NorwegianFrequencies and location of markers #3

Open thoree opened 3 years ago

thoree commented 3 years ago

Great app! Would it be possible to include NorwegianFrequencies, the 35 markers available in R Familias and in the R library forrel? It would also be very nice if the chromosome location of markers were available in the app. Just a suggestion for the to-do list!

knifecake commented 3 years ago

Including chromosome location of markers is definitely on the roadmap! As for the Norwegian frequencies available in R Familias, I'll look into it (I need to find a reference and then I can run the data through the same pipeline as the others so that the formats are consistent).

thoree commented 3 years ago

Reassuring roadmap! The source for NorwegianFrequencies is (I don't know if it helps): Dupuy et al. (2013): Frequency data for 35 autosomal STR markers in a Norwegian, an East African, an East Asian and Middle Asian population and simulation of adequate database size. Forensic Science International: Genetics Supplement Series, Volume 4 (1).

knifecake commented 3 years ago

The study by Dupuy et al. is now part of the pipeline and will be added in the next release. A preview is available here: https://next.leapdna.org/explore/study/dupuy2013_norway. I had some trouble finding coordinates for two markers, D11S554 and D17S906, but I hope to find them soon. Also, the paper says that H_obs and H_exp are available online, but I've been unable to extract them from the .fam file available at familias.no (perhaps because I'm using forrel::readFam, which may not support this). I could calculate H_exp myself, but I follow the convention that if a paper reports these values, I use what it says.
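For reference, the H_exp calculation mentioned above can be sketched as follows. This is only an illustration in Python of the standard definition (one minus the sum of squared allele frequencies), not the code used in the leapdna pipeline. The function name, the optional `n_individuals` parameter, and the allele frequencies in the example are all made up for the sketch.

```python
def expected_heterozygosity(freqs, n_individuals=None):
    """Expected heterozygosity (H_exp) at one marker.

    freqs: dict mapping allele label -> frequency (hypothetical input layout).
    n_individuals: optional number of diploid individuals sampled; if given,
        the usual small-sample correction 2n / (2n - 1) is applied.
    """
    total = sum(freqs.values())
    if total <= 0:
        raise ValueError("allele frequencies must sum to a positive value")

    # Renormalise, since published tables often do not sum to exactly 1
    # because of rounding.
    homozygosity = sum((f / total) ** 2 for f in freqs.values())
    h_exp = 1.0 - homozygosity

    if n_individuals is not None:
        n_chromosomes = 2 * n_individuals
        h_exp *= n_chromosomes / (n_chromosomes - 1)
    return h_exp


# Made-up frequencies for a three-allele marker (not from any study).
example = {"12": 0.5, "13": 0.3, "14": 0.2}
print(round(expected_heterozygosity(example), 4))  # 0.62
```

Whether a given paper applies the small-sample correction is something to check case by case, which is one reason to prefer the reported values when they exist.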

Chromosome location of markers will be available in the next release of leapdna, which can be previewed at https://next.leapdna.org. Frequency studies downloaded in the leapdna format will include coordinates relative to GRCh38 when they are known. I think these can be especially useful for mutation models and for crossover simulations, which I know some R libraries perform. My intention is to make it easy to read leapdna files in R by writing and sharing a couple of functions.

thoree commented 3 years ago

Works nicely - thank you so much!