GELOG / adam-ibs

Ports the IBS/MDS/IBD functionality of Plink to Spark / ADAM
Apache License 2.0

Calculate the pairwise IBD/IBS metrics - Part II (--genome-full) #4

Open davidonlaptop opened 9 years ago

davidonlaptop commented 9 years ago

Description

Similar to plink's --genome-full option. See the wiki page on the IBS-MDS Process and the diagram for the Genome file.

More information on the --genome and --genome-full options can be found in the Pairwise IBD estimation section of the plink manual.

The input files are those created in #2.

This feature completes feature #3 by adding the missing fields.
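
As a rough illustration of the kind of per-pair counts that --genome-full reports (IBS0, IBS1 and IBS2 over non-missing loci), here is a minimal sketch in plain Scala. The `IbsCounts` type, its method names and the genotype coding (`Option[Int]` holding an allele count of 0, 1 or 2, with `None` for missing) are assumptions made for this example and are not taken from the adam-ibs code base. The HOMHOM and HETHET fields belong to plink's PPC test and are left out here.

```scala
// Hypothetical in-memory model; names are illustrative only.
final case class IbsCounts(ibs0: Long, ibs1: Long, ibs2: Long) {
  // Associative merge, so partial counts can be combined in any order.
  def +(other: IbsCounts): IbsCounts =
    IbsCounts(ibs0 + other.ibs0, ibs1 + other.ibs1, ibs2 + other.ibs2)
}

object IbsCounts {
  val empty: IbsCounts = IbsCounts(0L, 0L, 0L)

  // Assumed coding: each genotype is the count (0, 1 or 2) of one allele,
  // and None marks a missing genotype.
  def count(a: Seq[Option[Int]], b: Seq[Option[Int]]): IbsCounts = {
    require(a.length == b.length, "both individuals must cover the same loci")
    a.zip(b).foldLeft(empty) {
      case (acc, (Some(x), Some(y))) =>
        // Number of alleles shared identical-by-state at this locus.
        val shared = 2 - math.abs(x - y)
        shared match {
          case 0 => acc + IbsCounts(1L, 0L, 0L)
          case 1 => acc + IbsCounts(0L, 1L, 0L)
          case _ => acc + IbsCounts(0L, 0L, 1L)
        }
      case (acc, _) => acc // skip loci where either genotype is missing
    }
  }
}
```

Because `+` is associative, the same type could serve as the value in a Spark `reduceByKey` keyed by sample pair, although that wiring is not shown here.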

Analysis

Add a comment to this issue describing how this will be implemented in Spark and how it differs from plink.

Also update the class diagram on the wiki page describing the Plink formats (where it is incomplete), and add a class diagram describing the models implemented in Scala for this feature to the wiki page on the MGL804 formats.

Implementation

The implementation should use:

Important note: The model can live in memory only for now, but it will need to be integrated into the ADAM format later on. You'll probably need to create a new record type.
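
One possible shape for such an in-memory record is sketched below. The class name `GenomeFullFields` and its layout are hypothetical. The field names follow the plink column names for the extra --genome-full output (IBS0, IBS1, IBS2, HOMHOM, HETHET), and the `dst` helper follows the DST definition in the plink manual. An eventual ADAM integration would most likely need a matching schema, which is not attempted here.

```scala
// Hypothetical in-memory record for one pair of samples; not an ADAM type.
final case class GenomeFullFields(
    fid1: String, iid1: String, // family and individual ID of the first sample
    fid2: String, iid2: String, // family and individual ID of the second sample
    ibs0: Long, ibs1: Long, ibs2: Long, // IBS 0/1/2 counts over non-missing loci
    homHom: Long, hetHet: Long) { // SNP-pair counts used by plink's PPC test

  def nonMissingLoci: Long = ibs0 + ibs1 + ibs2

  // plink DST: (IBS2 + 0.5 * IBS1) / number of non-missing SNP pairs.
  // Returns None when the pair shares no non-missing locus.
  def dst: Option[Double] =
    if (nonMissingLoci == 0L) None
    else Some((ibs2 + 0.5 * ibs1) / nonMissingLoci)
}
```

Keeping the record as an immutable case class means it can be held in an RDD as-is while the model stays in memory.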