nmdp-bioinformatics / pipeline

Consensus assembly and allele interpretation pipeline.
GNU Lesser General Public License v3.0

Add cross-validation script #88

Closed ckennedy-nmdp closed 9 years ago

ckennedy-nmdp commented 9 years ago

Cross-validates expected and observed interpreted alleles. The expected file is tab-delimited with the following fields: (i) absolute path to the aligned consensus sequences (these will be present in final/ upon successful pipeline completion); (ii) locus; (iii) absolute path to the interpreted region file; (iv) expected zygosity; (v) first interpreted allele in HLA nomenclature; (vi) second interpreted allele in HLA nomenclature.
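As an illustration of the expected-file layout described above, here is a minimal Python sketch of a reader for it. It is not the script added in this PR; the names `Expected` and `read_expected`, the sample paths, and the way zygosity is written ("heterozygous") are all assumptions made for the example.

```python
import io
from typing import List, NamedTuple


class Expected(NamedTuple):
    """One row of the expected file (field names are illustrative)."""
    consensus_path: str  # (i) absolute path to aligned consensus sequences
    locus: str           # (ii) locus
    region_path: str     # (iii) absolute path to interpreted region file
    zygosity: str        # (iv) expected zygosity (encoding is an assumption)
    allele1: str         # (v) first interpreted allele
    allele2: str         # (vi) second interpreted allele


def read_expected(handle) -> List[Expected]:
    """Parse tab-delimited expected rows, rejecting rows without six fields."""
    rows = []
    for number, line in enumerate(handle, start=1):
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) != 6:
            raise ValueError(f"line {number}: expected 6 fields, got {len(fields)}")
        rows.append(Expected(*fields))
    return rows


# Hypothetical sample row; paths and alleles are made up for illustration.
SAMPLE_EXPECTED = (
    "/work/final/sample1.bam\tHLA-A\t/work/regions/hla-a.txt\t"
    "heterozygous\tHLA-A*01:01:01:01\tHLA-A*02:01:01:01\n"
)

expected_rows = read_expected(io.StringIO(SAMPLE_EXPECTED))
```

Rejecting rows with the wrong field count up front keeps a malformed line from silently shifting alleles into the wrong column.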

The observed file is tab-delimited with the following fields: (i) absolute path to the aligned consensus sequences (this provides the mapping to the corresponding expected entry above); (ii) GL String representing the interpreted allele (one per consensus sequence); (iii) the consensus sequence itself.
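Because the consensus path in the first column is what ties an observed row to its expected entry, one way to read the observed file is to group rows by that path. The sketch below assumes this approach; `read_observed` and the sample rows are hypothetical and not taken from the PR.

```python
import io
from collections import defaultdict
from typing import Dict, List, Tuple


def read_observed(handle) -> Dict[str, List[Tuple[str, str]]]:
    """Group observed (GL String, consensus sequence) pairs by consensus path."""
    by_path = defaultdict(list)
    for number, line in enumerate(handle, start=1):
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(f"line {number}: expected 3 fields, got {len(fields)}")
        path, gl_string, sequence = fields
        by_path[path].append((gl_string, sequence))
    return dict(by_path)


# Hypothetical rows: two consensus sequences reported for the same path.
SAMPLE_OBSERVED = (
    "/work/final/sample1.bam\tHLA-A*01:01:01:01\tACGTACGT\n"
    "/work/final/sample1.bam\tHLA-A*02:01:01:01/HLA-A*02:01:01:02L\tACGTTCGT\n"
)

observed = read_observed(io.StringIO(SAMPLE_OBSERVED))
```

Each key of `observed` can then be looked up against the first field of the expected file to pair the two records.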

The 'z' parameter specifies the field resolution at which expected and observed interpreted alleles are considered a valid match.
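A comparison at a chosen resolution could look like the following Python sketch. It rests on assumptions that the PR description does not confirm: that the resolution is a count of colon-delimited fields in the allele name, that ambiguous alleles in a GL String are separated by "/", and that a match on any one of them counts as valid. The helpers `truncate` and `matches` are hypothetical names.

```python
def truncate(allele: str, fields: int) -> str:
    """Keep only the first `fields` colon-delimited fields of an allele name."""
    locus, star, name = allele.partition("*")
    if not star:
        return allele  # no '*' separator, so nothing to truncate
    return locus + "*" + ":".join(name.split(":")[:fields])


def matches(expected: str, gl_string: str, fields: int) -> bool:
    """True if any '/'-separated allele in gl_string equals the expected
    allele once both are truncated to `fields` fields (assumed semantics)."""
    target = truncate(expected, fields)
    return any(truncate(a, fields) == target for a in gl_string.split("/"))


# Agreement at two fields, disagreement at three.
two_field = matches("HLA-A*01:01:01:01", "HLA-A*01:01:02/HLA-A*01:02", 2)
three_field = matches("HLA-A*01:01:01:01", "HLA-A*01:01:02/HLA-A*01:02", 3)
```

Under these assumptions a lower field count is more permissive: alleles that differ only in later fields still compare as equal.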