Closed david4096 closed 8 years ago
Variant set messages appear as:
message VariantSet {
  // The variant set ID.
  string id = 1;

  // The variant set name.
  string name = 2;

  // The ID of the dataset this variant set belongs to.
  string dataset_id = 3;

  // The ID of the reference set that describes the sequences used by the
  // variants in this set.
  string reference_set_id = 4;

  // Optional metadata associated with this variant set.
  // This array can be used to store information about the variant set, such as
  // information found in VCF header fields, that isn't already available in
  // first class fields such as "name".
  repeated VariantSetMetadata metadata = 5;
}
The most challenging part is translating the info field items into variant set metadata fields. The variant set metadata holds one record for each kind of data that is not represented in the variant message itself. For example, you could provide a further description for the pathogenicity key in a variant set metadata item:
key: "pathogenicity"
value: 1
id: "variant-set-id+pathogenicity"
type: "integer"
number: 1
description: Pathogenicity as defined by the brca exchange
This variant set metadata element would then be appended to the metadata field of a VariantSet message. The descriptions of the annotations are listed here: https://github.com/BD2KGenomics/brca-website/blob/master/content/help_research.md
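As a rough illustration of the append step, here is a minimal Python sketch that uses plain dicts in place of the real protobuf classes. The helper name `make_metadata` and the placeholder values for `name`, `dataset_id` and `reference_set_id` are assumptions made for this example, not part of the schemas.

```python
def make_metadata(variant_set_id, key, value, type_, number, description):
    """Build one VariantSetMetadata-like record as a plain dict.

    Hypothetical helper: the real server would use the generated
    protobuf class instead of a dict.
    """
    return {
        "key": key,
        "value": value,
        # Same "<variant set id>+<key>" pattern as the example above.
        "id": "{}+{}".format(variant_set_id, key),
        "type": type_,
        "number": number,
        "description": description,
    }


# Placeholder variant set; every value here is an assumption.
variant_set = {
    "id": "variant-set-id",
    "name": "example-name",
    "dataset_id": "example-dataset",
    "reference_set_id": "example-reference-set",
    "metadata": [],
}

# Append the pathogenicity record to the repeated metadata field.
variant_set["metadata"].append(
    make_metadata(
        variant_set["id"],
        key="pathogenicity",
        value=1,
        type_="integer",
        number=1,
        description="Pathogenicity as defined by the brca exchange",
    )
)
```

Keeping the ID construction in one helper means every metadata record follows the same naming pattern, whichever column it comes from.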
Implement variantsets/search according to the schemas. For now it will return a single hardcoded variant set. If you're not sure about any fields, ask here.
The hardest part of this task is that we want every info field key from the columns to appear in the metadata.
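One possible shape for the hardcoded endpoint, sketched in Python with plain dicts. Everything named here is an assumption for illustration: the column names, the placeholder IDs, the default `type` and `number` values, and the `variant_sets` / `next_page_token` response keys (chosen to mirror the snake_case style of the message above) should all be checked against the schemas.

```python
# Hypothetical column names; the real list comes from the data columns.
COLUMNS = ["Gene_Symbol", "Pathogenicity"]

VARIANT_SET_ID = "hardcoded-variant-set"  # placeholder ID


def metadata_from_columns(variant_set_id, columns):
    """Create one metadata record per info field key (column name)."""
    return [
        {
            "key": column,
            "value": "",
            "id": "{}+{}".format(variant_set_id, column),
            "type": "string",   # placeholder; set the real type per column
            "number": "1",      # placeholder; set the real count per column
            "description": "",  # fill in from the list of descriptions
        }
        for column in columns
    ]


# The single hardcoded variant set; the other field values are placeholders.
VARIANT_SET = {
    "id": VARIANT_SET_ID,
    "name": "example-name",
    "dataset_id": "example-dataset",
    "reference_set_id": "example-reference-set",
    "metadata": metadata_from_columns(VARIANT_SET_ID, COLUMNS),
}


def search_variant_sets(request):
    """Return the hardcoded variant set if the request's dataset matches.

    A request without a dataset_id is treated as matching, which is a
    simplification made for this sketch.
    """
    dataset_id = request.get("dataset_id")
    if dataset_id in (None, "", VARIANT_SET["dataset_id"]):
        matches = [VARIANT_SET]
    else:
        matches = []
    return {"variant_sets": matches, "next_page_token": ""}
```

For example, `search_variant_sets({"dataset_id": "example-dataset"})` returns the one hardcoded set, while a request for any other dataset ID returns an empty list.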