chop-dbhi / cilantro

Official client for Harvest (http://harvest.research.chop.edu)
http://cilantro.harvest.io

Handling of labels mapping to multiple values in Lexicon fields #749

Open murphyke opened 9 years ago

murphyke commented 9 years ago

To what extent should Harvest support Lexicon fields with labels mapping to multiple values, or should this be treated as an error or warning condition (showing up in avocado check, perhaps)?
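For illustration, such a check could amount to grouping the lexicon rows by label and reporting any label that has more than one value. This is a minimal sketch in plain Python, assuming the rows are available as (value, label) pairs. The function name `find_ambiguous_labels` and the sample rows are hypothetical and are not part of Avocado or django-lexicon:

```python
from collections import defaultdict


def find_ambiguous_labels(rows):
    """Return {label: [values]} for every label that maps to more than one value."""
    values_by_label = defaultdict(list)
    for value, label in rows:
        values_by_label[label].append(value)
    return {
        label: values
        for label, values in values_by_label.items()
        if len(values) > 1
    }


# Hypothetical rows; the actual Genotype lexicon contents are not reproduced here.
sample_rows = [(1, "Hom Ref"), (2, "Het Ref"), (3, "Het Ref"), (4, "Hom Alt")]
ambiguous = find_ambiguous_labels(sample_rows)
print(ambiguous)
```

Whether a non-empty result should be reported as an error or only as a warning is exactly the open question above.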

An example is the Genotype Lexicon from the Varify Data Warehouse, which is populated such that two of the labels each map to two values.

What currently happens depends on whether the field is enumerable. If the genotype_id field (vdw.samples.Result) is enumerable, the histogram widget shows a separate entry for each redundant label, and the user has no way to tell them apart. (This is what Varify users experience.)

If the field is not enumerable, then just one instance of each redundant label is shown in the search list, which is worse: the user might think she is filtering on all 'Het Ref' genotypes, but in fact is matching only some of them.

It's possible to imagine Harvest automatically translating between a single label and multiple values, but ... should it?
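To make the idea concrete, automatic translation would mean expanding each selected label into every value that carries it before the filter is built. This is only a rough sketch in plain Python, not how Harvest or Avocado actually builds queries. The helper `expand_labels` and the sample rows are hypothetical:

```python
def expand_labels(selected_labels, rows):
    """Return every value whose label was selected, in row order."""
    wanted = set(selected_labels)
    return [value for value, label in rows if label in wanted]


# Hypothetical (value, label) rows in which one label maps to two values.
rows = [(1, "Hom Ref"), (2, "Het Ref"), (3, "Het Ref"), (4, "Hom Alt")]

# Selecting the single label 'Het Ref' yields both underlying values, so the
# resulting condition would be "value IN (2, 3)" instead of "value = 2".
values = expand_labels(["Het Ref"], rows)
print(values)
```

The cost of this approach is that the redundancy stays in the lexicon and is merely hidden from the user.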

bruth commented 9 years ago

A lexicon should have distinct values with distinct labels. It is up to the maintainer of the lexicon to disambiguate or merge the values in the lexicon itself. On a related note, django-lexicon integration has been removed in Avocado 2.4 since supplementary fields have been introduced.