laceysanderson closed 5 years ago
The Chart should also have the following options for the user:
Handle different units properly. Specifically, ensure that units are never combined and that the user can select which one they want to see.
Units & methods are now handled properly
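As a rough illustration of the "never combine units" rule, here is a minimal Python sketch. It is not the module's actual code: the function names and the `(unit, value)` input shape are assumptions made for the example. Measurements are grouped by unit first, and only the values for the single selected unit are handed to the chart.

```python
from collections import defaultdict


def split_by_unit(measurements):
    """Group (unit, value) pairs by unit so different units are never pooled."""
    by_unit = defaultdict(list)
    for unit, value in measurements:
        by_unit[unit].append(value)
    return dict(by_unit)


def values_for_chart(measurements, selected_unit=None):
    """Return (unit, values) for the unit the user picked.

    Falls back to the alphabetically first unit when nothing is selected
    (an arbitrary default chosen for this sketch).
    """
    by_unit = split_by_unit(measurements)
    if not by_unit:
        return None, []
    if selected_unit is None:
        selected_unit = sorted(by_unit)[0]
    return selected_unit, by_unit.get(selected_unit, [])
```

The same idea should extend to methods by keying on a `(method, unit)` pair instead of the unit alone.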
Violin plots are complete! Now to move on to qualitative trait plots.
Use a grouped bar chart where the x-axis is the categories, the series are the site years and the y-axis is the number of germplasm showing that phenotype. Since there is already a well-tested d3.js chart for this, it will be fast to implement and I think it will be intuitive to users.
I would love to make vertical bar charts! These would mimic the layout of the violin plots, with site years on the x-axis, categories on the y-axis and the number of germplasm as the length of the bar. This would make the charts less disorienting when switching between traits. It would also be less cluttered and, I think, make it easier to compare categories. However, I have yet to find such a chart, so it would be a labour of love.
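Both layouts need the same underlying numbers: the count of germplasm per category per site year. A minimal Python sketch of that tally follows, assuming records arrive as `(site_year, category, germplasm)` tuples. That shape and the function name are hypothetical and not taken from the module.

```python
from collections import Counter


def count_germplasm(records):
    """Count distinct germplasm for each (site_year, category) pair.

    Duplicate records (e.g. replicates of the same germplasm showing the
    same category in the same site year) are counted once.
    """
    unique = set(records)
    return Counter((site_year, category) for site_year, category, _ in unique)
```

For the grouped bar chart, each site year becomes a series and each category an x-axis group. For the second layout the same counts are simply laid out with site years along the x-axis.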
How do we tell which traits are qualitative and which are quantitative?
My concern about choice 1 is that it's a lot of configuration for admin since there will likely be many units. However, choice 2 gives them less control.
Plan: option 2
The current materialized view averages replicates, which throws an error with qualitative data.
Ideal solution: add a qualitative property to the unit.
Current solution: check the unit name for "scale".
SELECT
  o.genus AS organism_genus,
  trait.cvterm_id AS trait_id,
  trait.name AS trait_name,
  proj.project_id AS project_id,
  proj.name AS project_name,
  method.cvterm_id AS method_id,
  method.name AS method_name,
  unit.cvterm_id AS unit_id,
  unit.name AS unit_name,
  loc.value AS location,
  yr.value AS year,
  s.stock_id AS germplasm_id,
  s.name AS germplasm_name,
  CASE
    WHEN unit.name ~ 'scale' THEN array_to_string(array_agg(DISTINCT p.value), '/')
    ELSE CAST(avg(p.value::float) AS text)
  END AS mean
FROM chado.phenotype p
  LEFT JOIN chado.cvterm trait ON trait.cvterm_id = p.attr_id
  LEFT JOIN chado.project proj USING(project_id)
  LEFT JOIN chado.cvterm method ON method.cvterm_id = p.assay_id
  LEFT JOIN chado.cvterm unit ON unit.cvterm_id = p.unit_id
  LEFT JOIN chado.stock s USING(stock_id)
  LEFT JOIN chado.organism o ON o.organism_id = s.organism_id
  LEFT JOIN chado.phenotypeprop loc ON loc.phenotype_id = p.phenotype_id AND loc.type_id = 2940
  LEFT JOIN chado.phenotypeprop yr ON yr.phenotype_id = p.phenotype_id AND yr.type_id = 141
GROUP BY
  trait.cvterm_id,
  trait.name,
  proj.project_id,
  proj.name,
  method.cvterm_id,
  method.name,
  unit.cvterm_id,
  unit.name,
  loc.value,
  yr.value,
  s.stock_id,
  s.name,
  o.genus;
Switched to storing the values as JSONB in the materialized view and then calculating the mean in the JSON callback. This provides much more flexibility in how to calculate the value for qualitative traits.
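To illustrate the idea (not the actual callback), here is a small Python sketch. It assumes the view stores the replicate values as a JSON array of strings and that qualitative units are still detected by "scale" in the unit name. The function name and the tie-breaking rule are made up for the example.

```python
import json
from collections import Counter
from statistics import mean


def summarise(values_json, unit_name):
    """Summarise replicate values stored as a JSON array.

    Quantitative units: arithmetic mean of the replicates.
    Qualitative units (name contains 'scale'): the most frequently
    observed value; ties are joined with '/'.
    """
    values = json.loads(values_json)
    if not values:
        return None
    if "scale" in unit_name:
        counts = Counter(str(v) for v in values)
        top = max(counts.values())
        return "/".join(sorted(k for k, c in counts.items() if c == top))
    return mean(float(v) for v in values)
```

Because the raw replicates stay in the view, the qualitative rule can be swapped (mode, all distinct values, per-category counts) without rebuilding the view.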
Current ToDo list:
NOTE: Requires that you start fresh with data.
Moved qualitative chart into new issue.
We should use true violin plots for quantitative (numeric, continuous) traits (e.g. plant height, days to flowering?).
D3.js implementation: http://bl.ocks.org/asielen/92929960988a8935d907e39e60ea8417
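For context, the outline of a violin is usually a kernel density estimate of the values, mirrored about the axis. The Python sketch below shows a plain Gaussian KDE. It is only meant to show what gets computed, it is not taken from the linked D3.js implementation, and the fixed `bandwidth` argument is an assumption (real implementations often choose it from the data).

```python
import math


def gaussian_kde(values, grid, bandwidth):
    """Estimate the density of `values` at each point in `grid`.

    Each observation contributes a Gaussian bump of width `bandwidth`;
    the violin half-width at a grid point is proportional to the result.
    """
    n = len(values)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        for x in grid
    ]
```

Plotting the density on both sides of the axis for each site year gives the violin shape. A smaller bandwidth shows more detail but also more noise.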