VEuPathDB / EdaNewIssues

Line plot master ticket: markers and connection lines #602

Open danicahelb opened 1 year ago

danicahelb commented 1 year ago

This is the master ticket for the related line plot tickets https://github.com/VEuPathDB/web-eda/issues/1411, https://github.com/VEuPathDB/web-eda/issues/1519 and https://github.com/VEuPathDB/web-eda/issues/1082, because fixes to one issue are having downstream effects and creating other issues.

I will use this hypothetical example:

X-axis variable: Timepoint

Y-axis variable: Malaria

This is what we want:

  1. No markers

    • There should be no markers whenever there is no data for a given X-axis value in the entire dataset, or whenever there is no data for a given X-axis value in the subset, regardless of how the Y-axis proportion is configured
    • Ex: no markers at Timepoint = 2, 3. Lines should connect Timepoint 1 to Timepoint 4.
  2. Filled markers

    • There should be filled markers whenever there IS data for a given X-axis value as long as there is at least 1 value for the Y-axis in the subset AND the Y-axis proportion contains all possible values in the denominator
    • Ex: no markers at Timepoint = 2, 3 and filled markers at Timepoint = 1, 4, 5, 6, and 7 when Malaria proportion = Yes/(Yes+No+Not tested). Lines should connect Timepoint 1 to Timepoint 4 to Timepoint 5 to Timepoint 6 to Timepoint 7
  3. Hollow markers

    • There should be hollow markers whenever there IS data for a given X-axis value as long as there is at least 1 value for the Y-axis in the subset BUT the Y-axis proportion does NOT contain all possible values in the denominator
    • Ex: no markers at Timepoint = 2, 3, a hollow marker at Timepoint = 4, and filled markers at Timepoint = 1, 5, 6, and 7 when Malaria proportion = Yes/(Yes+No). There should be a break in the line whenever there is a hollow marker. So the filled marker at Timepoint 1 would NOT be connected to Timepoint 4, and Timepoint 4 would not be connected to Timepoint 5. But a line would connect Timepoint 5 to Timepoint 6 to Timepoint 7

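
The three marker rules above can be sketched as a small classifier. This is only an illustration, not the web-eda implementation: the function name `classify_marker` and all row counts are hypothetical, and it assumes (per the clarification further down this thread) that a hollow marker means the proportion works out to 0/0.

```python
from collections import Counter

def classify_marker(counts, denominator):
    """Return 'none', 'hollow', or 'filled' for one X-axis value.

    counts: Counter of Y-axis values among the subset rows at this X value.
    denominator: set of Y-axis values included in the proportion's denominator.
    """
    if sum(counts.values()) == 0:
        return "none"    # no rows in the subset at this X value
    if sum(counts[v] for v in denominator) == 0:
        return "hollow"  # rows exist, but the proportion is 0/0
    return "filled"

# Hypothetical subset counts per Timepoint (made up for illustration only).
subset = {
    1: Counter({"Yes": 3, "No": 5}),
    2: Counter(),                      # no data
    3: Counter(),                      # no data
    4: Counter({"Not tested": 4}),     # only 'Not tested' rows
    5: Counter({"Yes": 1, "No": 6, "Not tested": 2}),
    6: Counter({"No": 4}),             # proportion is a genuine 0
    7: Counter({"Yes": 2, "No": 2}),
}

full_denominator = {"Yes", "No", "Not tested"}
partial_denominator = {"Yes", "No"}

markers_full = {t: classify_marker(c, full_denominator) for t, c in subset.items()}
markers_partial = {t: classify_marker(c, partial_denominator) for t, c in subset.items()}
```

With these made-up counts, Timepoint 4 is filled under Yes/(Yes+No+Not tested) and hollow under Yes/(Yes+No), while Timepoint 6 stays filled in both cases because its proportion is a real 0 rather than 0/0.
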
danicahelb commented 1 year ago

https://github.com/VEuPathDB/web-eda/issues/1411 was created because markers were appearing at X-axis values that were not in the subset, which made it look like the Y-axis proportion for these X-axis values was equal to 0, when in fact there were no rows of data for these X-axis values.

https://github.com/VEuPathDB/web-eda/issues/1519 was created because a connecting line should be drawn through the markers. Lines should be drawn even in instances where an X-axis value has data in the full dataset but does NOT contain a marker because it does not have data in the subset (In that case, the line should be drawn between the 2 markers immediately adjacent to the X-axis value that does not have data in the subset).

Unrelated to these tickets, https://github.com/VEuPathDB/web-eda/issues/1082 was created because markers for X-axis values that were IN the subset but did not have any Y-axis values included in the proportion calculation (i.e., had an undefined proportion, 0/0) looked like they had a proportion equal to 0. We wanted some sort of marker at these X-axis values to indicate that the subset does contain data for that X-axis value and that the proportion configuration does not remove this data from the subset. Connecting lines should NEVER go through hollow markers. If 2 filled markers are separated by a hollow marker, there will be a break in the line.
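
The connection rules described in these three tickets could be sketched roughly as follows. The function name `line_segments` and the marker labels are hypothetical, chosen only for this illustration: X values with no marker are skipped so the line passes over them, while a hollow marker ends the current line segment.

```python
def line_segments(markers):
    """Group X values into runs of filled markers that one line should join.

    markers: list of (x, kind) pairs sorted by x, where kind is
             'none', 'hollow', or 'filled' (labels assumed for this sketch).
    """
    segments = []
    current = []
    for x, kind in markers:
        if kind == "filled":
            current.append(x)
        elif kind == "hollow":
            # A hollow marker always breaks the line.
            if current:
                segments.append(current)
            current = []
        # kind == 'none': no marker here; the line simply skips this X value.
    if current:
        segments.append(current)
    return segments

# Malaria proportion = Yes/(Yes+No): hollow marker at Timepoint 4.
partial = [(1, "filled"), (2, "none"), (3, "none"), (4, "hollow"),
           (5, "filled"), (6, "filled"), (7, "filled")]

# Malaria proportion = Yes/(Yes+No+Not tested): Timepoint 4 is filled.
full = [(1, "filled"), (2, "none"), (3, "none"), (4, "filled"),
        (5, "filled"), (6, "filled"), (7, "filled")]

segments_partial = line_segments(partial)  # [[1], [5, 6, 7]]
segments_full = line_segments(full)        # [[1, 4, 5, 6, 7]]
```

A segment containing a single X value, such as `[1]` here, is drawn as an isolated filled marker with no line attached.
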

danicahelb commented 1 year ago

"the Y-axis proportion does NOT contain all possible values in the denominator"

This means the same thing as:

"the Y-axis proportion does NOT contain data for any selected values in the denominator"

For example, y values are A, B, C and the proportion is configured as A/(A+B) instead of A/(A+B+C).

Since the numerator values must be a subset of the denominator values, C is excluded from the numerator as well as the denominator. Therefore rows with y=C will be excluded from the plot (though they remain in the subset).

There can be some values of x where y is only ever equal to C. At these points, y=0/0.

These points where y=0/0 are given hollow markers and no connecting line through them.

The only time hollow unconnected points are ever used is where y=0/0.
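
As a worked illustration of the A/(A+B) case, here is a minimal sketch of the proportion calculation. The function name `proportion` and the sample rows are hypothetical; the undefined 0/0 case is represented as `None` so it cannot be confused with a genuine proportion of 0.

```python
def proportion(y_values, numerator, denominator):
    """Proportion of rows whose y is in `numerator`, among rows whose y is
    in `denominator`. Returns None when the result would be 0/0.
    """
    if not numerator <= denominator:
        raise ValueError("numerator values must be a subset of denominator values")
    num = sum(1 for y in y_values if y in numerator)
    den = sum(1 for y in y_values if y in denominator)
    return None if den == 0 else num / den

# Proportion configured as A/(A+B); rows with y=C are ignored by the calculation.
only_c = proportion(["C", "C", "C"], {"A"}, {"A", "B"})   # None: 0/0, hollow marker
only_b = proportion(["B", "B", "C"], {"A"}, {"A", "B"})   # 0.0: a real zero, filled marker
mixed = proportion(["A", "B", "C"], {"A"}, {"A", "B"})    # 0.5
```

Keeping `None` distinct from `0.0` is what lets a plot draw a hollow, unconnected marker for the first case and an ordinary filled marker for the second.
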

bobular commented 1 year ago

Hi @danicahelb - thank you so much for consolidating everything here.

I think there could be some more discussion either in EDA UX or data viz about breaking the lines.

We may want different behaviour depending on direct vs. indirect filtering (e.g. the new filter-aware behaviours we've talked about)

Not 100% sure of the reasoning myself, and only briefly discussed with Danielle, so I suggest it goes to committee!