Closed john-harrold closed 1 year ago
Tentative solutions: Try adding hover-over text on column headers to show information like data type, factor, etc. Include:
An alternative or complementary solution is to change the background colors of headers or columns of data to indicate the data type to the user.
Formatting headers:
Tooltips with headers: https://gist.github.com/timelyportfolio/b8001318ce3e25b6920a0f20e9db374e
Hey @billdenney. How does this look:
It's configured in the YAML file below under labels. If df is your data frame, then you put whatever typeof(df$colname) returns under data_types below. So defaults can be created in the package and then customized by the user if needed.
labels:
  data_types:
    character:
      color: "green"
      label: "text"
    double:
      color: "blue"
      label: "num"
    other:
      color: "black"
      label: "other"
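To make the lookup concrete, here is a minimal R sketch of how a column's `typeof()` result could be matched against the `data_types` entries. The function name `lookup_type`, the in-memory list standing in for the parsed YAML, and the fallback to the `other` entry for unlisted types are all assumptions for illustration, not the package's actual code.

```r
# Hypothetical in-memory copy of the labels$data_types section of the config.
data_types <- list(
  character = list(color = "green", label = "text"),
  double    = list(color = "blue",  label = "num"),
  other     = list(color = "black", label = "other")
)

# Look up the display settings for one column. Falling back to "other"
# when typeof() returns a type with no entry is an assumption.
lookup_type <- function(df, colname, data_types) {
  col_type <- typeof(df[[colname]])
  if (!(col_type %in% names(data_types))) {
    col_type <- "other"
  }
  data_types[[col_type]]
}

# Example data frame: character, double, and integer columns.
df <- data.frame(id   = c("a", "b"),
                 conc = c(1.5, 2.5),
                 n    = c(1L, 2L),
                 stringsAsFactors = FALSE)

lookup_type(df, "conc", data_types)$label
lookup_type(df, "n", data_types)$label
```

In this sketch the integer column `n` has no entry of its own, so it picks up the `other` settings; adding an `integer` key to the config would give it its own color and label.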
I can also decrease the font size of the datatype to make it less obtrusive.
I like it with the decreased font size (as long as it remains readable for people who need larger fonts). Perhaps the relative font size could be a yaml option. 😉
Here I just pulled out the entire column format into the yaml file. I'll incorporate it into both the upload data module and the data wrangling module.
Do you have any thoughts on default colors? Blue and green are probably not good given color blindness and all that.
# This controls the overall format of headers for data files with
# the following placeholders surrounded by ===:
# COLOR - font color
# NAME - column name
# LABEL - type label
data_header: "<span style='color:===COLOR==='><b>===NAME===</b><br/><font size='-3'>===LABEL===</font></span>"
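As a rough illustration of how the placeholders might be filled in, the sketch below substitutes `===COLOR===`, `===NAME===`, and `===LABEL===` in the template with literal string replacement. The helper name `fill_header` is hypothetical, and this is only one plausible way to do the substitution, not necessarily how the package does it.

```r
# Template copied from the config entry above.
data_header <- "<span style='color:===COLOR==='><b>===NAME===</b><br/><font size='-3'>===LABEL===</font></span>"

# Hypothetical helper: replace each placeholder with its value.
# fixed = TRUE treats the placeholder as a literal string, not a regex.
fill_header <- function(template, color, name, label) {
  out <- gsub("===COLOR===", color, template, fixed = TRUE)
  out <- gsub("===NAME===",  name,  out,      fixed = TRUE)
  out <- gsub("===LABEL===", label, out,      fixed = TRUE)
  out
}

hdr <- fill_header(data_header, color = "blue", name = "conc", label = "num")
cat(hdr, "\n")
```

Note that this sketch does not escape HTML characters in column names; a real implementation would probably want to handle names containing `<` or `&`.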
Following along this path: do you think it would be helpful to have the same information in the plot aesthetic controls? For example, in the selection box below it's possible to change the colors and show the type information in a smaller font to the right.
Visually it would look something like this but with colors:
For the palette, this page suggests blue/red as the pair to use. Maybe add black as the "other" group, too. (https://www.datylon.com/blog/data-visualization-for-colorblind-readers#color-blind-palette)
For the graphs, I would lean toward making color selection an advanced setting that applies to everything rather than a per-graph option.
What I was talking about was just the formatting of the form elements. Should the form elements that select columns from the dataset also reflect the data type information in the same way as the column headers in the data preview?
By the way this is what it looks like now:
I've overlaid the colorblind view using Sim Daltonism. I'm trying to reuse blues and reds from the buttons/pulldowns. I think it looks pretty good now.
I don't think it's necessary to have the data type in the pulldown, but I'm not firm in that opinion. (I.e. maybe I'd change my mind with more thought.)
I like the look.
I'm thinking something like this when mapping columns to aesthetics in the figure generation module. My thought is that folks won't remember type information from the preview in the data wrangling module. So I can show them when they are selecting the columns:
This is more to give you an idea of what can be included. I'm looking for something like: This shows the user the type of data and for certain types of data (numeric here) it can show you some information about it (min/max).
I'm not sure if I should do anything with text data.
I see use for that.
For text, I would only show unique values if there were just a couple of them. Otherwise, just saying that it's text should be good enough, to me.
Ok this is looking good. In the config file you can do something like this:
subtext: "===LABEL===: ===RANGE==="
The LABEL is numeric, text, or whatever you want to use. For the range, I take the sorted unique values of the column; if there are more than 3, I replace them with LOWER, .... HIGHER. If there are <= 3, I just join them with ", ".
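The range rule described above can be sketched in R roughly as follows. The function names `summarize_range` and `fill_subtext`, the `max_values` argument, and the exact ellipsis separator are assumptions made for illustration; only the "more than 3 unique values" threshold and the `", "` join come from the description.

```r
# Hypothetical helper: summarize a column as described above.
# More than max_values unique values -> "LOWER, ..., HIGHER";
# otherwise the sorted unique values joined with ", ".
summarize_range <- function(x, max_values = 3) {
  vals <- sort(unique(x))  # sort() also drops NA values by default
  if (length(vals) > max_values) {
    paste0(vals[1], ", ..., ", vals[length(vals)])
  } else {
    paste(vals, collapse = ", ")
  }
}

# Hypothetical helper: fill the subtext template from the config.
fill_subtext <- function(template, label, range) {
  out <- gsub("===LABEL===", label, template, fixed = TRUE)
  gsub("===RANGE===", range, out, fixed = TRUE)
}

subtext <- "===LABEL===: ===RANGE==="

fill_subtext(subtext, "num",  summarize_range(c(3, 1, 2, 10)))
fill_subtext(subtext, "text", summarize_range(c("b", "a", "b")))
```

Because the same rule applies to any sortable column, text columns with only a couple of distinct values would list them all, while longer ones collapse to their first and last sorted values.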
I think I'm happy with this.
That looks good to me!
This is done and applied to the headers of the preview tables and the subtext in the column selects in both formods (https://github.com/john-harrold/formods/commit/4220940faf07548cb75078dbe1fa4ae9ed370db0) and ruminate (https://github.com/john-harrold/ruminate/commit/f7cb3a6f7d4aad4b1c1a3cfa214e357289e67a59).
On import, is there a way to confirm that a column imported with the expected class (numeric, character, date, etc.)? I don’t see it, and it brings in many small issue reports over time. As long as it is handled at some point (either in the visualize or the NCA module), it should be okay, but it could cause some headaches to go back and fix it. @billdenney