Open rdstern opened 2 years ago
I suggest we are close to a "quick win" on the use of the data selector when there are many variables. In a way it is there already: when we have a lot of variables we can use Data Options, within the data frame, and define a select there, e.g. all variables that start with "ab". Then, on returning to the dialogue, only those variables are shown.
So that is working now! It will improve as we add further options to the select, e.g. all factors, etc. So this needs nothing extra; these options are all needed for the usual use of select anyway.
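To make the idea concrete, here is a minimal sketch of a select as a named condition on columns. It is not how R-Instat implements selects; the `Select` class, the `Columns` mapping and the example column names are all hypothetical, and the data frame is reduced to a mapping from column name to column type.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical model: a data frame reduced to {column name: column type}.
Columns = dict[str, str]


@dataclass
class Select:
    """A named column selection (illustrative only, not the R-Instat API)."""
    name: str
    condition: Callable[[str, str], bool]  # (column name, column type) -> keep?

    def columns(self, data: Columns) -> list[str]:
        # Keep the data frame's own column order.
        return [col for col, kind in data.items() if self.condition(col, kind)]


data = {"ab_rain": "numeric", "ab_site": "factor", "tmax": "numeric", "abc": "character"}

starts_with_ab = Select("starts_ab", lambda col, kind: col.startswith("ab"))
all_factors = Select("factors", lambda col, kind: kind == "factor")

print(starts_with_ab.columns(data))  # ['ab_rain', 'ab_site', 'abc']
print(all_factors.columns(data))     # ['ab_site']
```

With a select in operation, a dialogue's selector would list only what `columns()` returns, which is the behaviour described above.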
Can we do more?
a) When you use Data Options it opens the general dialogue, which is for both filters and selects. Consider adding 2 options to the right-click menu. One is Use Select and the other is Define Select. Then Use Select takes you to the Select option in the Data Options dialogue, and Define Select takes you straight to the sub-dialogue to define a new select. Maybe also a third, Remove Select, which is only possible when a select is in operation. The others can still be used to change the select.
b) Consider implementing the option that is available, but not yet implemented for filter, namely "for this dialogue only".
@shadrackkibet what do you think? Pretty neat way to make considerable progress on the data selector, eh?
Here is my latest attempt - it is a minor modification of the current dialogue:
The group box is called Apply.
The first radio button is called Data Frame
and the second remains As Subset.
Under Data Frame there are 3 checkboxes:
[ ] Data [ ] Selector [ ] Metadata.
The first 2 are checked by default - which is what happens now. The third is unchecked. But I am really keen that we have the option to examine just some of the variable metadata if we wish. Remember there could be 7000 variables.
I also propose that the Prepare > Data Frame > Hide/Show columns dialogue is now redundant and so should be deleted.
@N-thony we are now using selects to start doing loops.
The options above are all (I think) implemented, except we can't see what the formula is - which we can for filters.
More important is a new discovery. There are static selects, where we have fixed the variables in a select. That's when we choose the variables. Other selects are "dynamic"; for example "all numeric variables" might be a select. The interesting issue with a dynamic select is that after doing a loop the variables may change, because the loop might generate more numeric variables. In some circumstances that's fine, but at other times it may not be what's needed.
So I would like an option to be able to make a dynamic select into a static one! I can see 2 obvious ways of doing this. a) In the main select dialog there could be a checkbox that becomes enabled when a dynamic select is chosen. If checked, then the variables in the select are used, rather than the function. b) When choosing the variables in the subdialog there could be an option to choose a select instead:
Here we add an additional option to select by Selects
If that is chosen then the data selector (on the left) changes to show the existing selects. You choose one and if that Condition is added it simply adds the columns in that select. I am liking this option!
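The static/dynamic distinction, and what converting one to the other would mean, can be sketched as follows. This is only an illustration under assumptions: `numeric_columns` and `freeze` are hypothetical names, and the data frame is again reduced to a mapping from column name to column type.

```python
from typing import Callable

Columns = dict[str, str]  # hypothetical: column name -> column type
SelectFn = Callable[[Columns], list[str]]


def numeric_columns(data: Columns) -> list[str]:
    """Dynamic select: re-evaluated against the data frame on every use."""
    return [col for col, kind in data.items() if kind == "numeric"]


def freeze(select: SelectFn, data: Columns) -> SelectFn:
    """Make a static select by capturing the columns the dynamic one gives now."""
    fixed = select(data)
    # Later calls ignore the rule and return only the captured columns
    # (dropping any that have since been deleted from the data frame).
    return lambda current: [col for col in fixed if col in current]


data = {"rain": "numeric", "site": "factor", "tmax": "numeric"}
static_numeric = freeze(numeric_columns, data)

# A loop over the select creates one new numeric column per existing one.
for col in numeric_columns(data):
    data[col + "_log"] = "numeric"

dynamic_after = numeric_columns(data)  # now includes the new *_log columns
static_after = static_numeric(data)    # still only the original columns

print(dynamic_after)  # ['rain', 'tmax', 'rain_log', 'tmax_log']
print(static_after)   # ['rain', 'tmax']
```

Either proposal, the checkbox in the main dialog or choosing by Selects in the sub-dialog, amounts to the `freeze` step: the column names are stored instead of the function.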
@N-thony is that for you to do quickly or would you prefer to allocate it? If so perhaps to Derek?
In the latest pull request, once a select is defined it is immediately implemented. So, a) the grid only shows those variables b) the selector only allows access to those variables c) The column metadata states which variables are hidden, etc. I would like a richer set of options.
For example, when defining a filter we have the option of making filters from a factor. That makes a set of filters, but doesn't automatically implement them.
I suggest there are situations where we may want to make a set of selections for use later, but not necessarily implement them in the way above.
So, the current Apply label may become Apply Options, as it is for the Filter? Then:
a) The first (default) option is called Apply, and that's what is done now.
b) Data Frame - would be like the hide we used to have. Maybe we still have? It just hides in the data frame (and is shown in the column metadata). Maybe it could be called Hide?
c) Selector - hides in the selector, but the data frame still shows everything.
d) Store - just stores the select, ready for later use.
e) Slightly separate, but perhaps useful for completeness, is Subset - as we have for Filter. (I assume we could have both? So I could have a subset of some variables from filtered data?) Likewise, we could examine the Filter again to make sure it can make a subset of just the unhidden variables.
I now realise this isn't quite enough. I can think of situations where it would be useful to hide the hidden variables in the column metadata - even if this isn't implemented immediately.
So I suggest the 5 buttons be on 3 rows in the group box: a) and b) are on the first row, c) and d) are on the second row, and e) is on the third row.
And on the first row there is also a checkbox, default unchecked. The label is Apply to Column Metadata.
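One possible reading of these options, written out as a small sketch. The names `ApplyOption` and `visible_columns` are hypothetical, Subset is left out because it creates a new data frame rather than changing a view, and it is an assumption here that the checkbox only affects the two first-row options (Apply and Hide).

```python
from enum import Enum


class ApplyOption(Enum):
    APPLY = "Apply"        # hide in both the grid and the selector (current behaviour)
    DATA_FRAME = "Hide"    # hide in the data frame grid only
    SELECTOR = "Selector"  # hide in the selector only
    STORE = "Store"        # save the select for later, change nothing now


def visible_columns(all_columns, selected, option, apply_to_metadata=False):
    """Return the columns shown in each view for one option (illustrative only)."""
    everything = list(all_columns)
    chosen = [col for col in everything if col in selected]
    in_grid = option in (ApplyOption.APPLY, ApplyOption.DATA_FRAME)
    in_selector = option in (ApplyOption.APPLY, ApplyOption.SELECTOR)
    # Assumption: the checkbox sits on the first row, so it only applies
    # together with Apply or Hide.
    in_metadata = apply_to_metadata and in_grid
    return {
        "grid": chosen if in_grid else everything,
        "selector": chosen if in_selector else everything,
        "metadata": chosen if in_metadata else everything,
    }


columns = ["rain", "site", "tmax"]
print(visible_columns(columns, {"rain", "tmax"}, ApplyOption.SELECTOR))
print(visible_columns(columns, {"rain", "tmax"}, ApplyOption.APPLY, apply_to_metadata=True))
```

Laying the options out this way also shows that Store is simply the case where no view changes.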