IDEMSInternational / R-Instat

A statistics software package powered by R
http://r-instat.org/
GNU General Public License v3.0

New scalar calculator dialog and existing calculator also to cope with filters #8953

Closed rdstern closed 5 months ago

rdstern commented 5 months ago

This issue has resulted from three queries I made to David. This is probably largely a @N-thony task, because it is mainly about adding sheet-level metadata, and probably @Patowhiz may check and advise. And (sorry Stephen), David keeps mentioning that this improvement links in some way with your work? (Maybe just that this would all be much easier if/when your current structural improvements are included.) The main development is that the metadata for a data sheet will also include scalars. I assume their presence will be clear from a Scalars element in the sheet metadata. It will be like the Label metadata in the Column Metadata, so a list with name = value, etc.

An exciting addition, which links with the simple explanation that a data sheet is a data frame with added metadata. Scalars are a simple example of the type of metadata that can be added. I assume @N-thony will probably do that part, while maybe @Fidel365 can make the changes needed to the calculator, for the additional menu item.

a) R-Instat data sheets are R data frames plus extra metadata. Could we also have named scalars as further metadata? For example, in the ordinary calculator we have a yield variable and we could produce mean(yield). That's 40.58, see below.

image

With the current calculator if we store the result, then it stores another column, with the mean in every row.

The scalar calculator will be a new copy of the calculator dialog. It will look and behave almost the same as the existing column calculator. But when storing, it would simply store the resulting scalars in the data-sheet metadata, rather than as columns in the data frame.
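The difference between the two storage targets can be sketched as follows. This is only an illustration in Python, not R-Instat code: the `sheet` structure, the `scalars` key, both helper functions, and the yield values are all hypothetical.

```python
from statistics import mean

# Hypothetical stand-in for a data sheet: columns plus sheet-level metadata.
# The yield values are made up for illustration.
sheet = {
    "data": {"yield": [38.0, 41.5, 42.0]},
    "metadata": {"scalars": {}},
}


def store_as_column(sheet, name, value):
    """Current calculator behaviour: repeat the value in every row of a new column."""
    n_rows = len(next(iter(sheet["data"].values())))
    sheet["data"][name] = [value] * n_rows


def store_as_scalar(sheet, name, value):
    """Proposed behaviour: keep one named value in the sheet metadata."""
    sheet["metadata"]["scalars"][name] = value


mean_yield = mean(sheet["data"]["yield"])
store_as_column(sheet, "mean_yield_col", mean_yield)  # one copy per row
store_as_scalar(sheet, "mean_yield", mean_yield)      # a single name = value entry
```

Storing as a scalar leaves the data frame untouched, which is the point of the proposal.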

This new dialog would go into the Prepare > Dataframe menu. I suggest it could be called Scalar Calculator and go just above Sort, in the same middle section of the menu.

One way to use the scalars could be to have a Scalars button on the above dialog. It could be between the Add and the Data Options buttons. If pressed, then the data selector gives the list of Scalars to use in the selector. Press again and it goes off and returns to the previous display. This would also be added to the ordinary calculator.

All the keyboards would still be available. Some keys produce columns, but the calculation would still work if it produced a summary of the column. An example would be the mean of a column of random numbers.

However, if a calculation produced anything other than a scalar, then both Try and OK would give an error. I presume we would trap the error and report it, either as an ordinary error or as a special message, particularly if the calculation might be fine when used with the ordinary calculator. So, if there is no R error, then perhaps check length(result). If it is not 1, then there is a problem?
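A minimal sketch of that check, written in Python for illustration only (the real check would presumably use length(result) in R). The function name `evaluate_scalar` and the message wording are assumptions, not existing R-Instat behaviour.

```python
def evaluate_scalar(calculation, columns):
    """Run a calculation and accept the result only if it is a single value.

    Returns (value, None) on success or (None, message) on failure.
    """
    try:
        result = calculation(columns)
    except Exception as err:  # the ordinary error case: trap it and report it
        return None, f"Calculation failed: {err}"

    is_sequence = isinstance(result, (list, tuple))
    length = len(result) if is_sequence else 1
    if length != 1:
        # The special message: the calculation ran, but did not give a scalar.
        return None, (
            f"Result has length {length}, not 1. "
            "It may work in the ordinary calculator."
        )
    return (result[0] if is_sequence else result), None


columns = {"yield": [38.0, 41.5, 42.0]}  # made-up values

ok_value, ok_message = evaluate_scalar(
    lambda c: sum(c["yield"]) / len(c["yield"]), columns
)
col_value, col_message = evaluate_scalar(
    lambda c: [v * 2 for v in c["yield"]], columns
)
bad_value, bad_message = evaluate_scalar(lambda c: c["missing"], columns)
```

Here the first call yields a scalar, the second is rejected because it returns a whole column, and the third reports the trapped error.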

The help ID is 691.

b) Could there be at least 2 scalars that are always available, and in the list of scalars? One is nrow and the other is ncol. They are the number of rows (e.g. 36 for survey) and the number of columns. If a filter or select is in operation, they are the number of available rows or columns.
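How nrow and ncol might follow an active filter or column selection can be sketched like this. It is an illustration in Python under assumed names (`builtin_scalars`, `row_filter`, `selected`) and made-up data, not the R-Instat implementation.

```python
def builtin_scalars(columns, row_filter=None, selected=None):
    """Return the always-available scalars for a sheet.

    row_filter: optional list of booleans, one per row (True = row is kept).
    selected:   optional set of column names in the current column selection.
    """
    n_total = len(next(iter(columns.values()))) if columns else 0
    if row_filter is None:
        nrow = n_total
    else:
        nrow = sum(1 for keep in row_filter if keep)
    if selected is None:
        ncol = len(columns)
    else:
        ncol = sum(1 for name in columns if name in selected)
    return {"nrow": nrow, "ncol": ncol}


columns = {
    "village": ["A", "A", "B", "B"],
    "variety": ["old", "new", "old", "new"],
    "yield": [30.0, 50.0, 40.0, 60.0],
}

unfiltered = builtin_scalars(columns)
filtered = builtin_scalars(columns, row_filter=[True, False, True, True])
selection = builtin_scalars(columns, selected={"village", "yield"})
```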

c) There would be no problem with the scalar operations acting on the filtered data, when a filter is in operation. I suggest it should do this.

d) Still on filters: currently the main calculator dialog ignores a filter. There should now be an option to calculate only on the filtered data. Where this produces a new variable, the non-filtered values will be NA. Where it adds to an existing variable, it leaves the old values and just changes those in the filter. David says this would be easy to do. It would be via a merge, so would need key variables in the data sheet.
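The merge idea can be sketched as below, in Python for illustration only. The function `calculate_on_filter`, the `id` key column, and the data are hypothetical, and `None` stands in for R's NA.

```python
def calculate_on_filter(rows, key, target, calculation, keep):
    """Apply a calculation to the filtered rows only, then merge back by key.

    rows: list of dicts, one per row of the data sheet.
    New variable: rows outside the filter get None (standing in for NA).
    Existing variable: rows outside the filter keep their old value.
    """
    # Calculate on the filtered rows, indexed by the key variable.
    computed = {row[key]: calculation(row) for row in rows if keep(row)}

    merged = []
    for row in rows:
        out = dict(row)
        if row[key] in computed:
            out[target] = computed[row[key]]
        else:
            out[target] = row.get(target)  # old value if present, else None
        merged.append(out)
    return merged


rows = [
    {"id": 1, "yield": 30.0},
    {"id": 2, "yield": 50.0},
    {"id": 3, "yield": 45.0},
]


def high_yield(row):
    return row["yield"] > 40


# New variable: NA (None) outside the filter.
with_new = calculate_on_filter(rows, "id", "double", lambda r: r["yield"] * 2, high_yield)
# Existing variable: only the filtered values change.
with_update = calculate_on_filter(rows, "id", "yield", lambda r: r["yield"] + 1, high_yield)
```

The merge relies on `id` identifying each row uniquely, which is why key variables would be needed in the data sheet.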

e) There isn't much to do on the dialog side, but there will be additions in the code. In addition to the "new" scalar calculator, the scalars should be added to the Prepare > R Objects dialogs. We need to be able to rename and delete them, like other metadata objects. This may involve @Patowhiz.

f) I hope there will be time for all this in the new version. But @rachelkg, I am already starting to write the help, partly because it simplifies our explanation of what a data sheet and a data book are.

g) Also partly for the help, Rachel: the current calculator is already included in the Data Reshape > General Summaries dialog. Now that is very general, but it includes a situation that is intermediate between the scalar and the column calculators.
  1. A column calculation might produce a single value. That's one extreme - and is the new scalar calculator.
  2. It might produce a column of the same length as the existing data - that's our existing calculator.
  3. It might produce a number of summaries, e.g. mean yield for each village - that's the general summaries dialog!
Neat eh?
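
The three kinds of result can be seen side by side in a small sketch. It is written in Python purely for illustration, with made-up yields and villages; nothing here is R-Instat code.

```python
from collections import defaultdict
from statistics import mean

# Made-up data: four plots in two villages.
yields = [30.0, 50.0, 40.0, 60.0]
villages = ["A", "A", "B", "B"]

# 1. A single value: the scalar calculator case.
overall_mean = mean(yields)

# 2. A column as long as the data: the existing calculator case.
centred = [y - overall_mean for y in yields]

# 3. One summary per group: the general summaries case.
by_village = defaultdict(list)
for village, y in zip(villages, yields):
    by_village[village].append(y)
village_means = {village: mean(ys) for village, ys in by_village.items()}
```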

Patowhiz commented 5 months ago

If the scalar output object is in a supported format, no change is required in the Prepare > R Objects dialogs.

rachelkg commented 5 months ago

Hello @rdstern, @volloholic, @Patowhiz and @N-thony,

I think it would be better to have just one version of the calculator in the prepare menu as its own item. We have presented this as a complex but powerful dialog, which you use if you are a bit more confident in R-Instat. Therefore it is in plain sight for those who are ready to do this and it includes these three possible outputs.

I would take away Column: from before Calculator... and add a line after it, so it is clearly its own item.

Re: "One way to use the scalars could be to have a Scalars button on the above dialog. It could be between the Add and the Data Options buttons. If pressed, then the data selector gives the list of Scalars to use in the selector. Press again and it goes off and returns to the previous display. This would also be added to the ordinary calculator." I think this sounds fine, though normally for different uses we have tabs across the top. You seem to have three uses for the calculator (see 1, 2, 3 below). Could they be tabs across the top, or is the dialog too complicated for that?

  1. A column calculation might produce a single value. That's one extreme - and is the new scalar calculator.
  2. It might produce a column of the same length as the existing data - that's our existing calculator.
  3. It might produce a number of summaries, e.g. mean yield for each village - that's the general summaries dialog!

rdstern commented 5 months ago

@rachelkg I am very happy with your adaptation. It makes the system much simpler.
a) We change the menu item from Column: Calculator into Calculator.
b) We include a line in the Prepare menu underneath this item, so it is by itself! (That's like the Data Reshape item lower down.)
c) In the dialog we add a Scalars button in the data selector control, as mentioned above.
d) In the calculator currently, the Try button already shows scalars when the result is a summary number. So does the Output window. So it is already 2/3 of a scalar calculator! We just need it to be able to store those results optionally as scalars. So I suggest an As Scalar checkbox exactly below the Position button. The default is unchecked, and it returns to the unchecked state when the dialog is re-opened. (It is one of those rare controls that doesn't remember.) Alternatively it could remember, because it is automatically disabled unless the calculation results in a scalar.
e) One possibility is to allow the checkbox to become enabled only after Try is used, which would make it easy to check that a scalar is being produced. When it is enabled, the Position button is disabled. I am liking this a lot!
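
The control states described in d) and e) can be sketched as a small state update. This is an illustration in Python only: the `CalculatorControls` class and `on_try` function are hypothetical names, and the real dialog would not be written this way.

```python
from dataclasses import dataclass


@dataclass
class CalculatorControls:
    """Hypothetical model of the relevant dialog controls."""
    as_scalar_enabled: bool = False  # disabled until Try shows a scalar
    as_scalar_checked: bool = False  # default is unchecked
    position_enabled: bool = True


def on_try(controls, result_length):
    """Update the controls after Try, given the length of the result."""
    controls.as_scalar_enabled = result_length == 1
    if not controls.as_scalar_enabled:
        # A disabled checkbox cannot stay checked.
        controls.as_scalar_checked = False
    # Per e): Position is disabled whenever As Scalar is enabled.
    controls.position_enabled = not controls.as_scalar_enabled
    return controls


after_scalar = on_try(CalculatorControls(), result_length=1)
after_column = on_try(CalculatorControls(as_scalar_checked=True), result_length=36)
```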