IDEMSInternational / R-Instat

A statistics software package powered by R
http://r-instat.org/
GNU General Public License v3.0
38 stars 103 forks

Including a summary on the original data frame #3929

Open rdstern opened 7 years ago

rdstern commented 7 years ago

Currently our summaries are stored in a new data frame. In our work at EUMETSAT we then needed to add them back to the original data frame, so they could be used for further analyses. The merge worked fine, but it is non-trivial, and these data frames are already linked.
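For readers unfamiliar with the step being described, here is a minimal sketch of "summarise, then merge back" in plain Python. It is not R-Instat code: the `station` and `rain` columns, the sample values, and the `mean_rain` name are all made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical original data: one row per observation.
rows = [
    {"station": "A", "rain": 2.0},
    {"station": "A", "rain": 4.0},
    {"station": "B", "rain": 1.0},
]

# Step 1: build the summary, one value per station
# (this corresponds to the separate summary data frame).
groups = defaultdict(list)
for row in rows:
    groups[row["station"]].append(row["rain"])
summary = {station: mean(values) for station, values in groups.items()}

# Step 2: merge the summary back onto the original rows by the
# shared key, so every observation carries its group's summary.
merged = [{**row, "mean_rain": summary[row["station"]]} for row in rows]
```

Each row of `merged` keeps its original columns and gains the group mean, which is what makes the summary available for further row-level analysis.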

Danny has wondered whether this is already a feature of the General Summaries dialogue. If that is the route, then fine, but I need some instructions, which could be combined with the help for this dialogue, if that is its final form.

Otherwise (or perhaps as well) it could be another checkbox, "Store Results in Original Data Frame", which is unchecked by default. The other checkbox (which comes first, and is still checked by default) could then be just "Store Results".

dannyparsons commented 7 years ago

If I check "Store Results in Original Data Frame" do I still get the second data frame as well? Should that be another option? In the long run we hope our "system" means using data at another level is as easy as using the original data itself. It can be done through a merge, but this produces a third data frame. It's not something we've built in as part of our general calculation system at the moment, but I think it would be fairly easy as an additional step when a summary is calculated.

rdstern commented 7 years ago

You wrote above: If I check "Store Results in Original Data Frame" do I still get the second data frame as well? Should that be another option? Currently in Prepare > Column: Reshape > Column Summaries the first checkbox is "Store Results in Data" and it is checked by default. I was suggesting above that this should remain and be shortened to just "Store Results". That remains the default, i.e. the summary data are stored in a summary data frame. However, with the change you could also "Store Results in Original Data Frame", and (of course) if you uncheck the "Store Results" checkbox, then the results would only be in the original. But I assume that usually you would then have both. I realise that with our linking, this wouldn't (in the end) be needed so much. But I suggest it will be used quite a bit, as it is in the Stata procurement data sets, because other packages don't have the facility.
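The way the two checkboxes would combine can be sketched as below. This is only an illustration in plain Python, not the R-Instat implementation: the function `column_summary` and the flags `store_results` and `store_in_original` are hypothetical names standing in for the two checkboxes, and the mean is just one example summary.

```python
from statistics import mean

def column_summary(rows, by, column, store_results=True, store_in_original=False):
    """Return (data, summary) according to the two hypothetical flags."""
    # Group the values of `column` by the key column `by`.
    groups = {}
    for row in rows:
        groups.setdefault(row[by], []).append(row[column])
    means = {key: mean(values) for key, values in groups.items()}
    name = f"mean_{column}"

    # "Store Results": keep a separate summary table, one row per group.
    summary = [{by: key, name: value} for key, value in means.items()] if store_results else None

    # "Store Results in Original Data Frame": merge the summary onto each row.
    data = [{**row, name: means[row[by]]} for row in rows] if store_in_original else rows
    return data, summary

# Made-up example data.
rows = [
    {"station": "A", "rain": 2.0},
    {"station": "A", "rain": 4.0},
    {"station": "B", "rain": 1.0},
]
data, summary = column_summary(rows, "station", "rain", store_in_original=True)
```

With both flags set the caller gets the summary table and the merged rows; with only `store_in_original` set, `summary` is `None` and the results exist only on the original rows.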

dannyparsons commented 7 years ago

Ok that sounds sensible. What happens in other software that only copes with one sheet when you do the equivalent of our column summaries? Is it automatically merged back to the original or does the summary data frame then replace the original data? How do you go between the original and summary?

rdstern commented 7 years ago

I haven't checked Stata yet. With Minitab and Instat the summaries can come on the same sheet, because columns can be of different lengths. You made me look again at Genstat: [image] Notice the checkbox at the bottom right, which says "Merge into the Original Sheet". As usual they are pretty sensible! Perhaps we should say "Merge into the Original Data Frame". That would tell users gently that it is a merge!