IDEMSInternational / R-Instat

A statistics software package powered by R
http://r-instat.org/
GNU General Public License v3.0

Help information on the R-Instat R packages #9008

Open rdstern opened 3 weeks ago

rdstern commented 3 weeks ago

@rachelkg in the current R-Instat help we have 3 titles for the special R-Instat functions. They are currently under R-Instat Code in the R Packages section of the help. They are as follows:

The Data Book
The Calculation System
Additional Code

Until recently these were just code inside R-Instat, not R packages, and there was no documentation on any of the functions. This has all changed! So they are like any of the other R packages, except they are not on CRAN. There are one or two other packages not on CRAN, though the main one was mmtable2 and it is no longer being used.

In discussions with @lilyclements these tie nicely with the 3 R packages being constructed on github. I don't like the name "Additional Code" and Lily called it something else? Anyway I report my understanding of the current situation. @lilyclements will update and correct.

Additional Code is already a github package. It is roughly 50 functions and is documented, also in github. Currently the documentation is on the individual functions, but it will be put into a single document in the next few days. It will be in bookdown, so we can link to the documentation on each function from the current help. This will be done next week, so it can be included in the help for the forthcoming release, which is planned for 24/6/2024.

The Calculation System is maybe 20 functions. It will be a separate package and be done similarly. Lily will specify and then get help from INNODEMS to complete this task. It is not expected for this release, but should be linkable for the next release, scheduled for mid/late July 2024.

The Data Book package is perhaps 50 functions. It will be a task in parallel with the Calculation system package.

I consider this to be a big step forward. That is partly for the help system, but particularly because it shows that R-Instat is a major and innovative project.

lilyclements commented 2 weeks ago

Thanks @rdstern - the R package is currently called standAloneFunctionsR but this can be changed.

The functions up to a certain point are in it (I think up to about six months ago), and the reference manual can be found online here (or can be downloaded from here too) - reference_manual (EDIT: Updated the link above on 12/06/2024 after discussions)

I would just like to thank both @chariga98 and @EVANSTATS for their hard work on this reference manual for the stand alone functions.

I will sort out updating to the most recent functions, it is on the radar!

rdstern commented 2 weeks ago

@lilyclements this looks great - just great. I am copying @rachelkg so we sort out where and how to link to the R-Instat Help and Moodle site.

If I understand correctly, we can edit the file on github. I then downloaded it as a pdf and all the links remain there. Perfect!

rdstern commented 2 weeks ago

@lilyclements of course the first thing I tried includes a function - I thought reasonably old - that is not in the list, so I wonder how many others? You may remember I asked about the slopegraph. Here is the code:

# Dialog: Line

survey <- data_book$get_data_frame(data_name="survey")
last_graph <- slopegraph(data=survey, x=village, y=yield, colour=variety) + theme_grey() + slopegraph_theme()
data_book$add_object(data_name="survey", object_name="last_graph", object_type_label="graph", object_format="image", object=check_graph(graph_object=last_graph))
data_book$get_object_data(data_name="survey", object_name="last_graph", as_file=TRUE)
rm(list=c("last_graph", "survey"))

Interestingly slopegraph_theme is there, but not slopegraph? Is that an odd exception or does this mean we may be missing others too?

In the existing help I had earlier found only 4 functions so far (those called in dialogs where we were adding help). They were: a) duplicated_cases b) duplicated_count_index c) graph_one_variable d) hashed_id

All are in your document, except the graph_one_variable. I wonder if they were manually detected or the "system" is missing the functions in a particular part of the code?
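A check for gaps like this could be sketched in R as below. It is only an illustration: `called_in_dialogs` and `documented` are hypothetical vectors filled in by hand from the names mentioned in this thread, not extracted from the real package or manual.

```r
# Hypothetical inputs (illustrative only): function names seen in dialog
# code, and function names that appear in the reference manual.
called_in_dialogs <- c("slopegraph", "slopegraph_theme", "duplicated_cases",
                       "duplicated_count_index", "graph_one_variable",
                       "hashed_id")
documented <- c("slopegraph_theme", "duplicated_cases",
                "duplicated_count_index", "hashed_id")

# Functions that are used in dialogs but have no entry in the manual
missing_docs <- setdiff(called_in_dialogs, documented)
print(missing_docs)
```

For an installed package, the documented side could probably be replaced by the package's exported names, for example via `getNamespaceExports()`, though that only shows what is exported, not what is documented.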

I am still very happy (amazed) with the progress made by you, the others and, I understand, chatGPT in constructing this guide.

I suggest some edits and @rachelkg has already linked to the document here from the help.

a) Title of the package: There will be three packages, maybe we can agree on all the titles. They are (at least in the Help) The Data Book, The Calculation System and this one.
I am reasonably happy with the first two except that R package names are always one word and don't begin with The. So how about: DataBook and CalculationSystem?

We can have 3 words in the Help, so might keep the words separate.

This one is currently called IDEMSstandAloneFunctionsR. I wonder about RInstatExtras, or even InstatExtras as the R could be obvious, or IDEMSExtras if you would like to mention IDEMS? @volloholic do you have a view? @lilyclements what do you think?
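The candidate names can be checked against the formal naming rule in "Writing R Extensions" (ASCII letters, digits and dots only, at least two characters, starting with a letter and not ending in a dot). A minimal sketch, where `is_valid_pkg_name` is a hypothetical helper written for this illustration and covers only that formal rule, not style preferences:

```r
# Hypothetical helper: TRUE when a name meets the formal package-name rule
# (letters, digits and dots only; starts with a letter; does not end in a
# dot; at least two characters).
is_valid_pkg_name <- function(name) {
  grepl("^[A-Za-z][A-Za-z0-9.]*[A-Za-z0-9]$", name)
}

candidates <- c("DataBook", "CalculationSystem", "InstatExtras",
                "RInstatExtras", "IDEMSExtras", "The Data Book")

# Named logical vector: one TRUE/FALSE per candidate name
valid <- vapply(candidates, is_valid_pkg_name, logical(1))
print(valid)
```

The one-word candidates pass, while the three-word help title fails because of the spaces.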

Then I note the automatic help has left the Title (one line) and the Description (one paragraph) for us to do. I'll have a go at those here later today. @rachelkg I usually include those in the R-Instat Help. @lilyclements are you happy to include them in the github documentation, or would you prefer that we do it?

lilyclements commented 2 weeks ago

of course the first thing I tried includes a function - I thought reasonably old - that is not in the list, so I wonder how many others?

a) @rdstern interesting. Can you send a list of the ones not on there? I can add them in - should take only about five minutes for each one to add to be honest!

RDS: The 2 for now would be slopegraph and graph_one_variable

b) To the names of the packages, I'm not sure if the databook one and calculation system one need to be in the same package -- I'll double check and get back to you.

RDS: I am assuming there are 2 separate R packages, namely DataBook, or Databook, and CalculationSystem. So there will be three packages overall, those two plus InstatExtras, or RInstatExtras

c) I like InstatExtras (or RInstatExtras) as the name for the package. I much prefer that to the current name.

RDS: Goodie: Let's rename it then - you in the package and me in the help.

d) Adding the Title and Description is very straightforward. I can do that from my side.

Points a, c, and d are all very straightforward to do and can be done by Friday.

rdstern commented 2 weeks ago

@lilyclements thanks. My responses are inserted above.

lilyclements commented 2 weeks ago

Great. graph_one_variable is from the data_book so is not in the stand alone file. We'll leave this one for now until we work on the data_book package if that's ok?

RDS: Oh that's fine of course.

I've added in slopegraph, a Title and Description, and renamed the package

I've also updated the package so we now should have them all up to wandlplot (the most recent one in the file on my branch).

The new documentation is available here

RDS: @lilyclements that's great. Many thanks. This is a great new addition to R-Instat.

@rachelkg this is the version to link to.

lilyclements commented 2 weeks ago

@rdstern I started having a go at the databook. I've done about half of the functions - which is approximately 130!

Here is the manual for it so far

rdstern commented 2 weeks ago

@lilyclements that's great. You say 130 functions. Is that in total, or the half so far???

lilyclements commented 2 weeks ago

@rdstern approx 130 is the number so far! A lot aren’t used in dialogs but are more for internal usage. But this is definitely more than I expected!

rdstern commented 2 weeks ago

@lilyclements I saw quite a number of functions in the Extras that looked as though they were for internal use. I often also see that aspect mentioned in R reference guides. Is that done automatically, or could it be something you could ask to be noted? It should be quite easy to note that some functions are called by others, while some tend to do the calling instead?
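One rough way to flag this automatically might look like the sketch below. Everything in it is an assumption for illustration: the three toy functions in `fns` are hypothetical stand-ins for a package's contents, and matching names in function bodies is only approximate (a variable that shares a name with a function would also be counted).

```r
# Toy functions standing in for a package's contents (all hypothetical).
fns <- list(
  helper_clean  = function(x) x[!is.na(x)],
  summary_mean  = function(x) mean(helper_clean(x)),
  summary_range = function(x) range(helper_clean(x))
)

# For each function, find which of the OTHER functions its body mentions.
calls <- lapply(names(fns), function(nm) {
  mentioned <- intersect(all.names(body(fns[[nm]])), names(fns))
  setdiff(mentioned, nm)  # ignore a function mentioning itself
})
names(calls) <- names(fns)

# A function counts as "called by others" if any other function mentions it.
called_by_others <- names(fns) %in% unlist(calls)
names(called_by_others) <- names(fns)
print(called_by_others)
```

Here only `helper_clean` is flagged as called by others, so a sentence such as "mainly for internal use" could be added to its description, while the two `summary_` functions would be treated as callers.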

lilyclements commented 2 weeks ago

@rdstern yes I see your point. I’m pretty certain that can be done very straightforwardly. Do you know which functions are used only internally?

Perhaps I’ll list them all on a spreadsheet and you can give an “x” to the ones which you want to see the help files for? (Or however you’d like to do it). I know you asked for “graph_one_variable” the other day so perhaps you could give an indicator when you find one you do want to include in the help document?

There are other ways I can sort this out too if that isn’t a good use of your time!

rdstern commented 2 weeks ago

@lilyclements no I want them all, internal or not! I was just hoping chatGPT could detect that feature and include a sentence in the description!

lilyclements commented 2 weeks ago

@rdstern I can try - I guess it's not clear which are called in other dialogs. They might be called both here and in other dialogs (I'm not sure, perhaps the keys related ones, for example?)

lilyclements commented 2 weeks ago

@rdstern I have updated the manual to include a further set of functions. There are just shy of 200 now!

We've also added two other functions which aren't part of the data_sheet set, but are called in the data sheet.

I need to add in those from the data_book next!

rdstern commented 2 weeks ago

And I'm pretty excited by the latest addition to the data sheet functions, by @N-thony, so we have scalars as metadata as well as filters and selects. With the data book, I guess we'll approach 400!