JuliaStats / RDatasets.jl

Julia package for loading many of the data sets available in R
GNU General Public License v3.0

Orange Dataset is missing #98

Open timothyslau opened 4 years ago

timothyslau commented 4 years ago
  1. When I run RDatasets.dataset("datasets", "Orange") I get: [image not preserved]

tlienart commented 3 years ago

Would you be interested in adding it? If you get the raw CSV from somewhere and a description, I can help you do the rest.

timothyslau commented 3 years ago

I'd be happy to. Code:

    write.csv(x = Orange, file = "Orange.csv", row.names = FALSE)  # export the R dataset
    zip(zipfile = "Orange.zip", files = "Orange.csv")              # zip the csv file
    ?Orange                                                        # print the documentation

Documentation:

Orange {datasets}    R Documentation

Growth of Orange Trees

Description

The Orange data frame has 35 rows and 3 columns of records of the growth of orange trees.

Usage

Orange

Format

An object of class c("nfnGroupedData", "nfGroupedData", "groupedData", "data.frame") containing the following columns:

Tree an ordered factor indicating the tree on which the measurement is made. The ordering is according to increasing maximum diameter.

age a numeric vector giving the age of the tree (days since 1968/12/31)

circumference a numeric vector of trunk circumferences (mm). This is probably “circumference at breast height”, a standard measurement in forestry.

Details

This dataset was originally part of package nlme, and that has methods (including for [, as.data.frame, plot and print) for its grouped-data classes.

Source

Draper, N. R. and Smith, H. (1998), Applied Regression Analysis (3rd ed), Wiley (exercise 24.N).

Pinheiro, J. C. and Bates, D. M. (2000) Mixed-effects Models in S and S-PLUS, Springer.

Examples

    require(stats); require(graphics)
    coplot(circumference ~ age | Tree, data = Orange, show.given = FALSE)
    fm1 <- nls(circumference ~ SSlogis(age, Asym, xmid, scal),
               data = Orange, subset = Tree == 3)
    plot(circumference ~ age, data = Orange, subset = Tree == 3,
         xlab = "Tree age (days since 1968/12/31)",
         ylab = "Tree circumference (mm)", las = 1,
         main = "Orange tree data and fitted model (Tree 3 only)")
    age <- seq(0, 1600, length.out = 101)
    lines(age, predict(fm1, list(age = age)))

[Package datasets version 4.0.2 Index]

See attachment. Orange.zip

tlienart commented 3 years ago

Great, thanks! Your PR should follow these steps:

  1. The file should be gzipped (.csv.gz extension) and placed at data/datasets/Orange.csv.gz (note that an RDA file is also fine if that's easier).
  2. In doc/datasets/, add an Orange.html following the template of the ones already in there; this is just a short description that can be copy-pasted from the original source.
  3. Add a line before this one to indicate the dataset and its dimensions: https://github.com/JuliaStats/RDatasets.jl/blob/5ad946e71cd62a209bb0a304df21e55e08b6c8de/doc/datasets.csv#L495
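
Steps 1 and 3 could be prepared with a small script along these lines. This is only a sketch in Python, not part of RDatasets.jl: the helper names gzip_csv and csv_dimensions are made up for illustration, and it assumes the Orange.csv exported from R above has a single header row. The exact column layout of the line in doc/datasets.csv is not spelled out in this thread, so it is best copied from the neighbouring rows.

```python
import csv
import gzip
import shutil
from pathlib import Path


def gzip_csv(src: Path, dest: Path) -> Path:
    """Compress the plain CSV at `src` into `dest` (hypothetical helper)."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    with open(src, "rb") as fin, gzip.open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest


def csv_dimensions(gz_path: Path) -> tuple[int, int]:
    """Return (rows, columns) of a gzipped CSV, excluding the header row."""
    with gzip.open(gz_path, "rt", newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = sum(1 for _ in reader)
    return rows, len(header)


# Example usage, assuming Orange.csv sits in the root of an RDatasets.jl checkout:
#   out = gzip_csv(Path("Orange.csv"), Path("data/datasets/Orange.csv.gz"))
#   print(csv_dimensions(out))
```

If the export matches the R documentation quoted earlier, the reported dimensions should be 35 rows and 3 columns, which are the numbers to record for step 3.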

thanks!