vincentarelbundock / Rdatasets

A collection of datasets originally distributed in R packages
https://vincentarelbundock.github.io/Rdatasets

Snow.polygons missing from HistData #28

Closed jmcastagnetto closed 1 year ago

jmcastagnetto commented 1 year ago

First of all, thanks for this valuable resource. It makes it easy to point students to datasets they can use to learn and experiment.

I was looking at the Snow.* datasets from HistData, and noticed that the Snow.polygons data is missing from the list.

vincentarelbundock commented 1 year ago

Glad you like it. Rdatasets only includes standard rectangular data frames in CSV format.

Is that a weird map object?

jmcastagnetto commented 1 year ago

Actually, it is a list of 13 data frames, so it could be treated as a group of CSV files, each with x and y columns:

> library(HistData)
> str(Snow.polygons)
List of 13
 $ 1 :'data.frame': 5 obs. of  2 variables:
  ..$ x: num [1:5] 3.39 10.3 9.68 3.39 3.39
  ..$ y: num [1:5] 16.3 16.4 18.7 18.7 16.3
 $ 2 :'data.frame': 5 obs. of  2 variables:
  ..$ x: num [1:5] 10.3 11.19 12.54 9.68 10.3
  ..$ y: num [1:5] 16.4 15.9 18.7 18.7 16.4
 $ 3 :'data.frame': 6 obs. of  2 variables:
  ..$ x: num [1:6] 11.2 11.8 15.1 13.8 12.5 ...
  ..$ y: num [1:6] 15.9 14.7 14.3 18.7 18.7 ...
 $ 4 :'data.frame': 7 obs. of  2 variables:
  ..$ x: num [1:7] 15.1 16.6 16.8 19.9 19.9 ...
  ..$ y: num [1:7] 14.3 13.7 13.7 15.3 18.7 ...
 $ 5 :'data.frame': 7 obs. of  2 variables:
  ..$ x: num [1:7] 3.39 11.21 11.79 11.19 10.3 ...
  ..$ y: num [1:7] 13.4 14 14.7 15.9 16.4 ...
 $ 6 :'data.frame': 6 obs. of  2 variables:
  ..$ x: num [1:6] 3.39 6.17 10.16 11.21 3.39 ...
  ..$ y: num [1:6] 8.83 8.88 10.23 14.02 13.4 ...
 $ 7 :'data.frame': 8 obs. of  2 variables:
  ..$ x: num [1:8] 10.2 11.8 14.3 16.6 15.1 ...
  ..$ y: num [1:8] 10.23 9.52 10.16 13.69 14.27 ...
 $ 8 :'data.frame': 6 obs. of  2 variables:
  ..$ x: num [1:6] 12.5 12.65 11.75 10.16 6.17 ...
  ..$ y: num [1:6] 4.36 4.7 9.52 10.23 8.88 ...
 $ 9 :'data.frame': 5 obs. of  2 variables:
  ..$ x: num [1:5] 12.6 15.6 14.3 11.8 12.6
  ..$ y: num [1:5] 4.7 7.21 10.16 9.52 4.7
 $ 10:'data.frame': 6 obs. of  2 variables:
  ..$ x: num [1:6] 15.6 18.2 16.8 16.6 14.3 ...
  ..$ y: num [1:6] 7.21 6.95 13.75 13.69 10.16 ...
 $ 11:'data.frame': 5 obs. of  2 variables:
  ..$ x: num [1:5] 18.2 19.9 19.9 16.8 18.2
  ..$ y: num [1:5] 6.95 5.87 15.28 13.75 6.95
 $ 12:'data.frame': 8 obs. of  2 variables:
  ..$ x: num [1:8] 12.6 12.5 12.5 19.9 19.9 ...
  ..$ y: num [1:8] 4.7 4.36 3.23 3.23 5.87 ...
 $ 13:'data.frame': 6 obs. of  2 variables:
  ..$ x: num [1:6] 3.39 12.49 12.5 6.17 3.39 ...
  ..$ y: num [1:6] 3.23 3.23 4.36 8.88 8.83 ...

If it were included in your list, the 13 data frames could be collapsed into a single data frame using:

> library(dplyr)
> library(HistData)
> snow_polygons <- bind_rows(Snow.polygons, .id = "polygon")
> head(snow_polygons, 10)
        polygon         x        y
221...1       1  3.390000 16.32146
22...2        1 10.296353 16.42222
19...3        1  9.678367 18.72500
4...4         1  3.390000 18.72500
2211          1  3.390000 16.32146
191           2 10.296353 16.42222
18...7        2 11.194356 15.85294
181...8       2 12.542312 18.72500
19...9        2  9.678367 18.72500
1911          2 10.296353 16.42222
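
For anyone who would rather avoid the dplyr dependency, the same collapse can be sketched in base R. This is only an illustration: `polys` is a stand-in built from the rounded values that `str()` prints above for the first two polygons, not the real `Snow.polygons` object, and the names `tagged` and `flat` are made up for the example.

```r
# Stand-in for HistData::Snow.polygons: first two polygons only,
# using the rounded coordinates shown in the str() output.
polys <- list(
  "1" = data.frame(x = c(3.39, 10.3, 9.68, 3.39, 3.39),
                   y = c(16.3, 16.4, 18.7, 18.7, 16.3)),
  "2" = data.frame(x = c(10.3, 11.19, 12.54, 9.68, 10.3),
                   y = c(16.4, 15.9, 18.7, 18.7, 16.4))
)

# Prepend the list name to each data frame as a "polygon" column.
tagged <- Map(
  function(df, id) data.frame(polygon = id, df, stringsAsFactors = FALSE),
  polys, names(polys)
)

# Stack the pieces and drop the row names that rbind() generates.
flat <- do.call(rbind, tagged)
rownames(flat) <- NULL

print(flat)
```

With the real data, replacing `polys` by `Snow.polygons` should give the same three columns as the `bind_rows()` call, with plain integer row names.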

Cheers

vincentarelbundock commented 1 year ago

Ah, thanks for the info. That makes sense.

The scraping script does not support non-standard structures like this. I may accept a Pull Request if someone wants to improve the code, but I am very unlikely to implement this myself.
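
As a starting point for such a Pull Request, here is a rough sketch of how an export step might recognise a list of data frames and write it out as the "group of CSV files" described earlier in the thread. It is not taken from the actual Rdatasets scraping script: `is_df_list()`, `export_csv()`, the file naming scheme, and the `polys` stand-in data are all hypothetical.

```r
# Hypothetical helpers; not part of the actual Rdatasets scraping script.

# TRUE for a non-empty plain list whose elements are all data frames.
is_df_list <- function(obj) {
  is.list(obj) && !is.data.frame(obj) && length(obj) > 0 &&
    all(vapply(obj, is.data.frame, logical(1)))
}

# Write a data frame to one CSV, or a list of data frames to one CSV
# per element (<name>_<element>.csv). Returns the paths written.
export_csv <- function(obj, name, dir = tempdir()) {
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)
  if (is.data.frame(obj)) {
    paths <- file.path(dir, paste0(name, ".csv"))
    write.csv(obj, paths, row.names = FALSE)
  } else if (is_df_list(obj)) {
    ids <- names(obj)
    if (is.null(ids)) ids <- as.character(seq_along(obj))
    paths <- file.path(dir, paste0(name, "_", ids, ".csv"))
    for (i in seq_along(obj)) {
      write.csv(obj[[i]], paths[i], row.names = FALSE)
    }
  } else {
    stop("unsupported structure: ", class(obj)[1])
  }
  paths
}

# Stand-in data: two small polygons with rounded coordinates.
polys <- list(
  "1" = data.frame(x = c(3.39, 10.3, 9.68, 3.39, 3.39),
                   y = c(16.3, 16.4, 18.7, 18.7, 16.3)),
  "2" = data.frame(x = c(10.3, 11.19, 12.54, 9.68, 10.3),
                   y = c(16.4, 15.9, 18.7, 18.7, 16.4))
)

paths <- export_csv(polys, "Snow.polygons", dir = tempfile("snow_"))
print(basename(paths))
```

Writing one file per element keeps each CSV a plain rectangular table. Collapsing everything into a single file with an identifier column is the other obvious option, at the cost of requiring every element to share the same columns.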

Sorry.