danStich / worst-r

Textbook for BIOL 217 (Quantitative Biology) and BIOL 678 (Advanced Quantitative Biology) courses
https://danstich.github.io/worst-r/

13.2 - data(crabs) loads a different data file #2

Open bpsut opened 3 days ago

bpsut commented 3 days ago

At the top of chapter 13.2 you mention we will be using the crab data set, but don't show the code to load it. Trying data(crab) returns Error: object 'crab' not found. Maybe a typo? data(crabs) works, but head(crabs) returns:

  sp sex index   FL  RW   CL   CW  BD
1  B   M     1  8.1 6.7 16.1 19.0 7.0
2  B   M     2  8.8 7.7 18.1 20.8 7.4
3  B   M     3  9.2 7.8 19.0 22.4 7.7
4  B   M     4  9.6 7.9 20.1 23.1 8.2
5  B   M     5  9.8 8.0 20.3 23.0 8.2
6  B   M     6 10.8 9.0 23.0 26.5 9.8

That does not match your results, so I did a little more digging. It appears there are now two crabs data sets, and at least with my configuration, the one from MASS (https://www.rdocumentation.org/packages/MASS/versions/7.3-61/topics/crabs) gets priority.
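For anyone wanting to check this on their own machine, here is a rough sketch using only base R. It lists every installed package that ships a data set named crabs, then loads one copy explicitly by package name. MASS is used only because it is the package linked above; which package the book's crabs originally came from is not stated in this thread.

```r
# List every data set named "crabs" across all installed packages.
# suppressWarnings() hides warnings data() may emit for some packages.
available <- suppressWarnings(
  data(package = .packages(all.available = TRUE))$results
)
hits <- available[available[, "Item"] == "crabs",
                  c("Package", "Item", "Title"), drop = FALSE]
print(hits)

# Load one specific copy by naming its package, into a separate
# environment so it cannot overwrite another object called crabs.
if (requireNamespace("MASS", quietly = TRUE)) {
  mass_env <- new.env()
  data("crabs", package = "MASS", envir = mass_env)
  print(names(mass_env$crabs))
}
```

Naming the package in data() removes the dependence on which packages happen to be attached, and in what order.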

Good news: I found a workaround for now. The "evidence" package looks like it has similar data:

install.packages("evidence")
library(evidence)
data(HSCrab)
head(HSCrab)
  Col spineW Width Satell Weight
1   3      3  28.3      8   3050
2   4      3  22.5      0   1550
3   2      1  26.0      9   2300
4   4      3  24.8      0   2100
5   4      3  26.0      4   2600
6   3      3  23.8      0   2100

Given that 3050 = 1000*3.05, I'm guessing this data set is in grams and yours was in kg. This also brings up the annoyance of weight vs mass, but beggars and choosers and all that.

The closest I can currently get is with:

library(evidence)
library(dplyr)  # provides %>% and rename()

data(HSCrab)

# Convert weight from grams to kilograms
HSCrab$mass <- HSCrab$Weight / 1000

# Select and reorder the columns, then rename them
crabs <- HSCrab[c("Col", "spineW", "Width", "mass", "Satell")] %>%
  rename(
    color = Col,
    spine = spineW,
    width = Width,
    satellites = Satell
  )
head(crabs)
  color spine width mass satellites
1     3     3  28.3 3.05          8
2     4     3  22.5 1.55          0
3     2     1  26.0 2.30          9
4     4     3  24.8 2.10          0
5     4     3  26.0 2.60          4
6     3     3  23.8 2.10          0

str(crabs)

'data.frame':   173 obs. of  5 variables:
 $ color     : int  3 4 2 4 4 3 2 4 3 4 ...
 $ spine     : int  3 3 1 3 3 3 1 2 1 3 ...
 $ width     : num  28.3 22.5 26 24.8 26 23.8 26.5 24.7 23.7 25.6 ...
 $ mass      : num  3.05 1.55 2.3 2.1 2.6 2.1 2.35 1.9 1.95 2.15 ...
 $ satellites: int  8 0 9 0 4 0 0 0 0 0 ...

From what I can tell, the order of observations is not the same as in the book, but hopefully the values themselves are?
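One way to test that hunch, once a copy of the original data is available, is to sort both data frames by every column and compare them. The sketch below is only an illustration: same_rows_any_order is a made-up helper name, and the two small data frames are toy stand-ins rather than the real data sets.

```r
# Hypothetical helper: TRUE if two data frames hold the same rows,
# ignoring row order and column order.
same_rows_any_order <- function(a, b) {
  if (!identical(sort(names(a)), sort(names(b))) || nrow(a) != nrow(b)) {
    return(FALSE)
  }
  b <- b[names(a)]  # put columns in the same order
  sort_rows <- function(df) {
    # Order by all columns in turn, then drop the old row names
    df <- df[do.call(order, unname(as.list(df))), , drop = FALSE]
    rownames(df) <- NULL
    df
  }
  isTRUE(all.equal(sort_rows(a), sort_rows(b)))
}

# Toy stand-ins: b is a with its rows shuffled, d has one value changed
a <- data.frame(width = c(28.3, 22.5, 26.0), mass = c(3.05, 1.55, 2.30))
b <- a[c(3, 1, 2), ]
d <- b
d$mass[1] <- 9.99

same_rows_any_order(a, b)
same_rows_any_order(a, d)
```

Because all.equal() compares numbers with a small tolerance, a mass column computed as Weight/1000 should still match one stored directly in kg.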

danStich commented 3 days ago

Yes, this one is a relatively new annoyance. For now I've started just using a CSV with the original data in class until I can swap it out.
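A minimal sketch of the CSV route, for readers following along. The file here is a hypothetical stand-in built from three rows shown earlier in this thread and written to a temporary path; the actual class file and its name are not given.

```r
# Hypothetical stand-in for the class CSV: three rows of the reshaped
# data shown earlier in this thread, written to a temporary file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "color,spine,width,mass,satellites",
  "3,3,28.3,3.05,8",
  "4,3,22.5,1.55,0",
  "2,1,26.0,2.30,9"
), csv_path)

# Reading from a file avoids data() entirely, so it does not matter
# which packages define a data set called crabs.
crabs <- read.csv(csv_path)
str(crabs)
```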

Dan


Daniel S. Stich (he/him)
Associate Professor, Biology Department and Secretary-Treasurer, NY Chapter American Fisheries Society
113A Perna Science, SUNY Oneonta, NY 13820
Office: 607-436-3734 Cell: 518-860-4107 Email: @.***
Website: https://danstich.github.io/stich/index.html


From: Brad Sutliff
Sent: Tuesday, November 26, 2024 12:48 PM
To: danStich/worst-r
Cc: Subscribed
Subject: [danStich/worst-r] 13.2 - data(crabs) loads a different data file (Issue #2)
