dlab-berkeley / R-Data-Visualization-Legacy

D-Lab's 3 hour introduction to data visualization with R. Learn how to create histograms, bar plots, box plots, scatter plots, compound figures, and more using ggplot2 and cowplot.

column name change #30

Closed meiqingli closed 2 years ago

meiqingli commented 2 years ago

In `R-Data-Visualization-Solutions.Rmd`, the 'age' column name of heart.csv might need to be updated. Otherwise, running the code produces errors.

asteves commented 2 years ago

Can you say a bit more about what the error is?

meiqingli commented 2 years ago

> Can you say a bit more about what the error is?

When I ran the code, it returned an error saying 'age' was not found. I suspect this is because the column is named 'i..age' in the data frame instead of 'age'.
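A first column named something like `i..age` is a typical symptom of a UTF-8 byte-order mark (BOM) at the start of a CSV file being read in a non-UTF-8 locale, which mostly happens on Windows. Nothing in this thread confirms that heart.csv contains a BOM, so the following is only a sketch under that assumption. It uses a toy three-column file in place of the real data and shows that declaring the encoding with `fileEncoding = "UTF-8-BOM"` gives clean names regardless of platform.

```r
# Sketch only: a toy CSV with a UTF-8 BOM stands in for data/heart.csv (assumption).
path <- tempfile(fileext = ".csv")
bom  <- as.raw(c(0xEF, 0xBB, 0xBF))   # the three bytes of a UTF-8 byte-order mark
body <- charToRaw("age,sex,chol\n63,1,233\n37,0,250\n")
writeBin(c(bom, body), path)

# Default read: the first column name may or may not be mangled,
# depending on the operating system, locale, and R version.
raw_names <- names(read.csv(path))
print(raw_names)

# Declaring the BOM explicitly strips it before the header is parsed.
fixed <- read.csv(path, fileEncoding = "UTF-8-BOM")
print(names(fixed))
```

If the second call prints `age` as the first name while the first call does not, the file encoding is the likely culprit rather than the plotting code.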

asteves commented 2 years ago

I do not produce this error when I run the solution code. Can you show what code you ran and any session info so I can figure out why we are seeing different results?

meiqingli commented 2 years ago

Here's what I got. So basically all code cells after this produce the same errors.

library(dplyr)
library(ggplot2)   # needed for ggplot(); `heart` must already be loaded

# Treat sex as a categorical variable
heart <- heart %>% mutate(sex = as.factor(sex))

# Boxplot of cholesterol by sex
B <- ggplot(heart, aes(x = sex, y = chol, fill = sex)) +
  geom_boxplot() +
  scale_x_discrete(labels = c("Female", "Male")) +      # change x labels
  scale_fill_discrete(name = "Biological Sex",          # change fill legend labels
                      labels = c("Female", "Male")) +
  labs(x = element_blank(),
       y = "Serum cholesterol in mg/dl",
       title = "Boxplot of Serum Cholesterol (mg/dl)",
       subtitle = "by Biological Sex") +
  theme_bw()

# Scatter plot of cholesterol against age; this is where `age` must exist
C <- ggplot(heart, aes(x = age, y = chol,
                       color = as.factor(sex),
                       shape = as.factor(sex))) +
  geom_point(alpha = 0.7) +
  geom_smooth(method = "lm", se = TRUE, lwd = 1) +
  labs(x = "Age",
       y = "Serum cholesterol in mg/dl",
       title = "Serum Cholesterol (mg/dl) by Biological Sex") +
  theme_bw()

asteves commented 2 years ago

Hmm. Again I do not produce that error when I run the Solutions file on my machine.

How are you reading in the heart file? The fact that the names are different on your machine will produce the errors you see because the age column does not exist. Since nothing in the code you posted is reading in the data file, it will not change the names of columns.

On my machine

library(here)
library(ggplot2)
heart <- read.csv(here::here("data/heart.csv"))
names(heart)

will produce

[image: Screen Shot 2022-09-19 at 11 21 20 AM]

Minor point: it's helpful to put code in code blocks because otherwise GitHub's Markdown parser will make it look strange.
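The point above, that the plots fail simply because no `age` column exists, can be checked directly before any plotting code runs. The sketch below uses a small hypothetical data frame (`heart_bad`, not the real dataset) whose first column carries the mangled name reported in this thread. It lists the missing columns and then applies a rename as a stopgap. The rename is a workaround, not a fix for whatever mangled the header in the first place.

```r
# Hypothetical stand-in for a heart data frame read with a mangled header.
heart_bad <- data.frame(c(63, 37), c(1, 0), c(233, 250))
names(heart_bad) <- c("i..age", "sex", "chol")

# Columns the plotting code in this thread relies on.
required <- c("age", "sex", "chol")

# Report which required columns are absent.
missing_cols <- setdiff(required, names(heart_bad))
print(missing_cols)

# Stopgap: rename the mangled column so downstream code can find `age`.
names(heart_bad)[names(heart_bad) == "i..age"] <- "age"

# Fail early and clearly if anything is still missing.
stopifnot(all(required %in% names(heart_bad)))
```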

meiqingli commented 2 years ago

Hmm, I'm using exactly the same code as yours, but I'm not sure why it reads the column name incorrectly...

[image: image.png]

asteves commented 2 years ago

That image didn't load. Can you reupload?

meiqingli commented 2 years ago

[image: Screenshot_1]

Here is the screenshot.

asteves commented 2 years ago

The error does not appear on Datahub, which leads me to believe that this has something to do with your OS.

Can you upload the session info?
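When results differ between machines, as they do here, the R version, operating system, and locale are usually the first things worth comparing. A minimal sketch of what could be pasted into the issue, using only base R functions:

```r
# Collect environment details that commonly explain cross-machine differences.
info <- sessionInfo()        # R version, OS, locale, attached packages
print(info)

print(R.version.string)      # R version on one line
print(Sys.getlocale())       # full locale string for the session
print(l10n_info())           # includes whether the session uses UTF-8
```

The `UTF-8` entry of `l10n_info()` is the relevant one if a file-encoding problem is suspected.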

meiqingli commented 2 years ago

Looks like the issue was resolved after I reinstalled R and everything. But now I have a new error running the last cell:

[image: Screenshot_1]

asteves commented 2 years ago

I'm closing this issue then. Please open a new issue with that last error and your session info so we can track it appropriately.