Woo-seong commented 5 years ago

Hello. I'm practicing with the book '쉽게 배우는 R 데이터 분석' (Easy-to-Learn R Data Analysis) and ran into a problem I couldn't solve, so I'm posting it here.

[I'm a Mac user]

While loading the SPSS file in R, I get this message: Warning message: In read.spss(file = "Koweps_hpc10_2015_beta1.sav", to.data.frame = T) : Koweps_hpc10_2015_beta1.sav: Compression bias (0) is not the usual value of 100

I ignored it and ran rename, but then got this message: Error: All arguments must be named Call rlang::last_error() to see a backtrace

Then, when I ran class(), I got [1] "NULL" instead of [1] "numeric". What should I do?

youngwoos commented 5 years ago

Could you paste the full code you wrote? A warning message only flags something to be aware of, so you can ignore it and continue.

Woo-seong commented 5 years ago

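Thank you for the reply. I'll ignore the warning message and carry on. I've pasted the full code below, as you asked. This is my first time programming and I'm slow to pick things up, so there are a lot of explanatory notes next to the code. Thanks for bearing with me.

dir.create("/Users/integrity/Desktop/R_program") # create a folder
setwd("/Users/integrity/Desktop/R_program") # change the working directory
getwd() # check the working directory
list.dirs() # list the folders
list.files() # list the files and folders in the current working directory
data = read.csv("csv_exam.csv", header = T, fileEncoding = "euc-kr") # load the file
Sys.setlocale("LC_ALL", "ko_KR.UTF-8")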

------------------------------------------

library(ggplot2)
# qplot() function: can draw a frequency bar chart, among others
x <- c("a","a","b","c")
x
qplot(x)

------------------------------------------

qplot(data = mpg, x = cty)
qplot(data = mpg, x = drv, y = hwy)
qplot(data = mpg, x = drv, y = hwy, geom = "line")
qplot(data = mpg, x = drv, y = hwy, geom = "boxplot")
qplot(data = mpg, x = drv, y = hwy, geom = "boxplot", colour = drv)

?qplot

------------------------------------------

# Creating variables

english <- c(90,80,60,70) # English scores
english

math <- c(50, 60 , 100, 20) # math scores
math

# Creating the data

# Create a data frame from english and math and assign it to df_midterm

df_midterm <- data.frame(english, math)
df_midterm

class <- c(1,1,2,2)
class

df_midterm <- data.frame(english, math, class)
df_midterm

# Analysis

mean(df_midterm$english) # mean of english in df_midterm
mean(df_midterm$math) # mean of math in df_midterm
df_midterm <- data.frame(english = c(90, 80, 60, 70),
                         math = c(50, 60, 100, 20),
                         class = c(1, 1, 2, 2))
df_midterm

install.packages("readxl")
library(readxl)

getwd()
setwd("/Users/integrity/Desktop/R_program")

df_exam <- read_excel("excel_exam.xlsx")
df_exam

mean(df_exam$english)
mean(df_exam$science)

df_exam_novar <- read_excel("excel_exam_novar.xlsx") # treats the first row as variable names
df_exam_novar
df_exam_novar <- read_excel("excel_exam_novar.xlsx", col_names = F) # treats the first row as data; placeholder variable names seem to be generated automatically
df_exam_novar

df_exam_sheet <- read_excel("excel_exam_sheet.xlsx",sheet = 3)
df_exam_sheet

df_csv_exam <- read.csv("csv_exam.csv")
df_csv_exam

df_csv_exam <- read.csv("csv_exam.csv",stringsAsFactors = F) # use this when loading a file that contains text
df_csv_exam

df_midterm <- data.frame(english = c(90, 80, 60, 70), # create the data frame in one step
                         math = c(50, 60 ,100, 20),
                         class = c(1, 1, 2, 2))
df_midterm

write.csv(df_midterm, file = "df_midterm.csv")

save(df_midterm, file = 'df_midterm.rda')
rm(df_midterm)
df_midterm # errors here, because df_midterm was just removed

load("df_midterm.rda")
df_midterm

df_exam <- read_excel("excel_exam.xlsx")

df_csv_exam <- read.csv("csv_exam.csv")

load("df_midterm.rda")

df_exam
df_csv_exam
df_midterm

exam <- read.csv("csv_exam.csv")

head(exam)

head(exam, 10)

tail(exam)

tail(exam, 10)

getwd()
setwd("/Users/integrity/Desktop/R_program")

tests <- read.csv('seong_test.csv',header=T)
tests
head(tests)
tail(tests)
View(tests)
dim(tests)
str(tests)
summary(tests)

install.packages("ggplot2")

mpg <- as.data.frame(ggplot2::mpg)
head(mpg)
tail(mpg)
View(mpg)
dim(mpg)
str(mpg)
?mpg
summary(mpg)

df_raw <- data.frame(var1 = c(1,2,1), var2 = c(2,3,2))
df_raw

install.packages("dplyr")
library(dplyr)
df_new <- df_raw # make a copy
df_new

df_new <- rename(df_new, v2=var2) # rename a variable
df_new
df_raw

df_newtest <- mpg
df_newtest

df_newcopy <- df_newtest
rename(df_newcopy, city=cty) # renaming one variable at a time
rename(df_newcopy, highway=hwy, city=cty) # renaming several at once

df_newcopy # but why doesn't the renamed result show up here?
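On the question in that last comment: `rename()` returns a modified copy and leaves its input untouched, so the result is lost unless it is assigned to a name. A minimal sketch, assuming dplyr is installed; `df_demo` and `df_renamed` are made-up names standing in for `df_newcopy`:

```r
library(dplyr)

# Small hypothetical data frame with the same column names as mpg
df_demo <- data.frame(cty = c(18, 21), hwy = c(29, 29))

# Without assignment the renamed copy is only printed, then discarded
rename(df_demo, city = cty)
names(df_demo)      # still "cty" "hwy"

# Assigning the result keeps the renamed columns
df_renamed <- rename(df_demo, city = cty, highway = hwy)
names(df_renamed)   # "city" "highway"
```

Assigning back to the same name (`df_demo <- rename(df_demo, city = cty)`) works too, and matches what the script already does for `df_new`.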

df <- data.frame(var1 = c(4, 3, 8), var2 = c(2, 6, 1))
df

df$var_sum <- df$var1 + df$var2
df

df$var_mean <- (df$var1 + df$var2)/2
df

mpg$total <- (mpg$cty + mpg$hwy)/2
head(mpg)

mean(mpg$total)

summary(mpg$total)
View(mpg)

hist(mpg$total)

ifelse(mpg$total >= 20, "pass", "fail")

mpg$test <- ifelse(mpg$total >= 20, "pass", "fail")
head(mpg, 20)

table(mpg$test)

library(ggplot2)
qplot(mpg$test)

mpg$grade <- ifelse(mpg$total >=30, "A", ifelse(mpg$total >=20, "B", "C"))
head(mpg, 20) # the number at the end means only the first 20 rows are shown

table(mpg$grade)

qplot(mpg$grade)

mpg$grade2 <- ifelse(mpg$total >= 30, "A", ifelse(mpg$total >= 25, "B", ifelse(mpg$total >= 20, "C", "D")))

head(mpg, 20)

table(mpg$grade2)
qplot(mpg$grade2)

midwest <- as.data.frame(ggplot2::midwest)
head(midwest, 10)
tail(midwest, 10)
View(midwest)
str(midwest)
dim(midwest)
summary(midwest)

df_new_midwest <- midwest
df_new_midwest

new <- rename(df_new_midwest, total=poptotal, asian=popasian) # store the data frame with the renamed columns under a new name
head(new,5)

new$value <- (new$total + new$asian)/2
table(new$value)
hist(new$value)
View(new$value)

------------------------------------------------

# 혜리킴

getwd()
setwd('')

dk <- read.csv('dk.csv', header = T)
dk
head(dk)
tail(dk)
str(dk)
#######################
angle <- dk$angle

library(ggplot2)

iris
hist(iris$Petal.Length, freq = F)
lines(density(iris$Petal.Length))
# curve() needs an expression in x; add = TRUE overlays it on the histogram
curve(dnorm(x, mean = mean(iris$Petal.Length), sd = sd(iris$Petal.Length)), add = TRUE)

----------------------------------------------------------

library(ggplot2)

plot(mtcars$wt, mtcars$mpg) # the first argument is the x-axis, the second is the y-axis

plot(pressure$temperature, pressure$pressure, type="l") # "l" stands for line

# This connects the observations with a line

points(pressure$temperature, pressure$pressure)

# points() draws each observation as a point

lines(pressure$temperature, pressure$pressure/2, col="blue")
lines(pressure$temperature, pressure$pressure+50, col="red")
points(pressure$temperature, pressure$pressure/2, col="blue")
points(pressure$temperature, pressure$pressure+50, col="red")

--------------------------------------------------------------------------------

library(ggvis)

mtcars %>% ggvis(~mpg, ~wt) %>% layer_points() # ggvis draws points with layer_points(), not base points()

# %>% will be used throughout from here on

mtcars %>% ggvis(~mpg, ~wt) %>% layer_lines()
mtcars %>% ggvis(~mpg, ~wt) %>% layer_smooths()
mtcars %>% ggvis(~mpg, ~wt) %>% layer_points() %>% layer_smooths()
mtcars %>% ggvis(~mpg, ~wt, fill:="blue") %>% layer_points() %>% layer_smooths()

install.packages("plotly")

install.packages("data.table")

--------------------------------------------------------------------------------

library(dplyr)
exam <- read.csv("csv_exam.csv")
exam

exam %>% filter(class == 1) # use this to pull out only the rows where class is 1

# '%>%' connects functions the way a pipe connects waterways

exam %>% filter(class == 2)
exam %>% filter(class != 1) # use this to pull out every row except those where class is 1

exam %>% filter(math > 50) # greater than
exam %>% filter(math < 50) # less than
exam %>% filter(english >= 80) # greater than or equal to
exam %>% filter(science <= 80) # less than or equal to

exam %>% filter(class ==1 & math >= 50)
exam %>% filter(class == 2 & english >= 80)

exam %>% filter(math >= 90 | english >= 90)
exam %>% filter(english < 90 | science < 50)
exam %>% filter(class == 1 | class == 3 | class == 5)

exam %>% filter(class %in% c(1,3,5))
class1 <- exam %>% filter(class == 1)
class2 <- exam %>% filter(class == 2)

mean(class1$math)
mean(class2$math)

x <- c(30, 45, 50, 60)
xm <- mean(x, trim = 0.10)
c(xm, mean(x, trim = 0.10))
trim <- NULL
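The `trim` argument of `mean()` drops a fraction of the observations from each end of the sorted vector before averaging. With only four values, `trim = 0.10` removes nothing (10% of 4 rounds down to 0), so the result equals the plain mean. A small sketch using the same `x`; the `trim = 0.25` call is an added illustration, not part of the original script:

```r
x <- c(30, 45, 50, 60)

plain      <- mean(x)               # (30 + 45 + 50 + 60) / 4 = 46.25
trimmed_10 <- mean(x, trim = 0.10)  # floor(4 * 0.10) = 0 values dropped per end
trimmed_25 <- mean(x, trim = 0.25)  # drops 30 and 60, averages 45 and 50

c(plain, trimmed_10, trimmed_25)
```

A larger vector is needed before small `trim` values have any effect.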

exam %>% select(math) # extract a single variable
exam %>% select(english) # extract a single variable

exam %>% select(math, english, science)
exam %>% select(-math, -english)
exam %>% filter(class == 1) %>% select(english) # extract data by combining dplyr functions

exam %>% # same as the code just above, but laid out so it is easier to read
  filter(class == 1) %>%
  select(english)

exam %>%
  select(science, math) %>%
  head(10)

--------------------------------------------------------------------------------

exam %>% arrange(science) # sort in ascending order
exam %>% arrange(desc(math)) # sort in descending order
exam %>% arrange(class)
exam %>% arrange(desc(class), desc(math), desc(science)) # sort by several variables: pass them separately, not joined with &
exam %>% arrange(desc(class))

--------------------------------------------------------------------------------

# 18. November. 2018.

exam %>% # %>% pipes into mutate
  mutate(total = math + english + science) %>% # mutate: adds a variable (total is the variable name); %>% pipes into head
  head

exam %>%
  mutate(total = math + english + science,
         mean = (math + english + science)/3) %>% # a mean is the sum divided by the count, hence '/3'
  head

# Applying ifelse inside mutate

exam %>%
  mutate(test = ifelse(science >= 60, "pass", "fail")) %>% # scores of 60 or above are "pass", the rest "fail"
  head

exam %>%
  mutate(total = math + english + science) %>%
  arrange(total) %>%
  head

exam %>% summarise(mean_math = mean(math)) # summarise produces a summary (per group when combined with group_by); here mean is the overall mean

exam %>%
  group_by(class) %>% # split by class
  summarise(mean_math = mean(math)) # compute the mean of math

# The name to the left of '=' is the name of the new column (variable); the right-hand side is the function that computes it.

exam %>%
  group_by(class) %>% # split by class
  summarise(mean_math = mean(math), # mean of math
            sum_math = sum(math), # sum of math
            median_math = median(math), # median of math
            n = n()) # number of students

mpg %>%
  group_by(manufacturer, drv) %>% # split by manufacturer and drive type
  summarise(mean_cty = mean(cty)) %>% # compute the mean of cty
  head(12) # print part of the result

mpg %>%
  group_by(manufacturer) %>% # split by manufacturer
  filter(class == "suv") %>% # keep SUVs only
  mutate(tot = (cty + hwy)/2) %>% # create a combined fuel-economy variable
  summarise(mean_tot = mean(tot)) %>% # compute the mean combined fuel economy
  arrange(desc(mean_tot)) %>% # sort in descending order
  head(10) # print the top 10 rows

View(mpg)

# When a data frame is created as below, the numbers in the first vector become
# the first column (class) and those in the second vector become the second
# column (score).

# Joining data horizontally

# Create the midterm exam data

test1 <- data.frame(id = c(1,2,3,4,5), midterm = c(60, 80, 70, 90, 85))

test2 <- data.frame(id = c(1,2,3,4,5), final = c(70, 83, 65, 95, 80))
test1
test2

total <- left_join(test1, test2, by = "id") # left_join() joins side by side (adds columns); 'by' names the variable used as the key
total

name <- data.frame(class = c(1,2,3,4,5), teacher = c("kim", "lee", "park", "choi", "jung"))
name

exam_new <- left_join(exam, name, by = "class")
exam_new

# Joining data vertically

# Caution when stacking vertically: the variable names (id, test) must match.
# If they differ, use rename() to make them the same.

group_a <- data.frame(id = c(1,2,3,4,5), test = c(60, 80, 70, 90, 85))

group_b <- data.frame(id = c(6,7,8,9,10), test = c(70, 83, 65, 95, 80))

group_a
group_b

group_all <- bind_rows(group_a, group_b) # combine the data and assign it to group_all
group_all

# 19. November. 2018.

df <- data.frame(sex = c("M", "F", NA, "M", "F"), score = c(5,4,3,4,NA)) # NA without quotes is a missing value; with quotes, "NA" is just the two-letter string
df
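The difference between an unquoted `NA` and the quoted string `"NA"` is easy to check with `is.na()`. A minimal base R sketch; the vector names `sex_missing` and `sex_string` are made up for illustration:

```r
# Unquoted NA: a real missing value
sex_missing <- c("M", "F", NA, "M", "F")

# Quoted "NA": an ordinary character string, not a missing value
sex_string <- c("M", "F", "NA", "M", "F")

n_missing_unquoted <- sum(is.na(sex_missing))  # counts the one missing value
n_missing_quoted   <- sum(is.na(sex_string))   # finds none

c(n_missing_unquoted, n_missing_quoted)
```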

is.na(df) # asks "is it NA?": returns TRUE/FALSE so you can see where the missing values are
table(is.na(df)) # wrapping it in table() counts how many TRUE and FALSE there are

table(is.na(df$sex)) # naming a variable inside table(is.na()) checks the missing values of that variable
table(is.na(df$score))

mean(df$score) # with a missing value in the variable, the calculation returns NA
sum(df$score)

library(dplyr)
df %>% filter(is.na(score)) # print only the rows where score is NA
df %>% filter(!is.na(score)) # '!' means "not": print the rows where score is not missing

df_nomiss <- df %>% filter(!is.na(score))
mean(df_nomiss$score)
sum(df_nomiss$score)

df_nomiss <- df %>% filter(!is.na(score) & !is.na(sex))
df_nomiss # combining conditions with & in filter keeps only the rows with no missing value in any of the listed variables

df_nomiss2 <- na.omit(df)
df_nomiss2 # removes every row that contains a missing value, so rows you still need for other variables are dropped too

mean(df$score, na.rm = T) # 'na.rm' stands for "NA remove": drop missing values before computing
sum(df$score, na.rm = T)

exam <- read.csv("csv_exam.csv")
exam[c(3,8,15), "math"] <- NA # assign NA to math in rows 3, 8 and 15
exam # square brackets specify a position in the data: left of the comma is the row, right of it is the column

exam %>% summarise(mean_math = mean(math))
exam %>% summarise(mean_math = mean(math, na.rm = T))
exam %>% summarise(mean_math = mean(math, na.rm = T),
                   sum_math = sum(math, na.rm = T),
                   median_math = median(math, na.rm = T))

mean(exam$math, na.rm = T) # mean of math, excluding missing values

exam$math <- ifelse(is.na(exam$math), 55, exam$math) # replace NA in math with 55
table(is.na(exam$math)) # frequency table of missing values

exam

mean(exam$math)

outlier <- data.frame(sex = c(1,2,1,3,2,1), # create data containing outliers
                      score = c(5,4,3,4,2,6))
outlier

table(outlier$sex) # check for outliers
table(outlier$score)

outlier$sex <- ifelse(outlier$sex == 3, NA, outlier$sex) # if sex is 3, replace it with NA
outlier

outlier$score <- ifelse(outlier$score > 5, NA, outlier$score)
outlier

outlier %>%
  filter(!is.na(sex) & !is.na(score)) %>% # the outliers in sex and score are now NA, so filter them out first
  group_by(sex) %>%
  summarise(mean_score = mean(score)) # mean score by sex

boxplot(mpg$hwy)
boxplot(mpg$hwy)$stats # print the box plot statistics

mpg$hwy <- ifelse(mpg$hwy < 12 | mpg$hwy > 37, NA, mpg$hwy) # assign NA to values outside 12 ~ 37
table(is.na(mpg$hwy))
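The cut-offs 12 and 37 above come from reading the `$stats` output by hand. The same bounds can be taken from the statistics directly, so they need not be typed in. A sketch in base R using `boxplot.stats()`, which computes the same five numbers without drawing a plot; the vector `hwy_demo` is made-up data, not the real `mpg$hwy`:

```r
# Hypothetical values with one low and one high extreme
hwy_demo <- c(12, 20, 22, 24, 25, 26, 27, 29, 44)

# stats holds: lower whisker, lower hinge, median, upper hinge, upper whisker
stats <- boxplot.stats(hwy_demo)$stats
lower <- stats[1]
upper <- stats[5]

# Replace anything beyond the whiskers with NA
hwy_clean <- ifelse(hwy_demo < lower | hwy_demo > upper, NA, hwy_demo)

table(is.na(hwy_clean))
mean(hwy_clean, na.rm = TRUE)
```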

mpg %>% group_by(drv) %>% summarise(mean_hwy = mean(hwy, na.rm = T))

# 26. November. 2018.

library(ggplot2)
ggplot(data = mpg, aes(x = displ, y = hwy)) # create the background with displ on the x-axis and hwy on the y-axis
ggplot(data = mpg, aes(x = displ, y = hwy)) + geom_point() # add a scatter plot to the background
ggplot(data = mpg, aes(x = displ, y = hwy)) + geom_point() + xlim(3,6) # limit the x-axis to 3 ~ 6
ggplot(data = mpg, aes(x = displ, y = hwy)) + geom_point() + xlim(3,6) + ylim(10,30) # limit the y-axis to 10 ~ 30

library(dplyr)

df_mpg <- mpg %>%
  group_by(drv) %>%
  summarise(mean_hwy = mean(hwy)) # mean highway fuel economy by drive type

ggplot(data = economics, aes(x = date, y = unemploy)) + geom_line()
df_mpg

ggplot(data = df_mpg, aes(x = drv, y = mean_hwy)) + geom_col() # plot the mean highway fuel economy by drive type

ggplot(data = df_mpg, aes(x = reorder(drv, -mean_hwy), y = mean_hwy)) + geom_col() # geom_col plots summarised data
ggplot(data = mpg, aes(x = drv)) + geom_bar() # geom_bar plots the raw data directly
ggplot(data = mpg, aes(x = hwy)) + geom_bar()

ggplot(data = economics, aes(x = date, y = unemploy)) + geom_line()

ggplot(data = mpg, aes(x = drv, y = hwy)) + geom_boxplot()

----------------------------------------------------------------------------------------------------------------------------

install.packages("foreign")
install.packages('readxl')

library(foreign)
library(dplyr)
library(ggplot2)
library(readxl)

raw_welfare <- read.spss(file = "Koweps_hpc10_2015_beta1.sav", to.data.frame = T)
welfare <- raw_welfare

youngwoos commented 5 years ago

The code that loads the SPSS file runs fine, without errors. Could you paste the code you wrote after that point?

Woo-seong commented 5 years ago

I hadn't been able to practice coding since then.

I was originally using a Mac, but I installed R on Windows and ran the part that was failing, and got the message below.

Warning message: In read.spss(file = "Koweps_hpc10_2015_beta1.sav", to.data.frame = T) : Koweps_hpc10_2015_beta1.sav: Compression bias (0) is not the usual value of 100

I'd like to ask whether this can simply be ignored, and what the message means.

If the message isn't a problem, I'll carry on with the coding practice. Please take a look.

Woo-seong commented 5 years ago

Also, when I ignored the message and ran the code anyway, I got the output below.

Please take a look at this as well.

welfare <- rename(welfare,
                  sex = h10_g3, # sex
                  birth = h10_g4, # year of birth
                  marriage = h10_g10, # marital status
                  religion = h10_g11, # religion
                  income - p1002_8aq1, # monthly wage
                  code_job = h10_eco9, # job code
                  code_region = h10_reg7) # region code
Error: All arguments must be named
Call rlang::last_error() to see a backtrace
class(welfare$sex)
[1] "NULL"
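Judging only from the code pasted here, one likely cause is the `income` line, which uses `-` where the other lines use `=`. That argument then has no name, which fits the "All arguments must be named" error. Because the call fails, `welfare` is never reassigned and has no `sex` column, which would explain why `class(welfare$sex)` returns "NULL". A minimal sketch, assuming dplyr is installed; `welfare_demo` is a made-up two-column stand-in for the real data, and the exact error text depends on the dplyr version:

```r
library(dplyr)

# Hypothetical stand-in using two of the original column names
welfare_demo <- data.frame(h10_g3 = c(1, 2), p1002_8aq1 = c(100, 200))

# '-' instead of '=' leaves the second argument unnamed, so rename() fails
attempt <- tryCatch(
  rename(welfare_demo, sex = h10_g3, income - p1002_8aq1),
  error = function(e) e
)
failed <- inherits(attempt, "error")

# The failed call changed nothing, so there is still no 'sex' column
class_before <- class(welfare_demo$sex)   # "NULL"

# With '=' every argument is named and the rename goes through
welfare_fixed <- rename(welfare_demo, sex = h10_g3, income = p1002_8aq1)
class_after <- class(welfare_fixed$sex)   # "numeric"
```

In this sketch the warning from `read.spss()` plays no part in the failure.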

youngwoos commented 5 years ago

A warning message is not an error; it only flags something to be aware of, so you can ignore it and continue. This warning appears when loading a large SPSS file, but I don't know the exact cause either.
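The difference between a warning and an error can be seen in a small base R sketch: a function that raises a warning still returns its value, whereas an error stops the call. `read_with_warning()` is a made-up function that only imitates the message; it does not read any file:

```r
# Hypothetical function: warns, then still returns a data frame
read_with_warning <- function() {
  warning("Compression bias (0) is not the usual value of 100")
  data.frame(x = 1:3)
}

# Capture the warning text while letting the call finish normally
caught <- character(0)
result <- withCallingHandlers(
  read_with_warning(),
  warning = function(w) {
    caught <<- c(caught, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
nrow(result)   # the data frame was returned despite the warning

# An error, by contrast, means no value comes back from the call
outcome <- tryCatch(stop("something failed"), error = function(e) "no result")
```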

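The file-loading part of the code you posted has no errors. Could you check the size of the file you downloaded? Or could you download the sav file again and try loading it once more?

If you post your question in the Facebook community below, you can get an answer more quickly.

Data analysis community https://facebook.com/groups/datacommunity

Thank you.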

Woo-seong commented 5 years ago

Understood. Thank you! Have a nice weekend!

youngwoos / Doit_R

Koweps_hpc10_2015_beta1.sav #13
