MRI_Count: total pixel count from the 18 MRI scans
1) Make three histograms for FSIQ, VIQ and PIQ.
2) Plot the estimated density for FSIQ, VIQ and PIQ.
3) Compute the correlation between FSIQ, VIQ and PIQ, the correlation between MRI_Count and FSIQ, and the correlation matrix for MRI_Count, Weight and Height.
4) Fit a linear model for FSIQ using MRI_Count and Height, and predict FSIQ for MRI_Count: 80000, 800000, 8000000 and Height: 74, 74, 74.
#read the tab-separated data file (the first row holds the column names)
brain_data <- read.table(file = 'BrainSize.tsv',
                         header = TRUE,
                         sep = "\t")
#make three histograms for FSIQ, VIQ and PIQ
with(brain_data, hist(FSIQ))
with(brain_data, hist(VIQ))
with(brain_data, hist(PIQ))
#plot the estimated densities for FSIQ, VIQ and PIQ on one set of axes
#(the axis limits come from the first curve, FSIQ)
with(brain_data, plot(density(FSIQ),
                      col = "red",
                      main = "",
                      xlab = "FSIQ, VIQ and PIQ"))
with(brain_data, lines(density(VIQ), col = "blue"))
with(brain_data, lines(density(PIQ), col = "green"))
legend("topright",
       legend = c("FSIQ", "VIQ", "PIQ"),
       col = c("red", "blue", "green"),
       lty = 1)
#correlation between FSIQ, VIQ and PIQ
cor(brain_data[c("FSIQ", "VIQ", "PIQ")])
#correlation between MRI_Count and FSIQ
with(brain_data, cor(MRI_Count, FSIQ))
#correlation matrix and scatterplot matrix for MRI_Count, Height and Weight
cor(brain_data[c("MRI_Count", "Height", "Weight")])
pairs(brain_data[c("MRI_Count", "Height", "Weight")])
#simple linear regression model between FSIQ and MRI_Count
brain_lm <- lm(FSIQ ~ MRI_Count, data = brain_data)
summary(brain_lm)
#linear regression model for FSIQ using both MRI_Count and Height
brain_lm_height <- lm(FSIQ ~ MRI_Count + Height, data = brain_data)
summary(brain_lm_height)
#predict FSIQ from the two-predictor model for three new observations
brain_new <- data.frame(MRI_Count = c(80000, 800000, 8000000),
                        Height = c(74, 74, 74))
predict(brain_lm_height, newdata = brain_new)
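The three MRI_Count values in the prediction step differ by factors of ten, so some of them may lie far outside the range the model was fitted on. The sketch below shows one way to check this and to get prediction intervals alongside the point predictions. It uses a small synthetic data frame (`toy_data`, with made-up values and made-up coefficients) in place of BrainSize.tsv so that it runs on its own. The names `toy_data`, `toy_lm`, `toy_new` and `toy_pred` are hypothetical and not part of the original solution.

```r
# Synthetic stand-in for BrainSize.tsv: hypothetical values, for illustration only
set.seed(1)
n <- 40
toy_data <- data.frame(
  MRI_Count = seq(750000, 1050000, length.out = n),
  Height    = round(runif(n, 62, 77), 1)
)
# Made-up relationship plus noise, only so that lm() has something to fit
toy_data$FSIQ <- 50 + 1e-04 * toy_data$MRI_Count -
  0.4 * toy_data$Height + rnorm(n, sd = 10)

toy_lm <- lm(FSIQ ~ MRI_Count + Height, data = toy_data)

toy_new <- data.frame(MRI_Count = c(80000, 800000, 8000000),
                      Height = c(74, 74, 74))

# Point predictions with 95% prediction intervals (columns: fit, lwr, upr)
toy_pred <- predict(toy_lm, newdata = toy_new, interval = "prediction")

# Flag new rows whose MRI_Count lies outside the range used for fitting
mri_range <- range(toy_data$MRI_Count)
toy_new$extrapolated <- toy_new$MRI_Count < mri_range[1] |
  toy_new$MRI_Count > mri_range[2]

print(cbind(toy_new, toy_pred))
```

With the real data, the same two steps would apply to `brain_lm_height` and `brain_new`: pass `interval = "prediction"` to `predict()` and compare the new MRI_Count values with `range(brain_data$MRI_Count)`. Predictions for flagged rows are extrapolations and should be read with caution.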
Additional information
Originality
Please mark relevant information with x, (ex. [x])
Is this question
[x] Original
[ ] Inspired
[ ] Paraphrased or copied
If you select Inspired or Paraphrased please provide the links in markdown format ( [link](http://example.com) ). Please provide all relevant links. You can refer to DataCamp course pages if you're inspired by them.
Difficulty Level
According to you, what is the level of difficulty of the question (note: this can be modified by instructor after submission)
[ ] Easy / Beginner (using a single command or concept is enough to solve the question)
[x] Intermediate (combining multiple commands, concepts is needed to solve the question)
[ ] Difficult (combining multiple commands with non-default options and looking for additional information online might be needed)
Tags (optional)
Please provide a comma-separated list of dplyr verbs (e.g. summarize, left join) or concepts (e.g. text mining) that you think are relevant to the question
Before submitting
Please click Preview to preview your submission and check that it is rendered correctly
Question
You have a Brain Size Dataset. Please download the dataset from https://drive.google.com/file/d/0B-__yRckGmXkN1hjTERLU2ZoYXBaOUx3UnExN3EydkhKRFlR/view?usp=sharing. The data file BrainSize.tsv contains 40 samples (rows) and 7 different types of measurements/variables (columns):