jtleek / modules

maacs dataset #82

Open alfakini opened 10 years ago

alfakini commented 10 years ago

Hi, where can I get the maacs dataset? I see it is used in 04_ExploratoryAnalysis/ggplot2/ggplot2_p2.Rmd, but the file isn't in the repository. I am coming from the Computing for Data Analysis course on Coursera, where a lot of people are trying to find the maacs dataset so they can run the examples presented in the lectures: https://class.coursera.org/compdata-004/class/search?q=maacs#11-state-query=maacs&11-state-filter=all&11-state-page_num=2

Thanks, alf.

chajadan commented 10 years ago

Seconding this: a link or reference to the data would be nice.

chajadan commented 10 years ago

Okay, here goes:

The data is in this RDS file: https://github.com/jtleek/modules/blob/master/04_ExploratoryAnalysis/PlottingLattice/maacs_env.rds

This file opens the data and draws a large set of panels showing the level of allergen in the air for each subject over 5 visits (lines 201-204): https://github.com/jtleek/modules/blob/master/04_ExploratoryAnalysis/PlottingLattice/index.Rmd

mlstats303 commented 10 years ago

The MAACS dataset has no "eno" field, although one is shown in the ggplot2 slides (Roger Peng's Exploratory Data Analysis course). Any idea why?

TarekDib03 commented 10 years ago

You may load the whole data set with eno from here:

https://github.com/TarekDib03/ExploratoryDataAnalysisCoursera/blob/master/maacs.Rda

Let me know if you have any questions!

mlstats303 commented 10 years ago

Thanks TarekDib03. It's working now.

starry0731 commented 9 years ago

Thanks TarekDib03!

MattLWhitaker commented 9 years ago

Thanks TarekDib03.

TarekDib03 commented 9 years ago

Anytime guys! I am glad I was able to help.

RyanLeiTaiwan commented 9 years ago

I appreciate your upload. This will help me practice with the ggplot2 videos.

SidharthaRay commented 9 years ago

I can't find logpm25, NocturnalSympt, and a few other attributes in the dataset.
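A quick way to see which lecture columns a given copy of the data lacks is to compare its column names against the ones mentioned in this thread. This is only a minimal sketch: the `maacs` data frame below is a made-up stand-in with invented values, not the real data, and `expected` lists just the names people have reported here.

```r
# Column names reported in this thread as used in the ggplot2 lectures.
expected <- c("eno", "logpm25", "NocturnalSympt", "bmicat")

# Hypothetical stand-in for the data frame you actually loaded;
# the values and the set of columns here are invented for illustration.
maacs <- data.frame(
  id = 1:3,
  eno = c(10.5, 22.1, 8.3),
  bmicat = c("a", "b", "a")
)

# setdiff() keeps the elements of `expected` that are not column names.
missing_cols <- setdiff(expected, names(maacs))

if (length(missing_cols) > 0) {
  message("Missing columns: ", paste(missing_cols, collapse = ", "))
} else {
  message("All expected columns are present.")
}
```

Replacing the stand-in with your own loaded object shows at a glance whether a particular file will work with the lecture code.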

Tetlanesh commented 9 years ago

I also want to point out that the data used in the lecture has no corresponding dataset we can use, and the data set provided above does not contain some of the columns that were shown in the lecture.

majidyousuf commented 9 years ago

Jtleek used the maacs dataset for the ggplot2 example, but the link above does not have the attributes used in it, such as bmicat, NocturnalSympt, and logpm25. Does anyone have any idea where to find them?

bensooraj commented 9 years ago

Do we have any updates here?

duongtrung commented 9 years ago

There are some columns missing in the maacs data set, more specifically those needed for 3 - 7 - ggplot2(part 5)[8_11].mp4 video.

I found the data set here: https://github.com/TarekDib03/ExploratoryDataAnalysisCoursera/blob/master/maacs.html but I cannot load it. I get the error message: "more columns than column names"

duongtrung commented 9 years ago

Try this code to save all columns from the original data set (https://github.com/jtleek/modules/blob/master/04_ExploratoryAnalysis/PlottingLattice/maacs_env.rds):

    env <- readRDS("maacs_env.rds")
    id <- 1:750
    maacs <- data.frame(id, env)
    save(maacs, file = "maacs.rda")
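For anyone unsure how the two file types differ, here is a small round trip of the same steps on made-up data: an .rds file holds a single object that you read with `readRDS()`, while an .rda file restores objects under their saved names via `load()`. The `toy_env` data frame and its column names are invented for illustration, and `seq_len(nrow(env))` is my own substitution for the hard-coded `1:750` so the sketch works for any number of rows.

```r
# Temporary files so the sketch does not touch your working directory.
rds_path <- tempfile(fileext = ".rds")
rda_path <- tempfile(fileext = ".rda")

# Invented stand-in for the contents of maacs_env.rds.
toy_env <- data.frame(visit = c(0L, 1L, 2L), level = c(1.2, 0.7, 3.4))
saveRDS(toy_env, rds_path)

# Same steps as above, with the id column sized from the data.
env <- readRDS(rds_path)
id <- seq_len(nrow(env))
maacs <- data.frame(id, env)
save(maacs, file = rda_path)

# load() returns the names of the restored objects; loading into a
# separate environment avoids overwriting anything in the workspace.
restored <- new.env()
loaded_names <- load(rda_path, envir = restored)
print(loaded_names)
print(names(restored$maacs))

unlink(c(rds_path, rda_path))
```

Loading into a separate environment is optional; a plain `load("maacs.rda")` puts `maacs` straight into your workspace.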

zhujianwei31415 commented 9 years ago

Jtleek used the maacs dataset for the ggplot2 example, but the link above does not have all the attributes used in it, for example bmicat, NocturnalSympt, logpm25, and so on. Has anyone been able to get it?

arevelio commented 8 years ago

As best I can tell, you can't get the dataset that contains bmicat and NocturnalSympt for this part of the lecture. I found the quote below in http://lib.psylab.info/files/Peng2015b.pdf

"NOTE: Because the individual-level data for this study are protected by various U.S. privacy laws, we cannot make those data available. For the purposes of this chapter, we have simulated data that share many of the same features of the original data, but do not contain any of the actual measurements or values contained in the original dataset."

If someone finds this dataset, I'd love to recreate the plots in the second part of the lecture.

lupok2001 commented 8 years ago

Use THIS: https://github.com/lupok2001/datasciencecoursera/blob/master/maacs.Rda

The variables are simulated. When run with the R Code from the ggplot2 lectures it gives slightly different graphs, but it works.

anatur commented 8 years ago

thanks lupok2001! it worked for me

juan637 commented 8 years ago

Thanks TarekDib03, your contribution was very useful.

XeenChaos commented 6 years ago

Thank you guys, it is a great help to find the data here. Otherwise I could not practice. Cheers

Francisco-Marquez commented 6 years ago

Thank you @lupok2001 , it works well for me!

stomioka commented 6 years ago

Thanks @lupok2001. Works for me:

    download.file("https://github.com/lupok2001/datasciencecoursera/raw/master/maacs.Rda",
                  destfile = "./lecture/maacs.Rda", mode = "wb")
    load("./lecture/maacs.Rda")
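Note that the download uses the `/raw/` form of the URL, not the `/blob/` page linked earlier in the thread. The blob link is an HTML page, so saving it gives you a web page instead of the data file. A minimal sketch of the conversion follows; `blob_to_raw` is a hypothetical helper name of my own, not part of any package.

```r
# Hypothetical helper: turn a GitHub "blob" page URL into its "raw" form.
# sub() with fixed = TRUE replaces only the first literal "/blob/".
blob_to_raw <- function(url) {
  sub("/blob/", "/raw/", url, fixed = TRUE)
}

blob_url <- "https://github.com/lupok2001/datasciencecoursera/blob/master/maacs.Rda"
raw_url <- blob_to_raw(blob_url)
print(raw_url)
```

The resulting URL can then be passed to `download.file()` with `mode = "wb"`, as in the steps above.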

dmontigny commented 2 years ago

I loaded it with these steps:

    url = "https://raw.githubusercontent.com/lejarx/MAACS-dataset/master/maacs.rda"
    destfile = tempfile(fileext = ".rda")
    download.file(url, destfile, method = 'libcurl', mode = "wb", quiet=TRUE)
    load(destfile)
    unlink(destfile)