ajdamico closed this 3 years ago
if i copy and paste this script into a console, it breaks. could you edit the script above and assign back to me? thanks
weird :|
R version 3.4.0 (2017-04-21) -- "You Stupid Darkness"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(lodown)
> library(dplyr)
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
> library(haven)
>
>
>
> fn <- "https://dataverse.harvard.edu/api/access/datafile/2409658?gbrecs=true"
> tf <- tempfile()
>
> cachaca( fn , tf , mode = 'wb' )
'https://dataverse.harvard.edu/api/access/datafile/2409658?gbrecs=true'
cached in
'C:/Users/user/AppData/Local/Temp/09a348b959d021e507ae015e7d4c9947.Rcache'
copying to
'C:\Users\user\AppData\Local\Temp\RtmpSctQJ4\file17d445ae3292'
>
> unzip( tf, exdir = tempdir() )
>
> anthropometry <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/anthropometry.dta" ) )
> test0 <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/s01a0mv3.dta") )
> test1 <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/s00a1fv3.dta") )
> test2 <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/s00a2fv3.dta") )
> test3 <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/s00a3fv3.dta") )
>
> create_pid <- function( hid , pn ) {
+ first_part <- hid
+ secnd_part <- stringr::str_pad( pn , pad = "0" , width = 2)
+ paste0( first_part , secnd_part )
+ }
>
> anthropometry[,] <- apply( anthropometry[,] , 2 , function(x) as.numeric(x) )
> colnames(anthropometry) <- tolower(colnames(anthropometry))
> anthropometry$mid<- as.numeric(create_pid(anthropometry$hid , anthropometry$mc))
> test0[,] <- apply( test0[,] , 2 , function(x) as.numeric(x) )
> test1[,] <- apply( test1[,] , 2 , function(x) as.numeric(x) )
> test2[,] <- apply( test2[,] , 2 , function(x) as.numeric(x) )
>
> mothers <- anthropometry[ !duplicated( anthropometry$mid),] %>% select( hid, psu , mid , gov , mhght , mwght , mresult)
> mothers <- inner_join( mothers , test0 %>% select( pid , s01aq07 ) , by = c("mid" = "pid") )
> mothers <- mothers %>% filter( mresult == 1 )
> # mothers <- mothers %>% filter( s01aq07 >= 18 )
>
> mothers <- mothers %>% inner_join( . , test1[ , c( "hid" , "strata" , "weight95" , "expand95" )] , "hid")
>
> mothers <- mothers %>%
+ mutate( area = case_when(
+ strata == 1 ~ "metlo" ,
+ gov %in% 5:12 ~ "nmetlo" ) )
>
> mothers %>%
+ group_by(area) %>% summarize( count = n() )
# A tibble: 3 × 2
area count
<chr> <int>
1 metlo 107
2 nmetlo 452
3 <NA> 587
>
> mothers <- mothers %>% mutate( bmi = mwght / (mhght/100)^2 )
> # mothers <- mothers %>% mutate( bmi = round( bmi , 2) )
>
> mothers <- mothers %>% mutate( nstat = cut( bmi ,
+ breaks = c( 0, 18.5 , 25 , 30 , 35 , 40 , Inf ) ,
+ #labels = c( "underweight" , "normal range" , "overweight" , "obese i" , "obese ii" , "obese iii" ) ,
+ ordered = TRUE ,
+ include.lowest = TRUE ,
+ right = FALSE ) )
> mothers <- mothers %>% mutate( nstat = cut( bmi ,
+ breaks = c( 0, 25 , 30 , 35 , 40 , Inf ) ,
+ labels = c( "not overweight" , "overweight" , "obese i" , "obese ii" , "obese iii" ) ,
+ ordered = TRUE ,
+ include.lowest = TRUE ,
+ right = FALSE ) )
>
> mothers$nstat <- ordered( mothers$nstat , levels = rev( levels(mothers$nstat ) ) )
>
> mothers %>% group_by( area , nstat ) %>% summarise( result = n() )
Source: local data frame [15 x 3]
Groups: area [?]
area nstat result
<chr> <ord> <int>
1 metlo obese iii 8
2 metlo obese ii 13
3 metlo obese i 26
4 metlo overweight 41
5 metlo not overweight 19
6 nmetlo obese iii 19
7 nmetlo obese ii 47
8 nmetlo obese i 99
9 nmetlo overweight 134
10 nmetlo not overweight 153
11 <NA> obese iii 11
12 <NA> obese ii 30
13 <NA> obese i 73
14 <NA> overweight 179
15 <NA> not overweight 294
> mothers %>% group_by( area , nstat ) %>% summarise( result = n() ) %>%
+ inner_join( . , mothers %>% group_by( area ) %>% summarise( total = n() ) , by = "area" ) %>%
+ mutate( prop = result / total , cs = cumsum(prop) )
Source: local data frame [10 x 6]
Groups: area [2]
area nstat result total prop cs
<chr> <ord> <int> <int> <dbl> <dbl>
1 metlo obese iii 8 107 0.07476636 0.07476636
2 metlo obese ii 13 107 0.12149533 0.19626168
3 metlo obese i 26 107 0.24299065 0.43925234
4 metlo overweight 41 107 0.38317757 0.82242991
5 metlo not overweight 19 107 0.17757009 1.00000000
6 nmetlo obese iii 19 452 0.04203540 0.04203540
7 nmetlo obese ii 47 452 0.10398230 0.14601770
8 nmetlo obese i 99 452 0.21902655 0.36504425
9 nmetlo overweight 134 452 0.29646018 0.66150442
10 nmetlo not overweight 153 452 0.33849558 1.00000000
>
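The `<NA>` group in the counts above comes from `case_when()` returning `NA` for any row that matches none of its conditions. A minimal base R sketch of the same rule with an explicit fallback label follows; the toy `strata` and `gov` values and the `"other"` label are assumptions for illustration, not values from the survey files.

```r
# toy data: hypothetical values, not taken from the EIHS files
toy <- data.frame(
  strata = c(1, 2, 2, 3, 1),
  gov    = c(1, 5, 12, 20, 7)
)

# same rule order as the case_when() call: strata == 1 is checked first,
# then gov in 5:12; anything left over gets an explicit label instead of NA
toy$area <- ifelse(
  toy$strata == 1, "metlo",
  ifelse(toy$gov %in% 5:12, "nmetlo", "other")
)

# tabulate the areas, showing NA as its own row if any remain
table(toy$area, useNA = "ifany")
```

Labelling the leftover rows makes it easier to see whether the 587 unmatched mothers are expected or a sign that the area rule is incomplete.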
rerun and give your sessionInfo() at the end and assign back to me? thanks
there you go
R version 3.4.0 (2017-04-21) -- "You Stupid Darkness"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(lodown)
> library(dplyr)
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
> library(haven)
>
>
>
> fn <- "https://dataverse.harvard.edu/api/access/datafile/2409658?gbrecs=true"
> tf <- tempfile()
>
> cachaca( fn , tf , mode = 'wb' )
'https://dataverse.harvard.edu/api/access/datafile/2409658?gbrecs=true'
cached in
'C:/Users/gjacob/AppData/Local/Temp/09a348b959d021e507ae015e7d4c9947.Rcache'
copying to
'C:\Users\gjacob\AppData\Local\Temp\RtmpABKMC7\file1664663210a5'
>
> unzip( tf, exdir = tempdir() )
>
> anthropometry <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/anthropometry.dta" ) )
> test0 <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/s01a0mv3.dta") )
> test1 <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/s00a1fv3.dta") )
> test2 <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/s00a2fv3.dta") )
> test3 <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/s00a3fv3.dta") )
>
> create_pid <- function( hid , pn ) {
+ first_part <- hid
+ secnd_part <- stringr::str_pad( pn , pad = "0" , width = 2)
+ paste0( first_part , secnd_part )
+ }
>
> anthropometry[,] <- apply( anthropometry[,] , 2 , function(x) as.numeric(x) )
> colnames(anthropometry) <- tolower(colnames(anthropometry))
> anthropometry$mid<- as.numeric(create_pid(anthropometry$hid , anthropometry$mc))
> test0[,] <- apply( test0[,] , 2 , function(x) as.numeric(x) )
> test1[,] <- apply( test1[,] , 2 , function(x) as.numeric(x) )
> test2[,] <- apply( test2[,] , 2 , function(x) as.numeric(x) )
>
> mothers <- anthropometry[ !duplicated( anthropometry$mid),] %>% select( hid, psu , mid , gov , mhght , mwght , mresult)
> mothers <- inner_join( mothers , test0 %>% select( pid , s01aq07 ) , by = c("mid" = "pid") )
> mothers <- mothers %>% filter( mresult == 1 )
> # mothers <- mothers %>% filter( s01aq07 >= 18 )
>
> mothers <- mothers %>% inner_join( . , test1[ , c( "hid" , "strata" , "weight95" , "expand95" )] , "hid")
>
> mothers <- mothers %>%
+ mutate( area = case_when(
+ strata == 1 ~ "metlo" ,
+ gov %in% 5:12 ~ "nmetlo" ) )
>
> mothers %>%
+ group_by(area) %>% summarize( count = n() )
# A tibble: 3 × 2
area count
<chr> <int>
1 metlo 107
2 nmetlo 452
3 <NA> 587
>
> mothers <- mothers %>% mutate( bmi = mwght / (mhght/100)^2 )
> # mothers <- mothers %>% mutate( bmi = round( bmi , 2) )
>
> mothers <- mothers %>% mutate( nstat = cut( bmi ,
+ breaks = c( 0, 18.5 , 25 , 30 , 35 , 40 , Inf ) ,
+ #labels = c( "underweight" , "normal range" , "overweight" , "obese i" , "obese ii" , "obese iii" ) ,
+ ordered = TRUE ,
+ include.lowest = TRUE ,
+ right = FALSE ) )
> mothers <- mothers %>% mutate( nstat = cut( bmi ,
+ breaks = c( 0, 25 , 30 , 35 , 40 , Inf ) ,
+ labels = c( "not overweight" , "overweight" , "obese i" , "obese ii" , "obese iii" ) ,
+ ordered = TRUE ,
+ include.lowest = TRUE ,
+ right = FALSE ) )
>
> mothers$nstat <- ordered( mothers$nstat , levels = rev( levels(mothers$nstat ) ) )
>
> mothers %>% group_by( area , nstat ) %>% summarise( result = n() )
Source: local data frame [15 x 3]
Groups: area [?]
area nstat result
<chr> <ord> <int>
1 metlo obese iii 8
2 metlo obese ii 13
3 metlo obese i 26
4 metlo overweight 41
5 metlo not overweight 19
6 nmetlo obese iii 19
7 nmetlo obese ii 47
8 nmetlo obese i 99
9 nmetlo overweight 134
10 nmetlo not overweight 153
11 <NA> obese iii 11
12 <NA> obese ii 30
13 <NA> obese i 73
14 <NA> overweight 179
15 <NA> not overweight 294
> mothers %>% group_by( area , nstat ) %>% summarise( result = n() ) %>%
+ inner_join( . , mothers %>% group_by( area ) %>% summarise( total = n() ) , by = "area" ) %>%
+ mutate( prop = result / total , cs = cumsum(prop) )
Source: local data frame [10 x 6]
Groups: area [2]
area nstat result total prop cs
<chr> <ord> <int> <int> <dbl> <dbl>
1 metlo obese iii 8 107 0.07476636 0.07476636
2 metlo obese ii 13 107 0.12149533 0.19626168
3 metlo obese i 26 107 0.24299065 0.43925234
4 metlo overweight 41 107 0.38317757 0.82242991
5 metlo not overweight 19 107 0.17757009 1.00000000
6 nmetlo obese iii 19 452 0.04203540 0.04203540
7 nmetlo obese ii 47 452 0.10398230 0.14601770
8 nmetlo obese i 99 452 0.21902655 0.36504425
9 nmetlo overweight 134 452 0.29646018 0.66150442
10 nmetlo not overweight 153 452 0.33849558 1.00000000
> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8.1 x64 (build 9600)
Matrix products: default
locale:
[1] LC_COLLATE=Portuguese_Brazil.1252 LC_CTYPE=Portuguese_Brazil.1252
[3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C
[5] LC_TIME=Portuguese_Brazil.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] bindrcpp_0.1 haven_1.0.0 dplyr_0.5.0.9004 lodown_0.1.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.10 digest_0.6.12 assertthat_0.2.0 R6_2.2.0
[5] magrittr_1.5 rlang_0.0.0.9018 stringi_1.1.5 tools_3.4.0
[9] stringr_1.2.0 readr_1.1.0 glue_1.0.0 hms_0.3
[13] compiler_3.4.0 pkgconfig_2.0.1 bindr_0.1 tibble_1.3.0
hey, sorry, this one seems near-impossible? next step seems like e-mailing the authors with a description of what you're doing and asking them for just enough code (in any language) to reproduce any of the numbers in table 1? sorry
Not income. Made pointless by c87cde0d66823191c0e050158d0820024a955232
not income for sure. is this for multidimensional measures or the svylorenz curve?
https://aura.abdn.ac.uk/bitstream/handle/2164/12086/DP_16_2.pdf?sequence=1&isAllowed=y
I think it was for ordinal inequality comparisons (dominance). But it is hard to replicate, and we would probably be better off using simulations. Also, there are not many studies on statistical inference in these applications.
Even if we did that, it would be multidimensional and i think it exceeds convey's focus on income. I'd rather keep this closed until we make a proper decision on if and how to handle multidimensional measures.
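For reference, a rough sketch of what a first-order dominance check over ordered categories could look like, using the unweighted counts printed earlier in this thread. This is only an illustration of the general idea under that assumption; it ignores the survey weights and is not necessarily the comparison used in the paper.

```r
# counts by category copied from the tables earlier in this thread,
# ordered from lowest to highest BMI category
categories <- c("not overweight", "overweight", "obese i", "obese ii", "obese iii")
metlo  <- c(19, 41, 26, 13, 8)
nmetlo <- c(153, 134, 99, 47, 19)

# cumulative shares over the ordered categories (empirical CDFs)
cdf_metlo  <- cumsum(metlo)  / sum(metlo)
cdf_nmetlo <- cumsum(nmetlo) / sum(nmetlo)

# metlo first-order dominates nmetlo if its CDF is never above nmetlo's,
# i.e. metlo always has at least as much mass in the higher categories
metlo_dominates <- all(cdf_metlo <= cdf_nmetlo)

data.frame(categories, cdf_metlo, cdf_nmetlo)
metlo_dominates
```

A simulation study would still be needed to say anything about sampling variability, since this compares point estimates only.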
Problem is: can you hit the number on page 28 of http://aura.abdn.ac.uk/bitstream/handle/2164/5764/DP_16_2.pdf?sequence=1&isAllowed=y
I think this has something to do with how i calculate BMI or how i cut it.
Thanks!
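On the cutting question, a small sketch of how the `right` argument of `cut()` and rounding the BMI first can each move boundary cases into a different category. The BMI values below are made up to sit on or near the breaks; they are not from the survey.

```r
# hypothetical BMI values on or near the category boundaries
bmi <- c(24.996, 25, 29.999, 30, 34.999)

breaks <- c(0, 25, 30, 35, 40, Inf)
labels <- c("not overweight", "overweight", "obese i", "obese ii", "obese iii")

# left-closed intervals [a, b), as in the script: exactly 25 is "overweight"
left_closed <- cut(bmi, breaks = breaks, labels = labels,
                   right = FALSE, include.lowest = TRUE)

# right-closed intervals (a, b]: exactly 25 is "not overweight"
right_closed <- cut(bmi, breaks = breaks, labels = labels,
                    right = TRUE, include.lowest = TRUE)

# rounding to 2 decimals before cutting pushes 24.996 up to 25, and so on
rounded <- cut(round(bmi, 2), breaks = breaks, labels = labels,
               right = FALSE, include.lowest = TRUE)

# side-by-side view of how each choice classifies the same values
data.frame(bmi, left_closed, right_closed, rounded)
```

If the paper's counts differ only by a handful of people per category, comparing these variants against the published table is a cheap first check.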