ajdamico / convey

variance of distribution measures estimation of survey data
GNU General Public License v3.0

match BMI distributions on pdf page 28 #265

Closed ajdamico closed 3 years ago

ajdamico commented 7 years ago

The problem: can you match the numbers on page 28 of http://aura.abdn.ac.uk/bitstream/handle/2164/5764/DP_16_2.pdf?sequence=1&isAllowed=y

I think the mismatch has something to do with how I calculate BMI or how I cut it into categories.

Thanks!

library(lodown)
library(dplyr)
library(haven)

fn <- "https://dataverse.harvard.edu/api/access/datafile/2409658?gbrecs=true"
tf <- tempfile()

cachaca( fn , tf , mode = 'wb' )

unzip( tf, exdir = tempdir() )

anthropometry <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/anthropometry.dta" ) )
test0 <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/s01a0mv3.dta") )
test1 <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/s00a1fv3.dta") )
test2 <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/s00a2fv3.dta") )
test3 <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/s00a3fv3.dta") )

create_pid <- function( hid , pn ) {
  first_part <- hid
  secnd_part <- stringr::str_pad( pn , pad = "0" , width = 2)
  paste0( first_part , secnd_part )
}

anthropometry[,] <- apply( anthropometry[,] , 2 , function(x) as.numeric(x) )
colnames(anthropometry) <- tolower(colnames(anthropometry))
anthropometry$mid<- as.numeric(create_pid(anthropometry$hid , anthropometry$mc))
test0[,] <- apply( test0[,] , 2 , function(x) as.numeric(x) )
test1[,] <- apply( test1[,] , 2 , function(x) as.numeric(x) )
test2[,] <- apply( test2[,] , 2 , function(x) as.numeric(x) )

mothers <- anthropometry[ !duplicated( anthropometry$mid),] %>% select( hid, psu , mid , gov , mhght , mwght , mresult)
mothers <- inner_join( mothers , test0 %>% select( pid , s01aq07 ) , by = c("mid" = "pid") )
mothers <- mothers %>% filter( mresult == 1 )
# mothers <- mothers %>% filter( s01aq07 >= 18 )

mothers <- mothers %>% inner_join( . , test1[ , c( "hid" , "strata" , "weight95" , "expand95" )] , "hid")

mothers <- mothers %>% 
  mutate( area = case_when( 
    strata == 1 ~ "metlo" , 
    gov %in% 5:12 ~ "nmetlo" ) )

mothers %>% 
  group_by(area) %>% summarize( count = n() )

mothers <- mothers %>% mutate( bmi = mwght / (mhght/100)^2 )
# mothers <- mothers %>% mutate( bmi = round( bmi , 2) )

mothers <- mothers %>% mutate( nstat = cut( bmi , 
    breaks = c( 0, 18.5 , 25 , 30 , 35 , 40 , Inf ) ,
    #labels = c( "underweight" , "normal range" , "overweight" , "obese i" , "obese ii" , "obese iii" ) ,
    ordered = TRUE ,
    include.lowest = TRUE , 
    right = FALSE ) )
mothers <- mothers %>% mutate( nstat = cut( bmi , 
    breaks = c( 0, 25 , 30 , 35 , 40 , Inf ) ,
    labels = c( "not overweight" , "overweight" , "obese i" , "obese ii" , "obese iii" ) ,
    ordered = TRUE ,
    include.lowest = TRUE , 
    right = FALSE ) )

mothers$nstat <- ordered( mothers$nstat , levels = rev( levels(mothers$nstat ) ) )

mothers %>% group_by( area , nstat ) %>% summarise( result = n() )
mothers %>% group_by( area , nstat ) %>% summarise( result = n() ) %>% 
  inner_join( . , mothers %>% group_by( area ) %>% summarise( total = n() ) , by = "area" ) %>% 
  mutate( prop = result / total , cs = cumsum(prop) )
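
Since the suspicion above is about how BMI is cut, here is a minimal sketch of how the `right` argument of `cut()` moves values that sit exactly on a break. The BMI values are toy numbers chosen for illustration, not survey data, and nothing here says which convention the paper used.

```r
# Toy BMI values placed on or next to the category breaks (illustrative only)
bmi <- c( 18.5 , 24.99 , 25 , 30 , 40 )
breaks <- c( 0 , 18.5 , 25 , 30 , 35 , 40 , Inf )

# right = FALSE gives left-closed intervals [a, b): a BMI of exactly 25 lands in the 25-30 bin
left_closed <- cut( bmi , breaks = breaks , right = FALSE , include.lowest = TRUE )

# right = TRUE gives right-closed intervals (a, b]: a BMI of exactly 25 stays in the 18.5-25 bin
right_closed <- cut( bmi , breaks = breaks , right = TRUE , include.lowest = TRUE )

# side-by-side view of where each value falls under the two conventions
data.frame( bmi , left_closed , right_closed )
```

Only rows whose BMI equals a break change bins, so this matters mostly if BMI is rounded before cutting. Note also that in the script above the second `cut()` call overwrites the `nstat` produced by the first one.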

ajdamico commented 7 years ago

if i copy and paste this script into a console, it breaks.. could you edit the script above and assign back to me? thanks

guilhermejacob commented 7 years ago

weird :|


R version 3.4.0 (2017-04-21) -- "You Stupid Darkness"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(lodown)
> library(dplyr)

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

> library(haven)
> 
> 
> 
> fn <- "https://dataverse.harvard.edu/api/access/datafile/2409658?gbrecs=true"
> tf <- tempfile()
> 
> cachaca( fn , tf , mode = 'wb' )
'https://dataverse.harvard.edu/api/access/datafile/2409658?gbrecs=true'

cached in

'C:/Users/user/AppData/Local/Temp/09a348b959d021e507ae015e7d4c9947.Rcache'

copying to

'C:\Users\user\AppData\Local\Temp\RtmpSctQJ4\file17d445ae3292'

> 
> unzip( tf, exdir = tempdir() )
> 
> anthropometry <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/anthropometry.dta" ) )
> test0 <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/s01a0mv3.dta") )
> test1 <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/s00a1fv3.dta") )
> test2 <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/s00a2fv3.dta") )
> test3 <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/s00a3fv3.dta") )
> 
> create_pid <- function( hid , pn ) {
+   first_part <- hid
+   secnd_part <- stringr::str_pad( pn , pad = "0" , width = 2)
+   paste0( first_part , secnd_part )
+ }
> 
> anthropometry[,] <- apply( anthropometry[,] , 2 , function(x) as.numeric(x) )
> colnames(anthropometry) <- tolower(colnames(anthropometry))
> anthropometry$mid<- as.numeric(create_pid(anthropometry$hid , anthropometry$mc))
> test0[,] <- apply( test0[,] , 2 , function(x) as.numeric(x) )
> test1[,] <- apply( test1[,] , 2 , function(x) as.numeric(x) )
> test2[,] <- apply( test2[,] , 2 , function(x) as.numeric(x) )
> 
> mothers <- anthropometry[ !duplicated( anthropometry$mid),] %>% select( hid, psu , mid , gov , mhght , mwght , mresult)
> mothers <- inner_join( mothers , test0 %>% select( pid , s01aq07 ) , by = c("mid" = "pid") )
> mothers <- mothers %>% filter( mresult == 1 )
> # mothers <- mothers %>% filter( s01aq07 >= 18 )
> 
> mothers <- mothers %>% inner_join( . , test1[ , c( "hid" , "strata" , "weight95" , "expand95" )] , "hid")
> 
> mothers <- mothers %>% 
+   mutate( area = case_when( 
+ strata == 1 ~ "metlo" , 
+ gov %in% 5:12 ~ "nmetlo" ) )
> 
> mothers %>% 
+   group_by(area) %>% summarize( count = n() )
# A tibble: 3 × 2
    area count
   <chr> <int>
1  metlo   107
2 nmetlo   452
3   <NA>   587
> 
> mothers <- mothers %>% mutate( bmi = mwght / (mhght/100)^2 )
> # mothers <- mothers %>% mutate( bmi = round( bmi , 2) )
> 
> mothers <- mothers %>% mutate( nstat = cut( bmi , 
+ breaks = c( 0, 18.5 , 25 , 30 , 35 , 40 , Inf ) ,
+ #labels = c( "underweight" , "normal range" , "overweight" , "obese i" , "obese ii" , "obese iii" ) ,
+ ordered = TRUE ,
+ include.lowest = TRUE , 
+ right = FALSE ) )
> mothers <- mothers %>% mutate( nstat = cut( bmi , 
+ breaks = c( 0, 25 , 30 , 35 , 40 , Inf ) ,
+ labels = c( "not overweight" , "overweight" , "obese i" , "obese ii" , "obese iii" ) ,
+ ordered = TRUE ,
+ include.lowest = TRUE , 
+ right = FALSE ) )
> 
> mothers$nstat <- ordered( mothers$nstat , levels = rev( levels(mothers$nstat ) ) )
> 
> mothers %>% group_by( area , nstat ) %>% summarise( result = n() )
Source: local data frame [15 x 3]
Groups: area [?]

     area          nstat result
    <chr>          <ord>  <int>
1   metlo      obese iii      8
2   metlo       obese ii     13
3   metlo        obese i     26
4   metlo     overweight     41
5   metlo not overweight     19
6  nmetlo      obese iii     19
7  nmetlo       obese ii     47
8  nmetlo        obese i     99
9  nmetlo     overweight    134
10 nmetlo not overweight    153
11   <NA>      obese iii     11
12   <NA>       obese ii     30
13   <NA>        obese i     73
14   <NA>     overweight    179
15   <NA> not overweight    294
> mothers %>% group_by( area , nstat ) %>% summarise( result = n() ) %>% 
+   inner_join( . , mothers %>% group_by( area ) %>% summarise( total = n() ) , by = "area" ) %>% 
+   mutate( prop = result / total , cs = cumsum(prop) )
Source: local data frame [10 x 6]
Groups: area [2]

     area          nstat result total       prop         cs
    <chr>          <ord>  <int> <int>      <dbl>      <dbl>
1   metlo      obese iii      8   107 0.07476636 0.07476636
2   metlo       obese ii     13   107 0.12149533 0.19626168
3   metlo        obese i     26   107 0.24299065 0.43925234
4   metlo     overweight     41   107 0.38317757 0.82242991
5   metlo not overweight     19   107 0.17757009 1.00000000
6  nmetlo      obese iii     19   452 0.04203540 0.04203540
7  nmetlo       obese ii     47   452 0.10398230 0.14601770
8  nmetlo        obese i     99   452 0.21902655 0.36504425
9  nmetlo     overweight    134   452 0.29646018 0.66150442
10 nmetlo not overweight    153   452 0.33849558 1.00000000
> 
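
The `<NA>` area group in the transcript above (587 rows) follows from how `case_when()` works: rows that match none of the conditions get `NA`. A minimal sketch with made-up `strata` and `gov` values, not taken from the survey:

```r
library(dplyr)

# Toy rows: the strata and gov codes are invented for illustration
toy <- data.frame( strata = c( 1 , 2 , 2 , 3 ) , gov = c( 1 , 5 , 12 , 20 ) )

toy <- toy %>%
  mutate( area = case_when(
    strata == 1 ~ "metlo" ,
    gov %in% 5:12 ~ "nmetlo" ) )

# the last row matches neither condition, so its area is NA
table( toy$area , useNA = "ifany" )
```

Whether those unmatched rows belong in one of the paper's groups is an open question here. The sketch only shows why they fall out of the two named areas.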

ajdamico commented 7 years ago

rerun and give your sessionInfo() at the end and assign back to me? thanks

guilhermejacob commented 7 years ago

there you go


R version 3.4.0 (2017-04-21) -- "You Stupid Darkness"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(lodown)
> library(dplyr)

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

> library(haven)
> 
> 
> 
> fn <- "https://dataverse.harvard.edu/api/access/datafile/2409658?gbrecs=true"
> tf <- tempfile()
> 
> cachaca( fn , tf , mode = 'wb' )
'https://dataverse.harvard.edu/api/access/datafile/2409658?gbrecs=true'

cached in

'C:/Users/gjacob/AppData/Local/Temp/09a348b959d021e507ae015e7d4c9947.Rcache'

copying to

'C:\Users\gjacob\AppData\Local\Temp\RtmpABKMC7\file1664663210a5'

> 
> unzip( tf, exdir = tempdir() )
> 
> anthropometry <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/anthropometry.dta" ) )
> test0 <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/s01a0mv3.dta") )
> test1 <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/s00a1fv3.dta") )
> test2 <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/s00a2fv3.dta") )
> test3 <- read_dta( paste0( tempdir() , "/eihs 1997/EIHS 1997 Household Survey/EIHS Data/stata/s00a3fv3.dta") )
> 
> create_pid <- function( hid , pn ) {
+   first_part <- hid
+   secnd_part <- stringr::str_pad( pn , pad = "0" , width = 2)
+   paste0( first_part , secnd_part )
+ }
> 
> anthropometry[,] <- apply( anthropometry[,] , 2 , function(x) as.numeric(x) )
> colnames(anthropometry) <- tolower(colnames(anthropometry))
> anthropometry$mid<- as.numeric(create_pid(anthropometry$hid , anthropometry$mc))
> test0[,] <- apply( test0[,] , 2 , function(x) as.numeric(x) )
> test1[,] <- apply( test1[,] , 2 , function(x) as.numeric(x) )
> test2[,] <- apply( test2[,] , 2 , function(x) as.numeric(x) )
> 
> mothers <- anthropometry[ !duplicated( anthropometry$mid),] %>% select( hid, psu , mid , gov , mhght , mwght , mresult)
> mothers <- inner_join( mothers , test0 %>% select( pid , s01aq07 ) , by = c("mid" = "pid") )
> mothers <- mothers %>% filter( mresult == 1 )
> # mothers <- mothers %>% filter( s01aq07 >= 18 )
> 
> mothers <- mothers %>% inner_join( . , test1[ , c( "hid" , "strata" , "weight95" , "expand95" )] , "hid")
> 
> mothers <- mothers %>% 
+   mutate( area = case_when( 
+ strata == 1 ~ "metlo" , 
+ gov %in% 5:12 ~ "nmetlo" ) )
> 
> mothers %>% 
+   group_by(area) %>% summarize( count = n() )
# A tibble: 3 × 2
    area count
   <chr> <int>
1  metlo   107
2 nmetlo   452
3   <NA>   587
> 
> mothers <- mothers %>% mutate( bmi = mwght / (mhght/100)^2 )
> # mothers <- mothers %>% mutate( bmi = round( bmi , 2) )
> 
> mothers <- mothers %>% mutate( nstat = cut( bmi , 
+ breaks = c( 0, 18.5 , 25 , 30 , 35 , 40 , Inf ) ,
+ #labels = c( "underweight" , "normal range" , "overweight" , "obese i" , "obese ii" , "obese iii" ) ,
+ ordered = TRUE ,
+ include.lowest = TRUE , 
+ right = FALSE ) )
> mothers <- mothers %>% mutate( nstat = cut( bmi , 
+ breaks = c( 0, 25 , 30 , 35 , 40 , Inf ) ,
+ labels = c( "not overweight" , "overweight" , "obese i" , "obese ii" , "obese iii" ) ,
+ ordered = TRUE ,
+ include.lowest = TRUE , 
+ right = FALSE ) )
> 
> mothers$nstat <- ordered( mothers$nstat , levels = rev( levels(mothers$nstat ) ) )
> 
> mothers %>% group_by( area , nstat ) %>% summarise( result = n() )
Source: local data frame [15 x 3]
Groups: area [?]

     area          nstat result
    <chr>          <ord>  <int>
1   metlo      obese iii      8
2   metlo       obese ii     13
3   metlo        obese i     26
4   metlo     overweight     41
5   metlo not overweight     19
6  nmetlo      obese iii     19
7  nmetlo       obese ii     47
8  nmetlo        obese i     99
9  nmetlo     overweight    134
10 nmetlo not overweight    153
11   <NA>      obese iii     11
12   <NA>       obese ii     30
13   <NA>        obese i     73
14   <NA>     overweight    179
15   <NA> not overweight    294
> mothers %>% group_by( area , nstat ) %>% summarise( result = n() ) %>% 
+   inner_join( . , mothers %>% group_by( area ) %>% summarise( total = n() ) , by = "area" ) %>% 
+   mutate( prop = result / total , cs = cumsum(prop) )
Source: local data frame [10 x 6]
Groups: area [2]

     area          nstat result total       prop         cs
    <chr>          <ord>  <int> <int>      <dbl>      <dbl>
1   metlo      obese iii      8   107 0.07476636 0.07476636
2   metlo       obese ii     13   107 0.12149533 0.19626168
3   metlo        obese i     26   107 0.24299065 0.43925234
4   metlo     overweight     41   107 0.38317757 0.82242991
5   metlo not overweight     19   107 0.17757009 1.00000000
6  nmetlo      obese iii     19   452 0.04203540 0.04203540
7  nmetlo       obese ii     47   452 0.10398230 0.14601770
8  nmetlo        obese i     99   452 0.21902655 0.36504425
9  nmetlo     overweight    134   452 0.29646018 0.66150442
10 nmetlo not overweight    153   452 0.33849558 1.00000000
> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8.1 x64 (build 9600)

Matrix products: default

locale:
[1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252   
[3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C                      
[5] LC_TIME=Portuguese_Brazil.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bindrcpp_0.1     haven_1.0.0      dplyr_0.5.0.9004 lodown_0.1.0    

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.10     digest_0.6.12    assertthat_0.2.0 R6_2.2.0        
 [5] magrittr_1.5     rlang_0.0.0.9018 stringi_1.1.5    tools_3.4.0     
 [9] stringr_1.2.0    readr_1.1.0      glue_1.0.0       hms_0.3         
[13] compiler_3.4.0   pkgconfig_2.0.1  bindr_0.1        tibble_1.3.0     
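
The script joins `weight95` and `expand95` onto `mothers` but the proportions above are plain counts. As a hedged sketch only, this is how weighted and unweighted shares can differ. The data frame and the weight values are invented, and treating `expand95` as an expansion factor is an assumption based on its name; whether the paper's table is weighted is not known from this thread.

```r
library(dplyr)

# Toy data: categories and weights are made up for illustration
toy <- data.frame(
  area = c( "metlo" , "metlo" , "metlo" , "nmetlo" , "nmetlo" ) ,
  nstat = c( "overweight" , "obese i" , "overweight" , "overweight" , "obese i" ) ,
  expand95 = c( 100 , 300 , 100 , 50 , 150 ) )

weighted <- toy %>%
  group_by( area , nstat ) %>%
  # per cell: number of rows and sum of the (assumed) expansion weights
  summarise( n = n() , wtd = sum( expand95 ) ) %>%
  group_by( area ) %>%
  # shares within each area, unweighted and weighted
  mutate( prop_unweighted = n / sum( n ) , prop_weighted = wtd / sum( wtd ) ) %>%
  ungroup()

weighted
```

For design-based standard errors, a `survey` design object would be the usual route. This sketch only compares point estimates.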

ajdamico commented 7 years ago

hey, sorry, this one seems near-impossible? next step seems like e-mailing the authors with a description of what you're doing and asking them for just enough code (in any language) to reproduce any of the numbers in table 1? sorry

guilhermejacob commented 3 years ago

Not income. Made pointless by c87cde0d66823191c0e050158d0820024a955232

ajdamico commented 3 years ago

not income for sure. is this for multi-dimensional measures or the svylorenz curve?

https://aura.abdn.ac.uk/bitstream/handle/2164/12086/DP_16_2.pdf?sequence=1&isAllowed=y

guilhermejacob commented 3 years ago

I think it was for ordinal inequality comparisons (dominance). But it is hard to replicate, and we would probably be better off using simulations. Also, there has not been much study of statistical inference in these applications.

Even if we did that, it would be multidimensional, and I think that goes beyond convey's focus on income. I'd rather keep this closed until we make a proper decision on whether and how to handle multidimensional measures.
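
For readers unfamiliar with the dominance idea mentioned above, here is a rough sketch of a descriptive check using the unweighted counts printed earlier in this thread. It is not the paper's method and carries no statistical inference. It only compares cumulative shares category by category.

```r
# Counts copied from the transcript in this thread,
# ordered from "obese iii" down to "not overweight"
categories <- c( "obese iii" , "obese ii" , "obese i" , "overweight" , "not overweight" )
metlo <- c( 8 , 13 , 26 , 41 , 19 )
nmetlo <- c( 19 , 47 , 99 , 134 , 153 )

# cumulative shares, starting from the highest BMI category
cs_metlo <- cumsum( metlo / sum( metlo ) )
cs_nmetlo <- cumsum( nmetlo / sum( nmetlo ) )

gap <- cs_metlo - cs_nmetlo
data.frame( categories , cs_metlo , cs_nmetlo , gap )

# the last cumulative share is 1 in both groups by construction, so it is dropped
all( head( gap , -1 ) > 0 )
```

With these counts the `metlo` cumulative share is higher at every cutoff, meaning more of that group sits in the higher BMI categories. Whether such a gap is statistically meaningful is exactly the inference question raised above.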