ipeaGIT / geobr

Easy access to official spatial data sets of Brazil in R and Python
https://ipeagit.github.io/geobr/

add schools data set from INEP #190

Open rafapereirabr opened 3 years ago

rafapereirabr commented 3 years ago

Data available at http://portal.inep.gov.br/web/guest/dados/catalogo-de-escolas

rafapereirabr commented 3 years ago

This is a rather simple data set to include in the geobr package. The challenge here is in translating the column names. So far, this is what I'm proposing, but please feel free to add your suggestions. Perhaps this is something @schmert could give us a hand with. Carl has helped us many times so that geobr does not get lost in translation.

dplyr::select(df,
              abbrev_state = 'UF',
              name_muni = 'Município',
              code_school = 'Código INEP',
              name_school = 'Escola',
              education_level = 'Etapas e Modalidade de Ensino Oferecidas',
              education_others = 'Outras Ofertas Educacionais',
              admin_category = 'Categoria Administrativa',
              address = 'Endereço',
              phone_number = 'Telefone',
              government_level = 'Dependência Administrativa',
              private_school_type = 'Categoria Escola Privada',
              conveniada_governo = 'Conveniada Poder Público',
              regulated_education_counsel = 'Regulamentação pelo Conselho de Educação',
              service_restriction ='Restrição de Atendimento',
              size = 'Porte da Escola',
              urban = 'Localização',
              location = 'Localidade Diferenciada',
              y = 'Latitude',
              x = 'Longitude'
)
schmert commented 3 years ago

Translating administrative jargon is tough. Here are some suggestions and questions to consider.

Carl

dplyr::select(df,
          abbrev_state = 'UF',
          name_muni = 'Município',
          code_school = 'Código INEP',
          name_school = 'Escola',
          education_level = 'Etapas e Modalidade de Ensino Oferecidas',

          education_others = 'Outras Ofertas Educacionais',  # nontraditional_education? Does this mean adult and/or vocational, for example?

          admin_category = 'Categoria Administrativa',
          address = 'Endereço',
          phone_number = 'Telefone',
          government_level = 'Dependência Administrativa',
          private_school_type = 'Categoria Escola Privada',

          conveniada_governo = 'Conveniada Poder Público',  # private_government_partnership? Does this mean a private school receiving government funds?

          regulated_education_counsel = 'Regulamentação pelo Conselho de Educação',  # Is this a document number? A yes/no category? In any case it would be Council with an i.

          service_restriction = 'Restrição de Atendimento',  # restricted_services? For example, inaccessible to wheelchair users?

          size = 'Porte da Escola',
          urban = 'Localização',  # urban is OK if this is a 1/0 dummy for urban. Otherwise I suggest location_type.

          location = 'Localidade Diferenciada',  # You all know best. "location" is very generic. Too generic?
          y = 'Latitude',
          x = 'Longitude'
)

rafapereirabr commented 3 years ago

Thanks for your help, Carl. I'll comment on each question below:

education_others = 'Outras Ofertas Educacionais': I'm not entirely sure what these categories mean, to be honest. In any case, I believe the column name should be education_level_others to make it clear this is a complement to the column education_level.

conveniada_governo = 'Conveniada Poder Público': Yes, this means a private institution that receives government funds. I like your suggestion to translate this column as private_government_partnership.

regulated_education_counsel = 'Regulamentação pelo Conselho de Educação': This indicates whether the school is formally overseen by a board of the city education council. So yes, it involves a formal document. The response categories are yes, no, and in progress. Thanks for the heads-up on the typo.

urban = 'Localização': The response categories are Urbano and Rural, so we can convert it into a dummy. This way we could use location_type as the name of the next column, Localidade Diferenciada.
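
To make the renaming and the dummy conversion concrete, here is a minimal sketch on a toy table. It is not the actual geobr processing script: the two rows are made up, the placeholder values are purely illustrative, and the final names (including the shorter priv_govt_partnership and the corrected spelling of council) are only the ones floated in this thread. The 'Urbano' label is taken from the comment above and may be spelled differently in the raw INEP file.

```r
library(dplyr)

# Toy stand-in for the INEP table: made-up rows, only the columns discussed here
df <- data.frame(
  `Outras Ofertas Educacionais` = c("placeholder A", "placeholder B"),
  `Conveniada Poder Público` = c("placeholder A", "placeholder B"),
  `Regulamentação pelo Conselho de Educação` = c("placeholder A", "placeholder B"),
  `Localização` = c("Urbano", "Rural"),
  `Localidade Diferenciada` = c("placeholder A", "placeholder B"),
  check.names = FALSE  # keep the original Portuguese column names untouched
)

# Rename using the names discussed in this thread (assumed, not final)
schools <- dplyr::select(df,
              education_level_others = 'Outras Ofertas Educacionais',
              priv_govt_partnership = 'Conveniada Poder Público',
              regulated_education_council = 'Regulamentação pelo Conselho de Educação',
              urban = 'Localização',
              location_type = 'Localidade Diferenciada'
)

# Convert the Urbano/Rural categories into a 1/0 dummy (1 = urban)
schools$urban <- as.integer(schools$urban == 'Urbano')
```

Storing the dummy as an integer keeps the column easy to sum or average; a logical column would work just as well if that reads better downstream.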

schmert commented 3 years ago

I wasn’t quite explicit enough: Council, with ci, not se. Also, maybe consider priv_govt_partnership as a slightly less cumbersome name.

Abraços,

Carl

rafapereirabr commented 3 years ago

Thanks again, Carl. The data set has been processed and is available on our server. My next push will add the new read_schools() function to the dev version of geobr and close this issue. I've also finally included you as a contributor to geobr in the package DESCRIPTION.

# update the dev version with latest features
utils::remove.packages('geobr')
devtools::install_github("ipeaGIT/geobr", subdir = "r-package")

library(geobr)
sc <- read_schools()
JoaoCarabetta commented 3 years ago

Add it to python

lgabs commented 3 years ago

Adding data for schools was a really nice idea! @JoaoCarabetta, I'm new to this package, but from what I've seen here, read_schools is so far only available in the R package. Is that right?

JoaoCarabetta commented 3 years ago

Hi @lgabs, we haven't implemented it in the Python version yet, but we are working on it. If you want, you can open a PR to add it.