crenteriam / importinegi

R package for downloading and managing open databases from INEGI.

Error in foreign::read.dbf(paste0(zipdir, "\\", i), as.is = TRUE) : unable to open DBF file. SOLVED!!! #1

Open AlfCano opened 3 years ago

AlfCano commented 3 years ago

Hello, I ran these commands:

library ("importinegi")
enoe(year = 2019, trimestre = "trim4", integrar = FALSE)

But I got this:

probando la URL 'http://www.inegi.org.mx/contenidos/programas/enoe/15ymas/microdatos/2019trim4_dbf.zip'
Content type 'application/x-zip-compressed' length 26128286 bytes (24.9 MB)
==================================================
downloaded 24.9 MB

Error in foreign::read.dbf(paste0(zipdir, "\\", i), as.is = TRUE) : 
  unable to open DBF file

I checked https://rdrr.io/cran/foreign/man/read.dbf.html, but only found:

The function reads a DBF file into a data frame, converting character fields to factors, and trying to respect NULL fields. The DBF format is documented but not much adhered to. There is no guarantee this will read all DBF files.

Thank you very much for your attention; any guidance you can provide is highly appreciated.

Alfonso

edmartraps-l15l commented 2 years ago

Piggybacking on this issue. It is still unresolved.

Villiem commented 2 years ago

I fixed it in the enoe branch; the problem is caused by the difference in paths between Windows and Unix systems. https://github.com/Villiem/importinegi/tree/enoe
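
To illustrate the path difference mentioned above, here is a minimal sketch that builds the same file path with a hard-coded `"\\"` separator and with `file.path()`. The directory and file names are hypothetical placeholders, not files shipped by INEGI or by the package.

```r
# Create a throwaway directory with one fake .dbf file in it.
zipdir <- tempfile("enoe_demo_")   # hypothetical extraction directory
dir.create(zipdir)
dbf_name <- "sdemt319.dbf"         # hypothetical file name
file.create(file.path(zipdir, dbf_name))

# Hard-coded Windows separator, as in the failing call.
windows_style <- paste0(zipdir, "\\", dbf_name)

# Portable alternative: file.path() joins with "/", which R accepts
# on Windows as well as on Unix-like systems.
portable <- file.path(zipdir, dbf_name)

file.exists(portable)       # TRUE
file.exists(windows_style)  # FALSE on Linux/macOS, where "\" is not a separator

# list.files(full.names = TRUE) avoids manual joining altogether.
found <- list.files(zipdir, pattern = "\\.dbf$", full.names = TRUE)
```

On Linux the backslash becomes part of the file name, so the reader is asked to open a file that does not exist, which is consistent with the "unable to open DBF file" message.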

AlfCano commented 2 years ago

Update: OK, I've passed "enoe_n_2020_" as the year argument, instead of only the year number:

enoe <- enoe(year = "enoe_n_2020_", trimestre = "trim3", integrar = TRUE)

but I get another error:


probando la URL 'http://www.inegi.org.mx/contenidos/programas/enoe/15ymas/microdatos/enoe_n_2020_trim3_dbf.zip'
Content type 'application/x-zip-compressed' length 22075762 bytes (21.1 MB)
==================================================
downloaded 21.1 MB

Error in foreign::read.dbf(paste0(zipdir, "\\", i), as.is = TRUE) : 
  unable to open DBF file

AlfCano commented 2 years ago

I fixed it in the enoe branch; the problem is caused by the difference in paths between Windows and Unix systems. https://github.com/Villiem/importinegi/tree/enoe

Hello, I installed the version you have in the repository:

devtools::install_github("Villiem/importinegi",ref = "enoe", force = TRUE)

But I got the same result:

> enoe <- enoe(year = 2020, trimestre = "trim3", integrar = TRUE)
probando la URL 'http://www.inegi.org.mx/contenidos/programas/enoe/15ymas/microdatos/2020trim3_dbf.zip'
Content type 'text/html' length 2263 bytes
==================================================
downloaded 2263 bytes

Error in foreign::read.dbf(paste0(zipdir, "\\", i), as.is = TRUE) : 
  unable to open DBF file
Además: Warning message:
In utils::unzip(temp.enoe, exdir = zipdir) :
  error 1 al extraer del archivo zip
Error in foreign::read.dbf(paste0(zipdir, "\\", i), as.is = TRUE) : 
  unable to open DBF file

And with:

> enoe <- enoe(year = "enoe_n_2020_", trimestre = "trim3", integrar = TRUE)

I get:

probando la URL 'http://www.inegi.org.mx/contenidos/programas/enoe/15ymas/microdatos/enoe_n_2020_trim3_dbf.zip'
Content type 'application/x-zip-compressed' length 22075762 bytes (21.1 MB)
==================================================
downloaded 21.1 MB

Error in foreign::read.dbf(paste0(zipdir, "\\", i), as.is = TRUE) : 
  unable to open DBF file

Thanks for the work you have done.

Villiem commented 2 years ago

Thanks for installing my version; unfortunately, the version you are running is not mine. My version does not use foreign::read.dbf(paste0(zipdir, "\\", i), as.is = TRUE)

In fact, it uses the rio package (rio::import), which lets you import files with different extensions.

Try uninstalling the old one with remove.packages('importinegi') and then reinstalling with devtools::install_github("Villiem/importinegi")

AlfCano commented 2 years ago

Hello, thanks for the quick reply... I have now run both versions I found in the repository, "master" and "enoe". With the one you mentioned, I got the following:

> devtools::install_github("Villiem/importinegi")
Downloading GitHub repo Villiem/importinegi@HEAD
✔  checking for file ‘/tmp/RtmpgEGobv/remotes330db7e9b3e41/Villiem-importinegi-01022c2/DESCRIPTION’ (559ms)
─  preparing ‘importinegi’:
✔  checking DESCRIPTION meta-information
─  checking for LF line-endings in source and make files and shell scripts
─  checking for empty or unneeded directories
   Omitted ‘LazyData’ from DESCRIPTION
─  building ‘importinegi_1.1.3.tar.gz’

Installing package into ‘/home/cano/R/x86_64-pc-linux-gnu-library/4.2’
(as ‘lib’ is unspecified)
* installing *source* package ‘importinegi’ ...
** using staged installation
** R
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (importinegi)

And after loading the new package:

> enoe <- enoe(year = 2020, trimestre = "trim3", integrar = TRUE)
probando la URL 'http://www.inegi.org.mx/contenidos/programas/enoe/15ymas/microdatos/2020trim3_dbf.zip'
Content type 'text/html' length 2263 bytes
==================================================
downloaded 2263 bytes

Error in foreign::read.dbf(paste0(zipdir, "\\", i), as.is = TRUE) : 
  unable to open DBF file
Además: Warning message:
In utils::unzip(temp.enoe, exdir = zipdir) :
  error 1 al extraer del archivo zip
Error in foreign::read.dbf(paste0(zipdir, "\\", i), as.is = TRUE) : 
  unable to open DBF file

But it is an HTML file... So I ran:

enoe <- enoe(year = "enoe_n_2020_", trimestre = "trim3", integrar = TRUE)
probando la URL 'http://www.inegi.org.mx/contenidos/programas/enoe/15ymas/microdatos/enoe_n_2020_trim3_dbf.zip'
Content type 'application/x-zip-compressed' length 22075762 bytes (21.1 MB)
==================================================
downloaded 21.1 MB

Error in foreign::read.dbf(paste0(zipdir, "\\", i), as.is = TRUE) : 
  unable to open DBF file

Thank you very much for your attention. Regards, Alfonso
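
The `Content type 'text/html'` line in the first log above means the server answered with a web page rather than an archive, which is why `utils::unzip` then reports an extraction error. A minimal sketch of a check that could run before unzipping, assuming it is enough to look at the first four bytes of the file; `is_zip_file` is a hypothetical helper and is not part of importinegi.

```r
# Return TRUE when the file starts with the ZIP local-file-header
# signature "PK\003\004" (bytes 50 4B 03 04), as ordinary non-empty
# archives do.
is_zip_file <- function(path) {
  if (!file.exists(path)) return(FALSE)
  con <- file(path, open = "rb")
  on.exit(close(con))
  magic <- readBin(con, what = "raw", n = 4L)
  identical(magic, as.raw(c(0x50, 0x4B, 0x03, 0x04)))
}

# Simulate the two kinds of download seen in the logs.
html_file <- tempfile(fileext = ".zip")
writeLines("<html><body>Not found</body></html>", html_file)

zip_like <- tempfile(fileext = ".zip")
writeBin(as.raw(c(0x50, 0x4B, 0x03, 0x04, 0x14, 0x00)), zip_like)

is_zip_file(html_file)  # FALSE
is_zip_file(zip_like)   # TRUE
```

A caller could stop with a clear message such as "the URL did not return a zip file" when the check fails, instead of continuing to the DBF reader.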

Villiem commented 2 years ago

I am unable to reproduce the error:

> library(importinegi)
> enoe <- enoe(year = 2020, trimestre = "trim3")
probando la URL 'https://www.inegi.org.mx/contenidos/programas/enoe/15ymas/microdatos/enoe_n_2020_trim3_dbf.zip'
Content type 'application/x-zip-compressed' length 22075762 bytes (21.1 MB)

downloaded 21.1 MB

It works just fine.

Can you please post the output of

importinegi::enoe

and tell me your operating system please?

AlfCano commented 2 years ago

Hello, many thanks for your help!!

This is the result of installing the enoe branch:

> install_github("Villiem/importinegi",ref = "enoe")
Downloading GitHub repo Villiem/importinegi@enoe
✔  checking for file ‘/tmp/RtmpgEGobv/remotes330db3a6c62cd/Villiem-importinegi-c7adeb5/DESCRIPTION’ (772ms)
─  preparing ‘importinegi’:
✔  checking DESCRIPTION meta-information
─  checking for LF line-endings in source and make files and shell scripts
─  checking for empty or unneeded directories
   Omitted ‘LazyData’ from DESCRIPTION
─  building ‘importinegi_1.1.3.tar.gz’

Installing package into ‘/home/cano/R/x86_64-pc-linux-gnu-library/4.2’
(as ‘lib’ is unspecified)
* installing *source* package ‘importinegi’ ...
** using staged installation
** R
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (importinegi)

After:

> library(importinegi)
> enoe <- enoe(year = 2020, trimestre = "trim3", integrar = TRUE)
probando la URL 'http://www.inegi.org.mx/contenidos/programas/enoe/15ymas/microdatos/2020trim3_dbf.zip'
Content type 'text/html' length 2263 bytes
==================================================
downloaded 2263 bytes

Error in foreign::read.dbf(paste0(zipdir, "\\", i), as.is = TRUE) : 
  unable to open DBF file
Además: Warning message:
In utils::unzip(temp.enoe, exdir = zipdir) :
  error 1 al extraer del archivo zip
Error in foreign::read.dbf(paste0(zipdir, "\\", i), as.is = TRUE) : 
  unable to open DBF file

This is the output you've requested:

> importinegi::enoe
function (year = NA, trimestre = NA, integrar = FALSE) 
{
    if (is.na(year) & is.na(trimestre)) {
        shell.exec("https://www.inegi.org.mx/programas/enoe/15ymas/")
    }
    fformat = "dbf"
    temp.enoe = tempfile()
    zipdir = tempdir()
    url.base = paste0("http://www.inegi.org.mx/contenidos/programas/enoe/15ymas/microdatos/", 
        year, trimestre, "_", fformat, ".zip")
    utils::download.file(url.base, temp.enoe)
    utils::unzip(temp.enoe, exdir = zipdir)
    list_dataraw = list.files(zipdir, pattern = ".dbf")
    for (i in list_dataraw) {
        Object = foreign::read.dbf(paste0(zipdir, "\\", i), as.is = TRUE)
        assign(paste0("dt.", tools::file_path_sans_ext(i)), Object)
    }
    output = mget(ls(pattern = "dt."))
    if (integrar == TRUE) {
        data.vivienda = get(ls(pattern = "dt\\.viv"))
        data.hogar = get(ls(pattern = "dt\\.hog"))
        data.sdem = get(ls(pattern = "dt\\.sdem"))
        data.coe1 = get(ls(pattern = "dt\\.coe1"))
        data.coe2 = get(ls(pattern = "dt\\.coe2"))
        data.compiled = merge(data.vivienda, data.hogar)
        data.compiled = merge(data.vivienda, data.sdem)
        data.compiled = merge(data.vivienda, data.coe1)
        data.compiled = merge(data.vivienda, data.coe2)
        output = data.compiled
    }
    return(output)
}
<bytecode: 0x5651e7e7e6a0>
<environment: namespace:importinegi>
> 

My OS is: Linux Mint 20.3, Linux: 5.4.0-131-generic

But then, prompted by your remark "I am unable to reproduce the error", I restarted R and tried once more:

library(importinegi)
> importinegi::enoe
function (year = NA, trimestre = NA, integrar = FALSE, formato = "dbf") 
{
    if (is.na(year) & is.na(trimestre)) {
        shell.exec("https://www.inegi.org.mx/programas/enoe/15ymas/")
    }
    temp.enoe = tempfile()
    zipdir = tempdir()
    if (year >= 2020 & !(year == 2020 & trimestre == "trim1")) {
        url.base = paste("https://www.inegi.org.mx/contenidos/programas/enoe/15ymas/microdatos/enoe_n", 
            year, trimestre, paste0(formato, ".zip"), sep = "_")
    }
    else {
        url.base = paste0("http://www.inegi.org.mx/contenidos/programas/enoe/15ymas/microdatos/", 
            year, trimestre, "_", formato, ".zip")
    }
    utils::download.file(url.base, temp.enoe)
    utils::unzip(temp.enoe, exdir = zipdir)
    list_dataraw = list.files(zipdir, pattern = paste0(formato, 
        "$"), full.names = T)
    list_names = basename(tools::file_path_sans_ext(list_dataraw))
    output = lapply(list_dataraw, rio::import)
    names(output) = list_names
    if (integrar == TRUE) {
        output = Reduce(function(x, y) merge(x, y, all = TRUE), 
            output)
    }
    return(output)
}
<bytecode: 0x5556ce3a8e70>
<environment: namespace:importinegi>
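
The URL branching in the function printed above can be isolated as a small pure function, which makes it easy to check against the URLs that appear in the logs. This is a sketch based only on that printed source; `build_enoe_url` is a hypothetical name, not a function exported by the package.

```r
# Rebuild the download URL the same way the printed function does:
# files from 2020 onwards (except 2020 trim1) carry an "enoe_n_" prefix
# and underscores between the parts.
build_enoe_url <- function(year, trimestre, formato = "dbf") {
  path <- "www.inegi.org.mx/contenidos/programas/enoe/15ymas/microdatos/"
  new_scheme <- year >= 2020 && !(year == 2020 && trimestre == "trim1")
  if (new_scheme) {
    paste0("https://", path, "enoe_n_", year, "_", trimestre, "_", formato, ".zip")
  } else {
    paste0("http://", path, year, trimestre, "_", formato, ".zip")
  }
}

build_enoe_url(2020, "trim3")
build_enoe_url(2019, "trim4")
```

The first call yields the `enoe_n_2020_trim3_dbf.zip` address that downloaded correctly in the thread, and the second yields the older `2019trim4_dbf.zip` form.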

There is "rio::import"!!! Now both enoe <- enoe(year = 2020, trimestre = "trim3") and enoe <- enoe(year = 2020, trimestre = "trim3", integrar = TRUE) were successful. THANKS!!! I just had to restart R after reinstalling from your repo. I'm using the enoe branch. Should I change to the "master" one? It is solved!!!
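
Since the fix only took effect after restarting R, the old namespace was apparently still loaded in the running session. A minimal sketch of a check for which version is in memory, assuming that looking for the text `rio::import` in the function source is a good enough marker; `mentions` and the two stand-in functions are hypothetical.

```r
# TRUE when the deparsed source of `fn` contains the literal text `needle`.
mentions <- function(fn, needle) {
  any(grepl(needle, deparse(fn), fixed = TRUE))
}

# Trimmed-down stand-ins for the old and new versions (never called here,
# so neither foreign nor rio needs to be installed).
old_enoe <- function(path) foreign::read.dbf(path, as.is = TRUE)
new_enoe <- function(path) rio::import(path)

mentions(old_enoe, "rio::import")  # FALSE
mentions(new_enoe, "rio::import")  # TRUE

# With the real package one would run, after reinstalling:
#   mentions(importinegi::enoe, "rio::import")
# If that is still FALSE, restart R (or call unloadNamespace("importinegi")
# and then library(importinegi)) so the new code replaces the old one.
```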

Villiem commented 2 years ago

I'm glad it worked.

Should I change to the "master" one?

At the moment, the enoe and master branches are basically the same. If I find any other issue or INEGI decides to change things, I'll make a new branch and push to master. So master is preferred.

Maybe I'll change the trimestre argument to a plain number from 1 to 4 instead of a string ("trim1" to "trim4") and add an option to create a time series out of several quarters. But for now it doesn't matter.
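
One way the numeric quarter idea could be introduced without breaking existing calls is to accept both forms and normalize them. This is only a sketch of that possibility; `normalize_trimestre` is a hypothetical helper and does not exist in the package.

```r
# Accept either a number 1-4 or a string "trim1".."trim4" and always
# return the string form that the download URL expects.
normalize_trimestre <- function(trimestre) {
  if (is.numeric(trimestre)) {
    stopifnot(length(trimestre) == 1L, trimestre %in% 1:4)
    return(paste0("trim", as.integer(trimestre)))
  }
  stopifnot(is.character(trimestre), grepl("^trim[1-4]$", trimestre))
  trimestre
}

normalize_trimestre(3)        # "trim3"
normalize_trimestre("trim2")  # "trim2"
```

Any other value, such as 5 or "q3", stops with an error rather than producing a URL that would return an HTML page.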