ajdamico / lodown

locally download and prepare publicly-available microdata

special characters handling in censo escolar #91

Closed guilhermejacob closed 7 years ago

guilhermejacob commented 7 years ago

Some tables break when written directly from the data file:

DBI::dbWriteTable(
    db ,
    paste0( this_table_type , "_" , catalog[ i , "year" ] ) ,
    this_data_file ,
    sep = "|" ,
    best.effort = TRUE ,
    lower.case.names = TRUE ,
    append = TRUE ,
    nrow.check = 1000
)

but work when the file is first read into a data frame:

        x <- read.csv2( this_data_file , sep = "|" )
        colnames( x ) <-  tolower( colnames( x ) )

        DBI::dbWriteTable( db , paste0( this_table_type , "_" , catalog[ i , "year" ] ) , x , append = TRUE )

This is probably related to special characters such as ordinal indicators (º, ª).
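One way to test that suspicion is to list the lines of a data file that contain non-ASCII characters. The following is only a minimal sketch in base R: the helper name `find_non_ascii_lines`, the sample file contents, and the UTF-8 encoding are all assumptions for illustration, not part of lodown or of the real Censo Escolar files.

```r
# Hypothetical helper: return the line numbers that contain non-ASCII characters.
# `encoding` is an assumption; the real files may use a different encoding.
find_non_ascii_lines <- function( path , encoding = "UTF-8" ) {
  lines <- readLines( path , warn = FALSE )
  # iconv() returns NA for any string that cannot be converted to ASCII
  which( is.na( iconv( lines , from = encoding , to = "ASCII" ) ) )
}

# Made-up pipe-delimited sample standing in for a data file (not real data)
sample_file <- tempfile( fileext = ".csv" )
sample_lines <- c( "CO_ENTIDADE|NO_ETAPA" , "1|1\u00ba ano" , "2|ensino medio" )
# write the UTF-8 bytes as-is, regardless of the current locale
writeLines( enc2utf8( sample_lines ) , sample_file , useBytes = TRUE )

bad_lines <- find_non_ascii_lines( sample_file )
print( bad_lines )
```

If the tables that break are the same ones for which this reports matching lines, and the matricula tables report none, that would support the special-character explanation.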

PS: matricula tables seem to be ok.

guilhermejacob commented 7 years ago

Reproducible example:

First, build the package with best.effort = FALSE.

Then run:

library(lodown)

catalog <- get_catalog( "censo_escolar" , output_dir = "D:/Censo Escolar" )

catalog <- subset( catalog , year == 2016 )

lodown( "censo_escolar" , catalog = catalog , path_to_7z = normalizePath( "~/7zip/7z.exe" ) )

It should break.

Is it ok, @ajdamico?