Open schoend opened 5 years ago
library(airlines)
#> Loading required package: etl
#> Loading required package: dplyr
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
db <- src_mysql_cnf(dbname = "airlines")
ontime <- etl("airlines", db = db, dir = "~/dumps/airlines")
etl_init(ontime)
#> Running SQL script at /home/bbaumer/R/x86_64-pc-linux-gnu-library/3.4/airlines/sql/init.mysql
#> Parsed with column specification:
#> cols(
#> Code = col_character(),
#> Description = col_character()
#> )
#> Parsed with column specification:
#> cols(
#> X1 = col_double(),
#> X2 = col_character(),
#> X3 = col_character(),
#> X4 = col_character(),
#> X5 = col_character(),
#> X6 = col_character(),
#> X7 = col_double(),
#> X8 = col_double(),
#> X9 = col_double(),
#> X10 = col_double(),
#> X11 = col_character(),
#> X12 = col_character()
#> )
Created on 2019-03-14 by the reprex package (v0.2.1)
Please make sure that you are using the most recent version of airlines
. Your call to etl_init()
seems to expect 14 variables in airports.dat
, but mine only expects 12.
@schoend you might also want to check /home/schoend/dumps/airlines/raw/airports.dat
to make sure it has the right number of rows and columns. Here is what I get:
bbaumer@mini-meerkat:~$ wc ~/dumps/airlines/raw/airports.dat
8107 22136 850313 /home/bbaumer/dumps/airlines/raw/airports.dat
bbaumer@mini-meerkat:~$ head -n 1 ~/dumps/airlines/raw/airports.dat
1,"Goroka","Goroka","Papua New Guinea","GKA","AYGA",-6.081689,145.391881,5282,10,"U","Pacific/Port_Moresby"
@beanumber I am guessing you have an old file cached. The new file that is automatically downloaded has 14 columns, not 12. It also results in a bunch of parsing errors if one loads it with read_csv()
manually.
> airports_tmp = read_csv("~/dumps/airlines/raw/airports.dat")
── Column specification ─────────────────────────────────────────────────────────────────────────────────────────────────────────────
cols(
`1` = col_double(),
`Goroka Airport` = col_character(),
Goroka = col_character(),
`Papua New Guinea` = col_character(),
GKA = col_character(),
AYGA = col_character(),
`-6.081689834590001` = col_double(),
`145.391998291` = col_double(),
`5282` = col_double(),
`10` = col_double(),
U = col_character(),
`Pacific/Port_Moresby` = col_character(),
airport = col_character(),
OurAirports = col_character()
)
Warning: 353 parsing failures.
row col expected actual file
6981 10 a double \N '~/dumps/airlines/raw/airports.dat'
6982 10 a double \N '~/dumps/airlines/raw/airports.dat'
6983 10 a double \N '~/dumps/airlines/raw/airports.dat'
6984 10 a double \N '~/dumps/airlines/raw/airports.dat'
6985 10 a double \N '~/dumps/airlines/raw/airports.dat'
.... ... ........ ...... ...................................
See problems(...) for more details.
> dim(airports_tmp)
[1] 7697 14
mysql load of database seems not to work; partial output from attempt ends with:
6916 X10 a double \N '/home/schoend/dumps/airlines/raw/airports.dat' .... ... ........ ...... ...............................................
See problems(...) for more details.
Error: Columns
13
,14
cannot have NA as nameDo not know where to go next. More detail, attached in .docx file shows that init.mysql file creates database with four tables; carriers is populated but not the other three.
Dale Schoenefeld
DatabaseLoadProblemWithAirlines.docx