beanumber / airlines

An R package providing access to medium airline flight delay data
21 stars 36 forks source link

problems running minimal vignette #29

Closed nicholasjhorton closed 9 years ago

nicholasjhorton commented 9 years ago

I'm not able to successfully run etl_create: it yields an error message

Error in .local(conn, statement, ...) : could not run statement: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'flights GROUP BY year' at line 1

nhorton% mysql -u r-user -p -v Enter password: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 373 Server version: 5.6.27 MySQL Community Server (GPL)

Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.

Reading history-file /Users/nhorton/.mysql_history dbsetup.txt

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

beanumber commented 9 years ago

The error seems to be on line 131 of init.mysql. I wonder if you are having trouble creating VIEWs?

Can you try the following:

nhorton% mysql -u r-user -p -v airlines < ~/R/x86_64-pc-linux-gnu-library/3.2/airlines/sql/init.mysql

You may have to modify your path. If you get the same error, the issue is with MySQL. If not, it may be with DBI.

Try also commenting out the CREATE VIEW statement in init.mysql and see if that helps.

nicholasjhorton commented 9 years ago

Note that I have to write:

[dhcp-149:/] nhorton% mysql -u r-user -p -v airlines < /Library/Frameworks/R.framework/Versions/3.2/Resources/library/airlines/sql/init.mysql

Note that you can get the basename from:

system.file(package="airlines")

Did this work?

[dhcp-149:/] nhorton% mysql -u r-user -p -v airlines < /Library/Frameworks/R.framework/Versions/3.2/Resources/library/airlines/sql/init.mysql

Enter password:

DROP TABLE IF EXISTS flights


CREATE TABLE flights (
year smallint(4) DEFAULT NULL, month smallint(2) DEFAULT NULL, day smallint(2) DEFAULT NULL, dep_time smallint(4) DEFAULT NULL, dep_delay smallint(4) DEFAULT NULL, arr_time smallint(4) DEFAULT NULL, arr_delay smallint(4) DEFAULT NULL, carrier varchar(2) NOT NULL DEFAULT '', tailnum varchar(6) DEFAULT NULL, flight smallint(4) DEFAULT NULL, origin varchar(3) NOT NULL DEFAULT '', dest varchar(3) NOT NULL DEFAULT '', air_time smallint(4) DEFAULT NULL, distance smallint(4) DEFAULT NULL, cancelled tinyint(1) DEFAULT NULL, diverted tinyint(1) DEFAULT NULL, KEY Year (year), KEY Date (year, month, day), KEY Origin (origin), KEY Dest (dest), KEY Carrier (carrier), KEY tailNum (tailNum) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 PARTITION BY LIST (Year) ( PARTITION p1 VALUES IN (1987), PARTITION p2 VALUES IN (1988), PARTITION p3 VALUES IN (1989), PARTITION p4 VALUES IN (1990), PARTITION p5 VALUES IN (1991), PARTITION p6 VALUES IN (1992), PARTITION p7 VALUES IN (1993), PARTITION p8 VALUES IN (1994), PARTITION p9 VALUES IN (1995), PARTITION p10 VALUES IN (1996), PARTITION p11 VALUES IN (1997), PARTITION p12 VALUES IN (1998), PARTITION p13 VALUES IN (1999), PARTITION p14 VALUES IN (2000), PARTITION p15 VALUES IN (2001), PARTITION p16 VALUES IN (2002), PARTITION p17 VALUES IN (2003), PARTITION p18 VALUES IN (2004), PARTITION p19 VALUES IN (2005), PARTITION p20 VALUES IN (2006), PARTITION p21 VALUES IN (2007), PARTITION p22 VALUES IN (2008), PARTITION p23 VALUES IN (2009), PARTITION p24 VALUES IN (2010), PARTITION p25 VALUES IN (2011), PARTITION p26 VALUES IN (2012), PARTITION p27 VALUES IN (2013), PARTITION p28 VALUES IN (2014), PARTITION p29 VALUES IN (2015)

)


DROP TABLE IF EXISTS carriers


CREATE TABLE carriers ( carrier varchar(7) NOT NULL DEFAULT '', name varchar(255) NOT NULL DEFAULT '', PRIMARY KEY (carrier)

) ENGINE=MyISAM DEFAULT CHARSET=latin1


DROP TABLE IF EXISTS airports


CREATE TABLE airports ( faa varchar(3) NOT NULL DEFAULT '', name varchar(255), lat decimal(10,7) DEFAULT NULL, lon decimal(10,7) DEFAULT NULL, alt int(11) DEFAULT NULL, tz smallint(4) DEFAULT NULL, dst char(1), city varchar(255) DEFAULT NULL, country varchar(255) DEFAULT NULL, PRIMARY KEY (faa)

) ENGINE=MyISAM DEFAULT CHARSET=latin1


DROP TABLE IF EXISTS planes


CREATE TABLE planes ( tailnum varchar(6) NOT NULL DEFAULT '', year int(11) DEFAULT NULL, type text, manufacturer text, model text, engines int(11) DEFAULT NULL, seats int(11) DEFAULT NULL, speed int(11) DEFAULT NULL, engine text, PRIMARY KEY (tailnum)

) ENGINE=MyISAM DEFAULT CHARSET=latin1


DROP TABLE IF EXISTS weather


CREATE TABLE weather ( origin text, year double DEFAULT NULL, month double DEFAULT NULL, day int(11) DEFAULT NULL, hour int(11) DEFAULT NULL, temp double DEFAULT NULL, dewp double DEFAULT NULL, humid double DEFAULT NULL, wind_dir double DEFAULT NULL, wind_speed double DEFAULT NULL, wind_gust double DEFAULT NULL, precip double DEFAULT NULL, pressure double DEFAULT NULL, visib double DEFAULT NULL, date datetime DEFAULT NULL, KEY year_idx (year), KEY date_widx (date)

) ENGINE=MyISAM DEFAULT CHARSET=latin1


DROP VIEW IF EXISTS summary


CREATE VIEW summary AS SELECT year, count(distinct month) as numMonths , sum(1) as numFlights , sum(1) / count(distinct concat(month, '-', day)) as avgFlightsPerDay , count(distinct carrier) as numCarriers , count(distinct origin) as numOrigins FROM flights

GROUP BY year

[dhcp-149:/] nhorton%

On Oct 19, 2015, at 4:01 PM, Ben Baumer notifications@github.com wrote:

The error seems to be on line 131 of init.mysql. I wonder if you are having trouble creating VIEWs?

Can you try the following:

nhorton% mysql -u r-user -p -v airlines < ~/R/x86_64-pc-linux-gnu-library/3.2/airlines/sql/init.mysql

You may have to modify your path. If you get the same error, the issue is with MySQL. If not, it may be with DBI.

Try also commenting out the CREATE VIEW statement in init.mysql and see if that helps.

— Reply to this email directly or view it on GitHub.

Nicholas Horton Professor of Statistics Department of Mathematics and Statistics, Amherst College Box 2239, 31 Quadrangle Dr Amherst, MA 01002-5000 https://www.amherst.edu/people/facstaff/nhorton

beanumber commented 9 years ago

So did it work?? On Oct 20, 2015 07:08, "Nicholas Horton" notifications@github.com wrote:

Note that I have to write:

[dhcp-149:/] nhorton% mysql -u r-user -p -v airlines < /Library/Frameworks/R.framework/Versions/3.2/Resources/library/airlines/sql/init.mysql

Note that you can get the basename from:

system.file(package="airlines")

Did this work?

[dhcp-149:/] nhorton% mysql -u r-user -p -v airlines < /Library/Frameworks/R.framework/Versions/3.2/Resources/library/airlines/sql/init.mysql

Enter password:

DROP TABLE IF EXISTS flights


CREATE TABLE flights ( year smallint(4) DEFAULT NULL, month smallint(2) DEFAULT NULL, day smallint(2) DEFAULT NULL, dep_time smallint(4) DEFAULT NULL, dep_delay smallint(4) DEFAULT NULL, arr_time smallint(4) DEFAULT NULL, arr_delay smallint(4) DEFAULT NULL, carrier varchar(2) NOT NULL DEFAULT '', tailnum varchar(6) DEFAULT NULL, flight smallint(4) DEFAULT NULL, origin varchar(3) NOT NULL DEFAULT '', dest varchar(3) NOT NULL DEFAULT '', air_time smallint(4) DEFAULT NULL, distance smallint(4) DEFAULT NULL, cancelled tinyint(1) DEFAULT NULL, diverted tinyint(1) DEFAULT NULL, KEY Year (year), KEY Date (year, month, day), KEY Origin (origin), KEY Dest (dest), KEY Carrier (carrier), KEY tailNum (tailNum) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 PARTITION BY LIST (Year) ( PARTITION p1 VALUES IN (1987), PARTITION p2 VALUES IN (1988), PARTITION p3 VALUES IN (1989), PARTITION p4 VALUES IN (1990), PARTITION p5 VALUES IN (1991), PARTITION p6 VALUES IN (1992), PARTITION p7 VALUES IN (1993), PARTITION p8 VALUES IN (1994), PARTITION p9 VALUES IN (1995), PARTITION p10 VALUES IN (1996), PARTITION p11 VALUES IN (1997), PARTITION p12 VALUES IN (1998), PARTITION p13 VALUES IN (1999), PARTITION p14 VALUES IN (2000), PARTITION p15 VALUES IN (2001), PARTITION p16 VALUES IN (2002), PARTITION p17 VALUES IN (2003), PARTITION p18 VALUES IN (2004), PARTITION p19 VALUES IN (2005), PARTITION p20 VALUES IN (2006), PARTITION p21 VALUES IN (2007), PARTITION p22 VALUES IN (2008), PARTITION p23 VALUES IN (2009), PARTITION p24 VALUES IN (2010), PARTITION p25 VALUES IN (2011), PARTITION p26 VALUES IN (2012), PARTITION p27 VALUES IN (2013), PARTITION p28 VALUES IN (2014), PARTITION p29 VALUES IN (2015)

)


DROP TABLE IF EXISTS carriers


CREATE TABLE carriers ( carrier varchar(7) NOT NULL DEFAULT '', name varchar(255) NOT NULL DEFAULT '', PRIMARY KEY (carrier)

) ENGINE=MyISAM DEFAULT CHARSET=latin1


DROP TABLE IF EXISTS airports


CREATE TABLE airports ( faa varchar(3) NOT NULL DEFAULT '', name varchar(255), lat decimal(10,7) DEFAULT NULL, lon decimal(10,7) DEFAULT NULL, alt int(11) DEFAULT NULL, tz smallint(4) DEFAULT NULL, dst char(1), city varchar(255) DEFAULT NULL, country varchar(255) DEFAULT NULL, PRIMARY KEY (faa)

) ENGINE=MyISAM DEFAULT CHARSET=latin1


DROP TABLE IF EXISTS planes


CREATE TABLE planes ( tailnum varchar(6) NOT NULL DEFAULT '', year int(11) DEFAULT NULL, type text, manufacturer text, model text, engines int(11) DEFAULT NULL, seats int(11) DEFAULT NULL, speed int(11) DEFAULT NULL, engine text, PRIMARY KEY (tailnum)

) ENGINE=MyISAM DEFAULT CHARSET=latin1


DROP TABLE IF EXISTS weather


CREATE TABLE weather ( origin text, year double DEFAULT NULL, month double DEFAULT NULL, day int(11) DEFAULT NULL, hour int(11) DEFAULT NULL, temp double DEFAULT NULL, dewp double DEFAULT NULL, humid double DEFAULT NULL, wind_dir double DEFAULT NULL, wind_speed double DEFAULT NULL, wind_gust double DEFAULT NULL, precip double DEFAULT NULL, pressure double DEFAULT NULL, visib double DEFAULT NULL, date datetime DEFAULT NULL, KEY year_idx (year), KEY date_widx (date)

) ENGINE=MyISAM DEFAULT CHARSET=latin1


DROP VIEW IF EXISTS summary


CREATE VIEW summary AS SELECT year, count(distinct month) as numMonths , sum(1) as numFlights , sum(1) / count(distinct concat(month, '-', day)) as avgFlightsPerDay , count(distinct carrier) as numCarriers , count(distinct origin) as numOrigins FROM flights

GROUP BY year

[dhcp-149:/] nhorton%

On Oct 19, 2015, at 4:01 PM, Ben Baumer notifications@github.com wrote:

The error seems to be on line 131 of init.mysql. I wonder if you are having trouble creating VIEWs?

Can you try the following:

nhorton% mysql -u r-user -p -v airlines < ~/R/x86_64-pc-linux-gnu-library/3.2/airlines/sql/init.mysql

You may have to modify your path. If you get the same error, the issue is with MySQL. If not, it may be with DBI.

Try also commenting out the CREATE VIEW statement in init.mysql and see if that helps.

— Reply to this email directly or view it on GitHub.

Nicholas Horton Professor of Statistics Department of Mathematics and Statistics, Amherst College Box 2239, 31 Quadrangle Dr Amherst, MA 01002-5000 https://www.amherst.edu/people/facstaff/nhorton

— Reply to this email directly or view it on GitHub https://github.com/beanumber/airlines/issues/29#issuecomment-149529031.

nicholasjhorton commented 9 years ago

Alas no (nor did it work at Amherst when they installed using these instructions). I wonder if @ameliaMN can try to set up on her machine, or whether you can get access to a Mac to debug.

I'm hoping to work with my students to set up their own databases but don't know if it's a good idea.

Is it possible to add the steps to manually create the tables and import the data for carriers, airports, and airplanes (with a stub for flights?) This might help students to see what's happening behind the scenes in the etl and airlines approach.

Just my $0.02,

Nick

On Oct 20, 2015, at 7:27 AM, Ben Baumer notifications@github.com wrote:

So did it work?? On Oct 20, 2015 07:08, "Nicholas Horton" notifications@github.com wrote:

Note that I have to write:

[dhcp-149:/] nhorton% mysql -u r-user -p -v airlines < /Library/Frameworks/R.framework/Versions/3.2/Resources/library/airlines/sql/init.mysql

Note that you can get the basename from:

system.file(package="airlines")

Did this work?

[dhcp-149:/] nhorton% mysql -u r-user -p -v airlines < /Library/Frameworks/R.framework/Versions/3.2/Resources/library/airlines/sql/init.mysql

Enter password:

DROP TABLE IF EXISTS flights


CREATE TABLE flights ( year smallint(4) DEFAULT NULL, month smallint(2) DEFAULT NULL, day smallint(2) DEFAULT NULL, dep_time smallint(4) DEFAULT NULL, dep_delay smallint(4) DEFAULT NULL, arr_time smallint(4) DEFAULT NULL, arr_delay smallint(4) DEFAULT NULL, carrier varchar(2) NOT NULL DEFAULT '', tailnum varchar(6) DEFAULT NULL, flight smallint(4) DEFAULT NULL, origin varchar(3) NOT NULL DEFAULT '', dest varchar(3) NOT NULL DEFAULT '', air_time smallint(4) DEFAULT NULL, distance smallint(4) DEFAULT NULL, cancelled tinyint(1) DEFAULT NULL, diverted tinyint(1) DEFAULT NULL, KEY Year (year), KEY Date (year, month, day), KEY Origin (origin), KEY Dest (dest), KEY Carrier (carrier), KEY tailNum (tailNum) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 PARTITION BY LIST (Year) ( PARTITION p1 VALUES IN (1987), PARTITION p2 VALUES IN (1988), PARTITION p3 VALUES IN (1989), PARTITION p4 VALUES IN (1990), PARTITION p5 VALUES IN (1991), PARTITION p6 VALUES IN (1992), PARTITION p7 VALUES IN (1993), PARTITION p8 VALUES IN (1994), PARTITION p9 VALUES IN (1995), PARTITION p10 VALUES IN (1996), PARTITION p11 VALUES IN (1997), PARTITION p12 VALUES IN (1998), PARTITION p13 VALUES IN (1999), PARTITION p14 VALUES IN (2000), PARTITION p15 VALUES IN (2001), PARTITION p16 VALUES IN (2002), PARTITION p17 VALUES IN (2003), PARTITION p18 VALUES IN (2004), PARTITION p19 VALUES IN (2005), PARTITION p20 VALUES IN (2006), PARTITION p21 VALUES IN (2007), PARTITION p22 VALUES IN (2008), PARTITION p23 VALUES IN (2009), PARTITION p24 VALUES IN (2010), PARTITION p25 VALUES IN (2011), PARTITION p26 VALUES IN (2012), PARTITION p27 VALUES IN (2013), PARTITION p28 VALUES IN (2014), PARTITION p29 VALUES IN (2015)

)


DROP TABLE IF EXISTS carriers


CREATE TABLE carriers ( carrier varchar(7) NOT NULL DEFAULT '', name varchar(255) NOT NULL DEFAULT '', PRIMARY KEY (carrier)

) ENGINE=MyISAM DEFAULT CHARSET=latin1


DROP TABLE IF EXISTS airports


CREATE TABLE airports ( faa varchar(3) NOT NULL DEFAULT '', name varchar(255), lat decimal(10,7) DEFAULT NULL, lon decimal(10,7) DEFAULT NULL, alt int(11) DEFAULT NULL, tz smallint(4) DEFAULT NULL, dst char(1), city varchar(255) DEFAULT NULL, country varchar(255) DEFAULT NULL, PRIMARY KEY (faa)

) ENGINE=MyISAM DEFAULT CHARSET=latin1


DROP TABLE IF EXISTS planes


CREATE TABLE planes ( tailnum varchar(6) NOT NULL DEFAULT '', year int(11) DEFAULT NULL, type text, manufacturer text, model text, engines int(11) DEFAULT NULL, seats int(11) DEFAULT NULL, speed int(11) DEFAULT NULL, engine text, PRIMARY KEY (tailnum)

) ENGINE=MyISAM DEFAULT CHARSET=latin1


DROP TABLE IF EXISTS weather


CREATE TABLE weather ( origin text, year double DEFAULT NULL, month double DEFAULT NULL, day int(11) DEFAULT NULL, hour int(11) DEFAULT NULL, temp double DEFAULT NULL, dewp double DEFAULT NULL, humid double DEFAULT NULL, wind_dir double DEFAULT NULL, wind_speed double DEFAULT NULL, wind_gust double DEFAULT NULL, precip double DEFAULT NULL, pressure double DEFAULT NULL, visib double DEFAULT NULL, date datetime DEFAULT NULL, KEY year_idx (year), KEY date_widx (date)

) ENGINE=MyISAM DEFAULT CHARSET=latin1


DROP VIEW IF EXISTS summary


CREATE VIEW summary AS SELECT year, count(distinct month) as numMonths , sum(1) as numFlights , sum(1) / count(distinct concat(month, '-', day)) as avgFlightsPerDay , count(distinct carrier) as numCarriers , count(distinct origin) as numOrigins FROM flights

GROUP BY year

[dhcp-149:/] nhorton%

On Oct 19, 2015, at 4:01 PM, Ben Baumer notifications@github.com wrote:

The error seems to be on line 131 of init.mysql. I wonder if you are having trouble creating VIEWs?

Can you try the following:

nhorton% mysql -u r-user -p -v airlines < ~/R/x86_64-pc-linux-gnu-library/3.2/airlines/sql/init.mysql

You may have to modify your path. If you get the same error, the issue is with MySQL. If not, it may be with DBI.

Try also commenting out the CREATE VIEW statement in init.mysql and see if that helps.

— Reply to this email directly or view it on GitHub.

Nicholas Horton Professor of Statistics Department of Mathematics and Statistics, Amherst College Box 2239, 31 Quadrangle Dr Amherst, MA 01002-5000 https://www.amherst.edu/people/facstaff/nhorton

— Reply to this email directly or view it on GitHub https://github.com/beanumber/airlines/issues/29#issuecomment-149529031.

— Reply to this email directly or view it on GitHub.

Nicholas Horton Professor of Statistics Department of Mathematics and Statistics, Amherst College Box 2239, 31 Quadrangle Dr Amherst, MA 01002-5000 https://www.amherst.edu/people/facstaff/nhorton

beanumber commented 9 years ago

The init.mysql script is pretty transparent -- all it does is create the schema. You can have them run that at the command line, where you might get more informative messages.

If you look in etl_init.R you can see the code that will populate the auxillary tables. These functions are not exported. Maybe they should be?

Just be careful that you don't re-initialize the flights table!

nicholasjhorton commented 9 years ago

Can you please confirm that everything is working for you on your machine and on another machine (perhaps Tony can give you access to a Mac somewhere?)

Nick

On Oct 20, 2015, at 5:29 PM, Ben Baumer notifications@github.com wrote:

Closed #29 via e729670.

— Reply to this email directly or view it on GitHub.

Nicholas Horton Professor of Statistics Department of Mathematics and Statistics, Amherst College Box 2239, 31 Quadrangle Dr Amherst, MA 01002-5000 https://www.amherst.edu/people/facstaff/nhorton