Closed beanumber closed 8 years ago
The error occurs here (https://github.com/beanumber/airlines/blob/master/R/etl_load.R#L146)
The description of what happens makes sense. The init script ran fine, which created all of the table definitions. The error occurred while populating the airports table, which leaves planes and weather empty.
Do you have RCurl installed? Or how does Windows use the method argument to download.file()?
I did not have RCurl installed (or cURL, for that matter). I installed them both and tried again, but the same error occurred.
A quick Google search brought me here. The comments said to try just removing method = "curl", which on a Windows machine results in the default method = "internal" (according to the download.file documentation).
Honestly, I have absolutely no idea what the different methods mean, but when I tried removing method = "curl" from the download.file command in etl_load, the error disappeared. I still got the same warning about the 1608 parsing failures, but all of the flights/weather/etc. loaded into my airlines database as far as I can tell.
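For anyone hitting the same thing, the workaround amounts to calling download.file() without a hard-coded method, so that R picks its own default for the platform. A minimal sketch, assuming a placeholder URL; build_download_args() is a hypothetical helper for illustration and is not part of etl or airlines:

```r
# Hypothetical helper, not from the airlines package: assemble the arguments
# for download.file(), including `method` only when one is explicitly given.
build_download_args <- function(url, destfile, method = NULL) {
  args <- list(url = url, destfile = destfile)
  if (!is.null(method)) {
    args$method <- method
  }
  args
}

# Placeholder URL; with no `method`, download.file() uses its own default.
args <- build_download_args("http://example.com/data.csv",
                            file.path(tempdir(), "data.csv"))
# do.call(download.file, args)  # uncomment to actually download
```

Passing method = "curl" to the helper reproduces the original call, which makes it easy to compare the two behaviours on the same machine.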
I'd be happy to try out other ideas that you might have as well to see how they work on Windows.
See also #30 . The documentation also says:
Note that https:// URLs are not supported by the internal method.
and
curl (http://curl.haxx.se/) is installed on OS X and commonly on Unix-alikes. Windows binaries are available at that URL.
But even if that works, this seems silly. Windows users can't use method="curl" without installing an external dependency, but Mac users can't access HTTPS URLs with method="internal"?
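One way around the platform mismatch might be to choose the method at run time instead of hard-coding it. The following is only a sketch: choose_download_method() is a hypothetical helper, and the preference order (libcurl, then an external curl binary, then internal) is an assumption rather than anything etl or airlines does.

```r
# Hypothetical helper: pick a download.file() method from what is available.
# The two `has_*` arguments default to probing the running system.
choose_download_method <- function(
    url,
    has_libcurl = isTRUE(unname(capabilities()["libcurl"])),
    has_curl = nzchar(Sys.which("curl"))) {
  if (has_libcurl) return("libcurl")  # R was built with libcurl support
  if (has_curl) return("curl")        # external curl binary found on the PATH
  if (grepl("^https://", url)) {
    # The documentation quoted above says "internal" cannot handle https://
    stop("No HTTPS-capable download method found for ", url)
  }
  "internal"
}
```

The result can then be passed as the method argument to download.file(), so Windows users without cURL and Mac users fetching HTTPS URLs go through the same code path.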
@nicholasjhorton @jche Can you both execute:
> options("download.file.method")
$download.file.method
[1] "libcurl"
In any case, the solution may be to use RCurl instead.
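A rough sketch of what that could look like, assuming the RCurl package is installed. fetch_to_file() is a hypothetical name, not an existing function in etl or airlines; the fetch argument exists only so the download step can be swapped out.

```r
# Hypothetical helper: download `url` to `destfile` without download.file().
# `fetch` defaults to RCurl::getBinaryURL(), which returns the body as raw bytes.
fetch_to_file <- function(url, destfile, fetch = RCurl::getBinaryURL) {
  bytes <- fetch(url)
  writeBin(bytes, destfile)  # write the raw bytes unchanged
  invisible(destfile)
}

# Example call (needs RCurl and network access; placeholder URL):
# fetch_to_file("https://example.com/data.csv", file.path(tempdir(), "data.csv"))
```

This trades the method argument for a package dependency, so it only helps if RCurl itself installs cleanly on every platform.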
Here's what I get:
> options("download.file.method")
$download.file.method
[1] "curl"
I think this is fixed in the newest version of etl. @jche do you still get this error?
@beanumber etl_create() works now. I still get the same parsing errors, but all the data gets imported and written to the database.
Running the command to initialize the database causes an error:
The initialization creates all 6 tables in the airlines database. The 6 tables are also given all of the columns (as seen in the Fields column in the output for "DESCRIBE (tableName)"). The carriers table is populated with 1607 rows (as seen in the output for "SELECT COUNT(*) FROM carriers"), and the other tables all have 0 rows. I'm not entirely sure how/where the initialization process gets stopped.
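The same row-count check can be run from R in one go. This is a sketch only, assuming an already-open DBI connection to the airlines database; count_rows() is a hypothetical helper, and the table names are just the ones mentioned in this thread.

```r
# Hypothetical helper: run SELECT COUNT(*) against each table and return a
# named numeric vector. `query` defaults to DBI::dbGetQuery().
count_rows <- function(con, tables, query = DBI::dbGetQuery) {
  vapply(tables, function(tbl) {
    res <- query(con, paste0("SELECT COUNT(*) AS n FROM ", tbl))
    as.numeric(res$n[1])
  }, numeric(1))
}

# Example call, where `con` is an existing DBI connection (assumed):
# count_rows(con, c("carriers", "airports", "planes", "weather", "flights"))
```

A table that reports 0 rows while the tables loaded before it are populated would point to the step where loading stopped.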