Closed beanumber closed 8 years ago
I'm beginning to think that instead, an etl object should just extend a src object (which can be either a src_sql or, potentially, a src_local object -- see #8). That way, you can do things like:
airlines <- etl("airlines")
airlines %>%
  etl_extract() %>%
  etl_transform() %>%
  etl_load()
airlines %>%
  tbl("flights") %>%
  blah
If you specified a db connection argument to etl(), then it would use that, but if not, it would use SQLite (or potentially local storage -- see #8).
So rather than making the DB connection specific to etl_load(), it ties it inextricably to the etl object. But, if the user doesn't want to set up a connection to MySQL or PostgreSQL, they can still use etl and remain oblivious to what's happening behind the scenes.
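A minimal base-R sketch of that idea, under stated assumptions: etl_sketch() is a hypothetical stand-in for etl(), the "etl_" class prefix is an assumed naming scheme, and the default source is a plain list standing in for a real SQLite source rather than an actual database connection.

```r
# Hypothetical constructor, not the package's actual implementation.
etl_sketch <- function(x, db = NULL) {
  if (is.null(db)) {
    # Stand-in for a default local SQLite source. A real version would
    # create an actual database source here instead of a bare list.
    db <- structure(
      list(path = tempfile(fileext = ".sqlite3")),
      class = c("src_sqlite", "src_sql", "src")
    )
  }
  # Prepend the etl classes so the object extends whatever src it wraps.
  class(db) <- c(paste0("etl_", x), "etl", class(db))
  db
}

airlines <- etl_sketch("airlines")
inherits(airlines, "etl")      # the object is an etl ...
inherits(airlines, "src_sql")  # ... and still a src_sql
```

Because the etl classes are only prepended, any method written for src or src_sql objects would still dispatch on the result.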
That's essentially what I was going for in #3, except that etl_extract() and etl_transform() would extend (return) a list of local data frame(s), and etl_load() (as well as etl_update(), etc.) would extend (return) a src object, but package authors could potentially make their own assumptions about what etl_extract() and etl_transform() return, as long as they provide suitable "database methods".
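A rough sketch of how that variant might look with S3 generics. Everything here is an assumption for illustration: the *_sketch function names, the toy flights data, and the src_local class standing in for a real database source.

```r
# Hypothetical generics; names echo the thread but the bodies are illustrative.
etl_extract_sketch <- function(obj, ...) UseMethod("etl_extract_sketch")
etl_transform_sketch <- function(obj, ...) UseMethod("etl_transform_sketch")
etl_load_sketch <- function(obj, ...) UseMethod("etl_load_sketch")

# Extract: returns a named list of local data frames (made-up data).
etl_extract_sketch.default <- function(obj, ...) {
  list(flights = data.frame(
    carrier = c("AA", "UA"),
    dep_delay = c("5", "12"),
    stringsAsFactors = FALSE
  ))
}

# Transform: takes a list of data frames and returns a list of data frames.
etl_transform_sketch.list <- function(obj, ...) {
  obj$flights$dep_delay <- as.numeric(obj$flights$dep_delay)
  obj
}

# Load: takes a list of data frames and returns a src-classed object.
etl_load_sketch.list <- function(obj, ...) {
  structure(list(tables = obj), class = c("src_local", "src"))
}

extracted <- etl_extract_sketch("airlines")
result <- etl_load_sketch(etl_transform_sketch(extracted))
inherits(result, "src")
```

A package author who wanted etl_extract() to return something other than a list would then supply methods for that class in place of the .list methods above.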
All this being said, it would still be possible to specify the connection in etl() using my approach in #3. You'd just have to add ... as an argument to etl().
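For illustration only, here is one way a connection could be picked out of ... in base R. The function name etl_dots_sketch() and the returned fields are hypothetical, and the string "conn" is a placeholder for a real connection object.

```r
# Hypothetical: accept an optional `db` through `...` rather than a named argument.
etl_dots_sketch <- function(x, ...) {
  args <- list(...)
  # `[[` matches names exactly, so `db` is not confused with e.g. `dbname`.
  db <- args[["db"]]  # NULL when no connection was supplied
  list(name = x, has_connection = !is.null(db))
}

etl_dots_sketch("airlines")$has_connection
etl_dots_sketch("airlines", db = "conn")$has_connection
```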
And I guess related to #8, I think this design makes sense for users that don't want to work with a database at all since they can just do
airlines <- etl("airlines")
airlines %>%
  etl_extract() %>%
  etl_transform()
and go on their merry way.
OK, I think I get it. I am making some progress with this and will make a commit soon.
Most of this is now implemented in the newapi branch.
This is implemented in the newapi branch. An object of class etl extends an object of class src_sql, and if you don't specify a DB connection, you get a local SQLite database.
This is what @cpsievert would do.