treasure-data / td

CUI Interface
https://toolbelt.treasuredata.com
49 stars 26 forks source link

connector:issue requires long --database and --table arguments #67

Open frsyuki opened 9 years ago

frsyuki commented 9 years ago

but other commands like bulk_import or import uses just <db> <table> arguments. Can't we use this way instead?

mcaramello commented 9 years ago

I vote for that too. I wondered about the same yesterday.

muga commented 9 years ago

Sounds good. As another suggestion, I think that it's better that users can declare configuration file for connector:issue.

in:
    type: s3
    parser:
        type: csv
        ...
out:
    database: mydb
    table: mytable
    time_column: updated_at

Key points:

nahi commented 9 years ago

@muga Thanks, in: and out: should be handled by https://github.com/treasure-data/td/pull/73

'td connector:issue config db table' is easy to implement like the following. But I'm unsure it's needed now because we can't remove --database and --table already.

% git diff lib
diff --git a/lib/td/command/connector.rb b/lib/td/command/connector.rb
index efbf03b..84653fe 100644
--- a/lib/td/command/connector.rb
+++ b/lib/td/command/connector.rb
@@ -109,7 +109,7 @@ module Command
     op.on('-w', '--wait', 'wait for finishing the job', TrueClass) { |b| wait = b }
     op.on('-x', '--exclude', 'do not automatically retrieve the job result', TrueClass) { |b| exclude = b }

-    config_file = op.cmd_parse
+    config_file, database, table = op.cmd_parse

     required('--database', database)
     required('--table', table)
diff --git a/lib/td/command/list.rb b/lib/td/command/list.rb
index 97e7850..f63864e 100644
--- a/lib/td/command/list.rb
+++ b/lib/td/command/list.rb
@@ -335,7 +335,7 @@ module List
   add_list 'connector:guess', %w[config?], 'Run guess to generate connector config file', ['connector:guess td-bulkload.yml', 'connector:guess --access-id s3accessId --access-secret s3AccessKey --source https://s3.amazonaws.com/bucketname/path/prefix --database connector_database --table connector_table']
   add_list 'connector:preview', %w[config], 'Show preview of connector execution', ['connector:preview td-bulkload.yml']

-  add_list 'connector:issue', %w[config], 'Run one time connector execution', ['connector:issue td-bulkload.yml']
+  add_list 'connector:issue', %w[config db? table?], 'Run one time connector execution', ['connector:issue td-bulkload.yml']

   add_list 'connector:list', %w[], 'Show list of connector sessions', ['connector:list']
   add_list 'connector:create', %w[name cron database table config], 'Create new connector session', ['connector:create connector1 "0 * * * *" connector_database connector_table td-bulkload.yml']
frsyuki commented 9 years ago

Using config file would be the best idea as @muga describes. Because config file can be consistent with embulk-output-td, which makes it easy to migrate between TD data connector and local embulk.

However, setting database and table names in out: section needs API changes. This issue will be blocked until it's implemented on service-side.

muga commented 9 years ago

For example, current connector:issue command is executed like:

$ cat config.yml
in:
  type: s3
  bucket: my_bucket
  ...
out:
  mode: replace

$ td connector:issue config.yml --database my_db --table my_table --time-column datetime

The suggestion on this ticket is following:

$ cat config.yml
in:
  type: s3
  bucket: my_bucket
  ...
out:
  database: my_db
  table: my_table
  time_column: datetime
  mode: replace

$ td connector:issue config.yml