coursera / dataduct

DataPipeline for humans.

Redshift credentials and logging should not be required in config file #233

Open thesamet opened 8 years ago

thesamet commented 8 years ago

I am trying to use this minimal config described here: http://dataduct.readthedocs.org/en/latest/config.html

But it appears that it's insufficient.

dataduct won't start without a logging section and Redshift credentials. We're not using Redshift, so I had to pass a fake section like this:

redshift:
  DATABASE_NAME: zzz
  CLUSTER_ID: zzz
  USERNAME: zzz
  PASSWORD: zzz

Exception:

Traceback (most recent call last):
  File "/usr/local/bin/dataduct", line 347, in <module>
    main()
  File "/usr/local/bin/dataduct", line 337, in main
    pipeline_actions(frequency_override=frequency_override, **arg_vars)
  File "/usr/local/bin/dataduct", line 75, in pipeline_actions
    from dataduct.etl import activate_pipeline
  File "/usr/local/lib/python2.7/dist-packages/dataduct/etl/__init__.py", line 1, in <module>
    from .etl_actions import activate_pipeline
  File "/usr/local/lib/python2.7/dist-packages/dataduct/etl/etl_actions.py", line 5, in <module>
    from ..pipeline import Activity
  File "/usr/local/lib/python2.7/dist-packages/dataduct/pipeline/__init__.py", line 13, in <module>
    from .redshift_database import RedshiftDatabase
  File "/usr/local/lib/python2.7/dist-packages/dataduct/pipeline/redshift_database.py", line 12, in <module>
    raise ETLConfigError('Redshift credentials missing from config')
dataduct.utils.exceptions.ETLConfigError: Redshift credentials missing from config
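
The traceback shows the check firing at import time (`redshift_database.py`, line 12), so any command that imports `dataduct.pipeline` fails even when Redshift is never used. One possible direction is to defer the check until the credentials are actually requested. The following is only a minimal sketch of that idea, not dataduct's actual code: `get_redshift_credentials`, `REQUIRED_KEYS`, and the plain-dict `config` are hypothetical names introduced for illustration, and the local `ETLConfigError` is a stand-in for `dataduct.utils.exceptions.ETLConfigError`.

```python
class ETLConfigError(Exception):
    """Stand-in for dataduct.utils.exceptions.ETLConfigError (assumption)."""


# Keys taken from the fake `redshift` section shown in this issue.
REQUIRED_KEYS = ("DATABASE_NAME", "CLUSTER_ID", "USERNAME", "PASSWORD")


def get_redshift_credentials(config):
    """Return the `redshift` section, validating it only when requested.

    `config` is a plain dict here; the real dataduct config object may differ.
    """
    section = config.get("redshift")
    if section is None:
        # Same message as the import-time error, but raised on first use.
        raise ETLConfigError("Redshift credentials missing from config")

    missing = [key for key in REQUIRED_KEYS if key not in section]
    if missing:
        raise ETLConfigError(
            "Redshift config is missing keys: " + ", ".join(missing)
        )
    return section
```

With a check shaped like this, importing the module has no side effects, and a pipeline that never touches Redshift never calls `get_redshift_credentials`, so the fake section would no longer be needed.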

kpx-dev commented 8 years ago

Yeah, similar to our case: we don't need EMR yet. Those options should be optional.

cscetbon commented 7 years ago

Hey guys, any news on this?

jspreddy commented 7 years ago

Similar problem with the Postgres config.

>dataduct pipeline visualize test.jpg pipelines/data_transfer_rds_to_redshift.yaml
Traceback (most recent call last):
  File "/usr/local/bin/dataduct", line 4, in <module>
    __import__('pkg_resources').run_script('dataduct==0.5.0', 'dataduct')
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 739, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1501, in run_script
    exec(script_code, namespace, namespace)
  File "/usr/local/lib/python2.7/site-packages/dataduct-0.5.0-py2.7.egg/EGG-INFO/scripts/dataduct", line 347, in <module>
  File "/usr/local/lib/python2.7/site-packages/dataduct-0.5.0-py2.7.egg/EGG-INFO/scripts/dataduct", line 337, in main
  File "/usr/local/lib/python2.7/site-packages/dataduct-0.5.0-py2.7.egg/EGG-INFO/scripts/dataduct", line 75, in pipeline_actions
  File "build/bdist.macosx-10.11-x86_64/egg/dataduct/etl/__init__.py", line 1, in <module>
  File "build/bdist.macosx-10.11-x86_64/egg/dataduct/etl/etl_actions.py", line 5, in <module>
  File "build/bdist.macosx-10.11-x86_64/egg/dataduct/pipeline/__init__.py", line 10, in <module>
  File "build/bdist.macosx-10.11-x86_64/egg/dataduct/pipeline/postgres_database.py", line 12, in <module>
dataduct.utils.exceptions.ETLConfigError: Postgres credentials missing from config

cscetbon commented 7 years ago

This project seems to be dead...