ScottWang closed this issue 8 years ago
I think I figured it out. The pip-installed package does not include the extract-postgres module, so I checked out the repo and installed the package from source. After that, the above error goes away.
~/Downloads/dataduct$ dataduct pipeline validate ~/GitSrc/third_party/dataduct/examples/test_postgres.yaml
[INFO]: Pipeline scheduled to start at 2016-02-11T00:55:00
[ERROR]: Error creating step of class ExtractPostgresStep, step_param {'worker_group': 'testdpwg', 'schedule': <dataduct.pipeline.schedule.Schedule object at 0x1049f3dd0>, 'max_retries': 0, 'sql': 'select id from associations limit 10\n', 's3_data_dir': <dataduct.s3.s3_path.S3Path object at 0x1049f3f90>, 'required_steps': [], 's3_source_dir': <dataduct.s3.s3_path.S3Path object at 0x1049f3f50>, 'resource': <dataduct.pipeline.ec2_resource.Ec2Resource object at 0x1049f3e10>, 'id': 'ExtractPostgresStep0', 's3_log_dir': <dataduct.s3.s3_log_path.S3LogPath object at 0x1049f3fd0>, 'output_path': 's3://discovery-import/datapipeline/test/one.txt'}
Traceback (most recent call last):
  File "/usr/local/bin/dataduct", line 5, in <module>
  File "/Library/Python/2.7/site-packages/dataduct-0.5.0-py2.7.egg/EGG-INFO/scripts/dataduct", line 337, in main
  File "/Library/Python/2.7/site-packages/dataduct-0.5.0-py2.7.egg/EGG-INFO/scripts/dataduct", line 80, in pipeline_actions
  File "/Library/Python/2.7/site-packages/dataduct-0.5.0-py2.7.egg/EGG-INFO/scripts/dataduct", line 55, in initialize_etl_objects
  File "build/bdist.macosx-10.10-intel/egg/dataduct/etl/etl_actions.py", line 55, in create_pipeline
  File "build/bdist.macosx-10.10-intel/egg/dataduct/etl/etl_pipeline.py", line 490, in create_steps
  File "build/bdist.macosx-10.10-intel/egg/dataduct/steps/extract_postgres.py", line 73, in __init__
  File "build/bdist.macosx-10.10-intel/egg/dataduct/steps/etl_step.py", line 150, in create_pipeline_object
TypeError: __init__() got an unexpected keyword argument 'sql'
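For anyone reading the traceback above: this kind of TypeError is what Python raises when a keyword argument is forwarded to a constructor that does not declare it. Below is a minimal sketch of that failure mode and of one way to spot the offending key before the call. It is not dataduct's code: `PipelineObject`, `create_object` and `unexpected_kwargs` are hypothetical names made up for illustration, and the sketch assumes Python 3 (`inspect.signature` is not available on the Python 2.7 used in this thread).

```python
import inspect


class PipelineObject(object):
    """Hypothetical stand-in for a class whose constructor has a fixed signature."""

    def __init__(self, id, schedule=None, max_retries=0):
        self.id = id
        self.schedule = schedule
        self.max_retries = max_retries


def create_object(cls, **kwargs):
    """Forward every keyword argument to cls, as a generic factory might."""
    return cls(**kwargs)


def unexpected_kwargs(cls, kwargs):
    """Return the keyword names that cls.__init__ does not declare."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs parameter accepts any keyword, so nothing is unexpected.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return sorted(name for name in kwargs if name not in params)


# Hypothetical parameters, loosely modelled on the step_param dict in the log.
step_params = {"id": "ExtractPostgresStep0", "max_retries": 0, "sql": "select 1"}

error_message = None
try:
    create_object(PipelineObject, **step_params)
except TypeError as err:
    # Same kind of message as in the traceback: unexpected keyword argument 'sql'.
    error_message = str(err)

# Names the constructor would reject, found without calling it.
rejected = unexpected_kwargs(PipelineObject, step_params)
```

If the same pattern applies here, the `sql` key is being passed down to a constructor that was written before that parameter existed, which would be consistent with a mismatch between the step code and the rest of the installed package.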
@ScottWang let me take a look at upgrading the pip version and seeing what is going on.
Thank you very much. Do you have any update on this?
Has anyone encountered this problem before? I am just trying to use example_extract_postgres.yaml and am getting the following error.
[INFO]: Pipeline scheduled to start at 2016-02-11T01:00:00
Traceback (most recent call last):
  File "./dataduct", line 347, in <module>
    main()
  File "./dataduct", line 337, in main
    pipeline_actions(frequency_override=frequency_override, **arg_vars)
  File "./dataduct", line 80, in pipeline_actions
    frequency_override, backfill):
  File "./dataduct", line 55, in initialize_etl_objects
    etls.append(create_pipeline(definition))
  File "/Users/scotwang/GitSrc/third_party/dataduct/testdataduct/lib/python2.7/site-packages/dataduct/etl/etl_actions.py", line 55, in create_pipeline
    etl.create_steps(steps)
  File "/Users/scotwang/GitSrc/third_party/dataduct/testdataduct/lib/python2.7/site-packages/dataduct/etl/etl_pipeline.py", line 451, in create_steps
    steps_params = process_steps(steps_params)
  File "/Users/scotwang/GitSrc/third_party/dataduct/testdataduct/lib/python2.7/site-packages/dataduct/etl/utils.py", line 68, in process_steps
    params['step_class'] = STEP_CONFIG[step_type]
KeyError: 'extract-postgres'
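The last frame shows the step type being looked up in a dictionary (`STEP_CONFIG[step_type]`), so the KeyError simply means the installed version has no entry for `extract-postgres`. Here is a rough sketch of that lookup pattern with a more descriptive error. Only the name `STEP_CONFIG` comes from the traceback; its contents, `ExampleStep`, the `example-step` key and `resolve_step_class` are assumptions for illustration, not dataduct's actual registry.

```python
class ExampleStep(object):
    """Hypothetical step class used only to populate the registry."""


# Hypothetical registry: maps the step type named in the YAML to a class.
STEP_CONFIG = {"example-step": ExampleStep}


def resolve_step_class(step_type, registry=STEP_CONFIG):
    """Look up a step class, reporting the known types when the key is missing."""
    try:
        return registry[step_type]
    except KeyError:
        known = ", ".join(sorted(registry))
        raise ValueError(
            "Unknown step type %r; this install knows: %s" % (step_type, known)
        )
```

Calling `resolve_step_class("extract-postgres")` against a registry that lacks the key raises a ValueError listing the step types that are available, which makes it easier to tell an outdated install from a typo in the YAML.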