Open Acturio opened 2 years ago
Looking at the last line, it looks like you forgot to specify the bucket on the input to df07.py
FAILED, Error: Unable to open file: gs://BUCKETNAME/flights/staging/ch04timecorr.1656567385.996847/pipeline.pb
thanks, Lak
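One way to catch this before the job is submitted is a small guard around the bucket name. This is only a minimal sketch: it assumes the bucket is supplied through the BUCKETNAME environment variable, as in the command in this thread, and the helper name resolve_bucket is hypothetical (it is not part of df07.py).

```python
import os


def resolve_bucket(env=None, var="BUCKETNAME"):
    """Return the bucket name held in `var`, or raise if it was never set.

    `env` is any mapping of variable names to values; it defaults to the
    real process environment. Hypothetical helper, not part of df07.py.
    """
    env = os.environ if env is None else env
    value = env.get(var, "").strip()
    # An empty value, or the literal placeholder text, means the shell
    # variable was never exported with a real bucket name.
    if not value or value in (var, "${" + var + "}"):
        raise ValueError(
            f"{var} is not set to a real bucket name; "
            f"export {var}=<your-bucket> before running the pipeline"
        )
    return value


# Example use before launching the pipeline:
# bucket = resolve_bucket()
# print(f"staging under gs://{bucket}/flights/staging/")
```

If this raises, the staging path would have ended up as gs://BUCKETNAME/... or with an empty bucket, which matches the error in the log.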
On Wed, Jun 29, 2022, 11:01 PM Arturo Bringas @.***> wrote:
Hi! I get the following log when I try to run df07.py.
./df07.py --project ${PROJECT} --bucket ${BUCKETNAME} --region us-central1
Correcting timestamps and writing to BigQuery dataset
/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/io/gcp/bigquery.py:2527: BeamDeprecationWarning: options is deprecated since First stable release. References to .options will not be supported
  temp_location = pcoll.pipeline.options.view_as(
/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/io/gcp/bigquery_file_loads.py:1129: BeamDeprecationWarning: options is deprecated since First stable release. References to .options will not be supported
  temp_location = p.options.view_as(GoogleCloudOptions).temp_location
warning: sdist: standard file not found: should have one of README, README.rst, README.txt, README.md
ERROR:apache_beam.runners.dataflow.dataflow_runner:Console URL: https://console.cloud.google.com/dataflow/jobs//2022-06-29_22_36_30-1790320629162913076?project=
Traceback (most recent call last):
  File "/home/act_arturo_b/data-science-on-gcp/04_streaming/transform/./df07.py", line 202, in <module>
    run(project=args['project'], bucket=args['bucket'], region=args['region'])
  File "/home/act_arturo_b/data-science-on-gcp/04_streaming/transform/./df07.py", line 177, in run
    (events
  File "/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/pipeline.py", line 598, in __exit__
    self.result.wait_until_finish()
  File "/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 1673, in wait_until_finish
    raise DataflowRuntimeException(
apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, Error:
Unable to open file: gs://BUCKETNAME/flights/staging/ch04timecorr.1656567385.996847/pipeline.pb.
Any suggestions would be appreciated. Thank you.
I appreciate the quick response. This is the latest log:
act_arturo_b@cloudshell:~/data-science-on-gcp/04_streaming/transform (ds-on-gcp-353305)$ ./df07.py --project ds-on-gcp-353305 --bucket ${BUCKETNAME} --region us-central1
Correcting timestamps and writing to BigQuery dataset
/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/io/gcp/bigquery.py:2527: BeamDeprecationWarning: options is deprecated since First stable release. References to
ERROR:apache_beam.runners.dataflow.dataflow_runner:Console URL: https://console.cloud.google.com/dataflow/jobs/
The problem is the same.
Any help would be appreciated.
Does this bucket exist? Is the bucket in the us-central1 region?
ds-on-gcp-353305-dsongcp
In any case, the pipeline is failing because it is not able to create this file.
Lak
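Both questions can be answered from Python. The following is a sketch under assumptions, not the book's code: it assumes the google-cloud-storage client library is installed and that application default credentials are available, and bucket_status is a hypothetical helper name.

```python
def bucket_status(bucket_name, expected_region, client=None):
    """Return (exists, location, region_matches) for a Cloud Storage bucket.

    `client` may be any object with a `lookup_bucket(name)` method; when it
    is omitted, a real google.cloud.storage.Client is created.
    """
    if client is None:
        # Imported lazily so the helper can be exercised with a stand-in client.
        from google.cloud import storage
        client = storage.Client()

    # lookup_bucket returns None when the bucket is not found.
    bucket = client.lookup_bucket(bucket_name)
    if bucket is None:
        return (False, None, False)

    # Bucket.location is reported in upper case, for example "US-CENTRAL1".
    location = (bucket.location or "").lower()
    return (True, location, location == expected_region.lower())


# Example use (needs network access and credentials):
# print(bucket_status("ds-on-gcp-353305-dsongcp", "us-central1"))
```

A location that differs from the job's region is not necessarily the cause of the failure (a multi-region bucket reports a location such as "us"), so treat the last value as a hint rather than a verdict. If the bucket is missing, the staging file cannot be written at all.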
On Wed, Jun 29, 2022 at 11:33 PM Arturo Bringas @.***> wrote:
I appreciate the quick response. This is the latest log:
act_arturo_b@cloudshell:~/data-science-on-gcp/04_streaming/transform (ds-on-gcp-353305)$ ./df07.py --project ds-on-gcp-353305 --bucket ${BUCKETNAME} --region us-central1
Correcting timestamps and writing to BigQuery dataset
/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/io/gcp/bigquery.py:2527: BeamDeprecationWarning: options is deprecated since First stable release. References to .options will not be supported
  temp_location = pcoll.pipeline.options.view_as(
/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/io/gcp/bigquery_file_loads.py:1129: BeamDeprecationWarning: options is deprecated since First stable release. References to .options will not be supported
  temp_location = p.options.view_as(GoogleCloudOptions).temp_location
warning: sdist: standard file not found: should have one of README, README.rst, README.txt, README.md
ERROR:apache_beam.runners.dataflow.dataflow_runner:Console URL: https://console.cloud.google.com/dataflow/jobs//2022-06-29_23_27_32-11374214288357084698?project=
Traceback (most recent call last):
  File "/home/act_arturo_b/data-science-on-gcp/04_streaming/transform/./df07.py", line 202, in <module>
    run(project=args['project'], bucket=args['bucket'], region=args['region'])
  File "/home/act_arturo_b/data-science-on-gcp/04_streaming/transform/./df07.py", line 177, in run
    (events
  File "/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/pipeline.py", line 598, in __exit__
    self.result.wait_until_finish()
  File "/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 1673, in wait_until_finish
    raise DataflowRuntimeException(
apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, Error:
Unable to open file: gs://ds-on-gcp-353305-dsongcp/flights/staging/ch04timecorr.1656570447.957722/pipeline.pb.
The problem is the same.
Any help would be appreciated.
Hi! I get the following log when I try to run df07.py.
./df07.py --project ${PROJECT} --bucket ${BUCKETNAME} --region us-central1
Correcting timestamps and writing to BigQuery dataset
/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/io/gcp/bigquery.py:2527: BeamDeprecationWarning: options is deprecated since First stable release. References to .options will not be supported
  temp_location = pcoll.pipeline.options.view_as(
/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/io/gcp/bigquery_file_loads.py:1129: BeamDeprecationWarning: options is deprecated since First stable release. References to .options will not be supported
  temp_location = p.options.view_as(GoogleCloudOptions).temp_location
warning: sdist: standard file not found: should have one of README, README.rst, README.txt, README.md
ERROR:apache_beam.runners.dataflow.dataflow_runner:Console URL: https://console.cloud.google.com/dataflow/jobs//2022-06-29_22_36_30-1790320629162913076?project=
Traceback (most recent call last):
  File "/home/act_arturo_b/data-science-on-gcp/04_streaming/transform/./df07.py", line 202, in <module>
    run(project=args['project'], bucket=args['bucket'], region=args['region'])
  File "/home/act_arturo_b/data-science-on-gcp/04_streaming/transform/./df07.py", line 177, in run
    (events
  File "/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/pipeline.py", line 598, in __exit__
    self.result.wait_until_finish()
  File "/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 1673, in wait_until_finish
    raise DataflowRuntimeException(
apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, Error:
Unable to open file: gs://BUCKETNAME/flights/staging/ch04timecorr.1656567385.996847/pipeline.pb.
Any suggestions would be appreciated. Thank you.