GoogleCloudPlatform / cloudml-samples

Cloud ML Engine repo. Please visit the new Vertex AI samples repo at https://github.com/GoogleCloudPlatform/vertex-ai-samples
https://cloud.google.com/ai-platform/docs/
Apache License 2.0

run-local script in Molecules raises "KeyError" #426

Closed AseiSugiyama closed 5 years ago

AseiSugiyama commented 5 years ago

Describe the bug

During the Preprocessing step, the './run-local' command raises the following error:

KeyError: u"<PUBCHEM_MMFF94_ENERGY> [while running 'Feature extraction/Count atoms']"

Complete logs are listed below.

What sample is this bug related to?

The molecules sample. I could not find any related issues.

Source code / logs

$ ./run-local
>> Extracting data
$ python data-extractor.py --work-dir /tmp/cloudml-samples/molecules --max-data-files 5
Found 5506 files, using 5
Extracting data files...
Found /tmp/cloudml-samples/molecules/data/00000001_00025000.sdf
Found /tmp/cloudml-samples/molecules/data/00025001_00050000.sdf
Found /tmp/cloudml-samples/molecules/data/00050001_00075000.sdf
Found /tmp/cloudml-samples/molecules/data/00075001_00100000.sdf
Found /tmp/cloudml-samples/molecules/data/00100001_00125000.sdf

>> Preprocessing
$ python preprocess.py --work-dir /tmp/cloudml-samples/molecules
2019-05-31 11:39:59.143779: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Traceback (most recent call last):
  File "preprocess.py", line 231, in <module>
    work_dir=args.work_dir)
  File "preprocess.py", line 193, in run
    | 'Write transformFn' >> transform_fn_io.WriteTransformFn(work_dir))
  File "/Users/asei/Documents/cloudml-samples/molecules/venv/lib/python2.7/site-packages/apache_beam/pipeline.py", line 410, in __exit__
    self.run().wait_until_finish()
  File "/Users/asei/Documents/cloudml-samples/molecules/venv/lib/python2.7/site-packages/apache_beam/pipeline.py", line 390, in run
    self.to_runner_api(), self.runner, self._options).run(False)
  File "/Users/asei/Documents/cloudml-samples/molecules/venv/lib/python2.7/site-packages/apache_beam/pipeline.py", line 403, in run
    return self.runner.run_pipeline(self)
  File "/Users/asei/Documents/cloudml-samples/molecules/venv/lib/python2.7/site-packages/apache_beam/runners/direct/direct_runner.py", line 134, in run_pipeline
    return runner.run_pipeline(pipeline)
  File "/Users/asei/Documents/cloudml-samples/molecules/venv/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 218, in run_pipeline
    return self.run_via_runner_api(pipeline.to_runner_api())
  File "/Users/asei/Documents/cloudml-samples/molecules/venv/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 221, in run_via_runner_api
    return self.run_stages(*self.create_stages(pipeline_proto))
  File "/Users/asei/Documents/cloudml-samples/molecules/venv/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 859, in run_stages
    pcoll_buffers, safe_coders).process_bundle.metrics
  File "/Users/asei/Documents/cloudml-samples/molecules/venv/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 970, in run_stage
    self._progress_frequency).process_bundle(data_input, data_output)
  File "/Users/asei/Documents/cloudml-samples/molecules/venv/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 1174, in process_bundle
    result_future = self._controller.control_handler.push(process_bundle)
  File "/Users/asei/Documents/cloudml-samples/molecules/venv/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 1054, in push
    response = self.worker.do_instruction(request)
  File "/Users/asei/Documents/cloudml-samples/molecules/venv/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 208, in do_instruction
    request.instruction_id)
  File "/Users/asei/Documents/cloudml-samples/molecules/venv/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 230, in process_bundle
    processor.process_bundle(instruction_id)
  File "/Users/asei/Documents/cloudml-samples/molecules/venv/lib/python2.7/site-packages/apache_beam/runners/worker/bundle_processor.py", line 289, in process_bundle
    op.start()
  File "apache_beam/runners/worker/operations.py", line 243, in apache_beam.runners.worker.operations.ReadOperation.start
  File "apache_beam/runners/worker/operations.py", line 244, in apache_beam.runners.worker.operations.ReadOperation.start
  File "apache_beam/runners/worker/operations.py", line 253, in apache_beam.runners.worker.operations.ReadOperation.start
  File "apache_beam/runners/worker/operations.py", line 175, in apache_beam.runners.worker.operations.Operation.output
  File "apache_beam/runners/worker/operations.py", line 85, in apache_beam.runners.worker.operations.ConsumerSet.receive
  File "apache_beam/runners/worker/operations.py", line 403, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/worker/operations.py", line 404, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/common.py", line 569, in apache_beam.runners.common.DoFnRunner.receive
  File "apache_beam/runners/common.py", line 577, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 602, in apache_beam.runners.common.DoFnRunner._reraise_augmented
  File "apache_beam/runners/common.py", line 575, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 352, in apache_beam.runners.common.SimpleInvoker.invoke_process
  File "apache_beam/runners/common.py", line 673, in apache_beam.runners.common._OutputProcessor.process_outputs
  File "apache_beam/runners/worker/operations.py", line 85, in apache_beam.runners.worker.operations.ConsumerSet.receive
  File "apache_beam/runners/worker/operations.py", line 403, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/worker/operations.py", line 404, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/common.py", line 569, in apache_beam.runners.common.DoFnRunner.receive
  File "apache_beam/runners/common.py", line 577, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 618, in apache_beam.runners.common.DoFnRunner._reraise_augmented
  File "apache_beam/runners/common.py", line 575, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 352, in apache_beam.runners.common.SimpleInvoker.invoke_process
  File "apache_beam/runners/common.py", line 651, in apache_beam.runners.common._OutputProcessor.process_outputs
  File "/Users/asei/Documents/cloudml-samples/molecules/pubchem/pipeline.py", line 152, in process
    label = float(molecule['<PUBCHEM_MMFF94_ENERGY>'][0])
KeyError: u"<PUBCHEM_MMFF94_ENERGY> [while running 'Feature extraction/Count atoms']"

To Reproduce

  1. Follow "Initial setup" in https://github.com/GoogleCloudPlatform/cloudml-samples/tree/master/molecules
  2. ./run-local
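Before running the pipeline, it can help to check whether the extracted files contain records that lack the energy tag. The sketch below is a standalone diagnostic, not part of the sample: the function name `count_records_missing_tag` and the abbreviated `SAMPLE` text are hypothetical. It relies only on two SDF conventions, namely that a record ends with a `$$$$` line and that a data item is introduced by a header line starting with `>` that contains the tag in angle brackets.

```python
ENERGY_TAG = '<PUBCHEM_MMFF94_ENERGY>'


def count_records_missing_tag(lines, tag=ENERGY_TAG):
    """Returns (total_records, records_without_tag) for an iterable of SDF lines.

    Hypothetical helper for illustration; it does not parse the molfile block,
    it only looks at data-item header lines such as '> <PUBCHEM_MMFF94_ENERGY>'.
    """
    total = 0
    missing = 0
    has_tag = False
    has_content = False
    for line in lines:
        stripped = line.strip()
        if stripped == '$$$$':
            # End of one record: tally it and reset the per-record state.
            total += 1
            if not has_tag:
                missing += 1
            has_tag = False
            has_content = False
            continue
        if stripped:
            has_content = True
        if line.startswith('>') and tag in line:
            has_tag = True
    if has_content:
        # A trailing record with no '$$$$' terminator still counts.
        total += 1
        if not has_tag:
            missing += 1
    return total, missing


# Abbreviated, made-up sample: two records, the second has no energy tag.
SAMPLE = """\
mol-1
> <PUBCHEM_COMPOUND_CID>
1

> <PUBCHEM_MMFF94_ENERGY>
37.801

$$$$
mol-2
> <PUBCHEM_COMPOUND_CID>
2

$$$$
"""

print(count_records_missing_tag(SAMPLE.splitlines()))
```

To scan a real file, pass an open file object instead of `SAMPLE.splitlines()`, for example one of the `.sdf` files listed under "Extracting data files..." in the log above.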

Expected behavior

$ ./run-local --max-data-files 3
>> Extracting data
$ python data-extractor.py --work-dir /tmp/cloudml-samples/molecules --max-data-files 3
Found 5506 files, using 3
Extracting data files...
Found /tmp/cloudml-samples/molecules/data/00000001_00025000.sdf
Found /tmp/cloudml-samples/molecules/data/00025001_00050000.sdf
Found /tmp/cloudml-samples/molecules/data/00050001_00075000.sdf

>> Preprocessing
$ python preprocess.py --work-dir /tmp/cloudml-samples/molecules
2019-05-31 13:53:46.360907: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
WARNING:root:Couldn't find python-snappy so the implementation of _TFRecordUtil._masked_crc32c is not as fast as it could be.

>> Training
$ python trainer/task.py --work-dir /tmp/cloudml-samples/molecules
2019-05-31 13:54:48.747390: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
WARNING:tensorflow:Export includes no default signature!

Model: /tmp/cloudml-samples/molecules/model/export/final/1559278491

>> Batch prediction
$ python predict.py --work-dir /tmp/cloudml-samples/molecules --model-dir /tmp/cloudml-samples/molecules/model/export/final/1559278491 batch --inputs-dir /tmp/cloudml-samples/molecules/data --outputs-dir /tmp/cloudml-samples/molecules/predictions
2019-05-31 13:54:54.111740: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
{"id": 100001, "predictions": [52.53765869140625]}
{"id": 100003, "predictions": [83.4570541381836]}
{"id": 100004, "predictions": [72.45764923095703]}
{"id": 100005, "predictions": [56.75579833984375]}
{"id": 100006, "predictions": [64.58434295654297]}
{"id": 100007, "predictions": [79.28802490234375]}
{"id": 100008, "predictions": [27.300537109375]}
{"id": 100010, "predictions": [33.57035827636719]}
{"id": 100015, "predictions": [35.49451446533203]}
{"id": 100016, "predictions": [33.07876205444336]}

System Information

To obtain the TensorFlow and TensorFlow Transform environment, run:

$ pip freeze |grep tensorflow
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.
tensorflow==1.8.0
tensorflow-transform==0.8.0
$ pip freeze |grep apache-beam
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.
apache-beam==2.5.0

Additional context

00000001_00025000.sdf contains a molecule record without the <PUBCHEM_MMFF94_ENERGY> tag. pipeline.py does not handle such records: line 152 indexes the tag unconditionally, which raises the KeyError shown above.
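One possible way to tolerate such records is to look the tag up defensively and skip molecules that have no usable label. The following is a minimal sketch under assumptions, not the sample's actual code or the maintainers' fix: `extract_label` and `labeled_molecules` are hypothetical names, and the only thing taken from the traceback is that a parsed molecule behaves like a dict mapping the tag to a list of strings.

```python
ENERGY_TAG = '<PUBCHEM_MMFF94_ENERGY>'


def extract_label(molecule, tag=ENERGY_TAG):
    """Returns the energy label as a float, or None if it is missing or invalid.

    Assumes `molecule` is a dict-like object whose values are lists of strings,
    as suggested by `molecule['<PUBCHEM_MMFF94_ENERGY>'][0]` in the traceback.
    """
    values = molecule.get(tag)
    if not values:
        # Tag absent, or present with an empty value list.
        return None
    try:
        return float(values[0])
    except (TypeError, ValueError):
        # Value present but not parseable as a number.
        return None


def labeled_molecules(molecules):
    """Yields (molecule, label) pairs, skipping molecules without a label."""
    for molecule in molecules:
        label = extract_label(molecule)
        if label is None:
            continue
        yield molecule, label


records = [
    {ENERGY_TAG: ['37.801']},  # complete record
    {},                        # record without the energy tag
]
for molecule, label in labeled_molecules(records):
    print(label)
```

Skipping is only one option. Since the energy is the training label here, dropping unlabeled records seems more reasonable than substituting a default value, but whether to also log or count the skipped records is a design choice left open in this sketch.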

nnegrey commented 5 years ago

Hi, thanks for filing the issue. Let me get someone to look into it. :)

@davidcavazos

Hey David, mind taking a look at this one?

davidcavazos commented 5 years ago

I was able to reproduce this. I'm looking into it.

AseiSugiyama commented 5 years ago

@nnegrey @davidcavazos

Hi, thank you for your great work. I'm looking forward to PR https://github.com/GoogleCloudPlatform/cloudml-samples/pull/427 being merged.

nnegrey commented 5 years ago

Thanks @davidcavazos.

@AseiSugiyama going to close this now, but please feel free to reopen if the issue persists or open a new issue if you find something else. Thanks for filing this! :)