running machine_learning have an error

tx1994108 commented 5 years ago

Hello, I am running machine_learning and get the following error:

Traceback (most recent call last):
  File "/media/tx/87f7fdbb-a4f7-4bc3-bd3c-8ebaa2867088/home/tx/project/human-activity/human_activity_detection/src/machine_learning.py", line 196, in <module>
    main()
  File "/media/tx/87f7fdbb-a4f7-4bc3-bd3c-8ebaa2867088/home/tx/project/human-activity/human_activity_detection/src/machine_learning.py", line 192, in main
    tf.app.run(main=training_and_testing, argv=None)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/media/tx/87f7fdbb-a4f7-4bc3-bd3c-8ebaa2867088/home/tx/project/human-activity/human_activity_detection/src/machine_learning.py", line 179, in training_and_testing
    tf.estimator.train_and_evaluate(model, train_spec, eval_spec)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/training.py", line 430, in train_and_evaluate
    executor.run_local()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/training.py", line 609, in run_local
    hooks=train_hooks)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 302, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 711, in _train_model
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 694, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/media/tx/87f7fdbb-a4f7-4bc3-bd3c-8ebaa2867088/home/tx/project/human-activity/human_activity_detection/src/machine_learning.py", line 96, in model_fn
    final_logits = neural_network(features, params)
  File "/media/tx/87f7fdbb-a4f7-4bc3-bd3c-8ebaa2867088/home/tx/project/human-activity/human_activity_detection/src/machine_learning.py", line 72, in neural_network
    input_layer = tf.feature_column.input_layer(features, params['feature_columns'])
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/feature_column/feature_column.py", line 220, in input_layer
    None, default_name='input_layer', values=features.values()):
AttributeError: 'BatchDataset' object has no attribute 'values'

How can I solve this? Thank you!
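
A note for readers who hit the same traceback: the last frame shows input_layer calling features.values(), so it expects features to be a dict-like mapping of column name to tensor, but it received a BatchDataset. The sketch below reproduces that mismatch in plain Python, without TensorFlow. FakeBatchDataset, collect_feature_values, describe_features and the column names acc_x / acc_y are hypothetical stand-ins, not TensorFlow or project names. One possible cause, stated here only as an assumption, is an older TensorFlow release whose Estimator hands the object returned by input_fn straight to model_fn instead of iterating the dataset.

```python
class FakeBatchDataset:
    """Hypothetical stand-in for a dataset object: iterable, but not a dict."""

    def __init__(self, batches):
        self._batches = batches

    def __iter__(self):
        return iter(self._batches)


def collect_feature_values(features):
    """Mimic the failing call from the traceback: features.values()."""
    return list(features.values())


def describe_features(features):
    """Return a short diagnosis of what was passed as `features`."""
    if hasattr(features, "values"):
        count = len(collect_feature_values(features))
        return "dict-like: {} columns".format(count)
    return "not dict-like: {}".format(type(features).__name__)


# Hypothetical column names, used only for illustration.
good = {"acc_x": [0.1, 0.2], "acc_y": [0.3, 0.4]}
bad = FakeBatchDataset([good])

print(describe_features(good))
print(describe_features(bad))

try:
    collect_feature_values(bad)
except AttributeError as err:
    # Same shape of error as in the traceback, raised by the stand-in class.
    print(err)
```

Printing type(features) at the top of model_fn in the real script would give the same kind of diagnosis on the actual object.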

liushiyu1994 commented 5 years ago

Sorry, I forgot to move the feature file to the new directory. I have pushed a fix, so please check whether it is solved. Thanks!

tx1994108 commented 5 years ago

Thanks for your reply. I still get the same error, and this message also appears: failed to allocate 1.71G (1839704320 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY

My device is a TITAN XP.

liushiyu1994 commented 5 years ago

Sorry, I really don't know what causes the attribute error. My guess is that it comes from the Pandas version? My current pandas version is 0.23.3, tensorflow is 1.13.1 and python is 3.6.

BTW, for the memory allocation problem, I guess it is due to running too many processes in parallel mode. Maybe you can try single mode by commenting out the "parallel_main()" line and uncommenting the "main()" line. Also, remember to delete all model files under the nn_model directory after a failed run. Hope this helps!
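
One way to test the version guess is to compare the local environment against the versions quoted above (pandas 0.23.3, tensorflow 1.13.1, python 3.6). The following is a minimal sketch using only the standard library. The names REFERENCE, parse_version and find_mismatches are hypothetical helpers, not part of the repository, and the only installed version taken from this thread is Python 3.5, which appears in the traceback paths.

```python
import sys

# Versions quoted by the maintainer in this thread.
REFERENCE = {"pandas": "0.23.3", "tensorflow": "1.13.1", "python": "3.6"}


def parse_version(text):
    """Turn '1.13.1' into (1, 13, 1); non-numeric parts are ignored."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def find_mismatches(installed, reference=REFERENCE):
    """Return {name: (installed, expected)} for versions that differ.

    Only as many components as the reference gives are compared, so
    '3.6.9' counts as matching a reference of '3.6'.
    """
    mismatches = {}
    for name, expected in reference.items():
        have = installed.get(name)
        if have is None:
            continue  # package not reported, nothing to compare
        want = parse_version(expected)
        if parse_version(have)[:len(want)] != want:
            mismatches[name] = (have, expected)
    return mismatches


# Check the interpreter running this script.
local = {"python": "{}.{}".format(*sys.version_info[:2])}
print(find_mismatches(local))

# The traceback paths show python3.5; the reporter's pandas and
# tensorflow versions are not stated in the thread, so they are omitted.
print(find_mismatches({"python": "3.5"}))
```

To include the packages, add pandas.__version__ and tensorflow.__version__ to the dict after importing them. A reported mismatch only narrows the search; it does not confirm that the version difference causes the AttributeError.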

liushiyu1994 / human_activity_detection

running machine_learning have an error #1