jsyoon0823 / GAIN

Codebase for Generative Adversarial Imputation Networks (GAIN) - ICML 2018

My dataset is 203454 KB and I can't get the imputed dataset. Is it because my dataset is too big? It gives some errors. #25

Closed ghost closed 3 years ago

ghost commented 3 years ago

The error is as follows; can you help me? Thank you very much!

Resource exhausted: OOM when allocating tensor with shape[12241,16996] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1320, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1408, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[12241,16996] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
  [[{{node concat}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[Sigmoid/_29]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main_letter_spam.py", line 96, in <module>
    imputed_data, rmse = main(args)
  File "main_letter_spam.py", line 45, in main
    imputed_data_x = gain(miss_data_x, gain_parameters)
  File "/lxt/gain/GAIN/gain.py", line 169, in gain
    imputed_data = sess.run([G_sample], feed_dict = {X: X_mb, M: M_mb})[0]
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 930, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1153, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1329, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1349, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[12241,16996] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
  [[node concat (defined at /lxt/gain/GAIN/gain.py:94) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[Sigmoid/_29]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Errors may have originated from an input operation.
Input Source operations connected to node concat:
  Placeholder (defined at /lxt/gain/GAIN/gain.py:59)
  Placeholder_1 (defined at /lxt/gain/GAIN/gain.py:61)

Original stack trace for 'concat':
  File "main_letter_spam.py", line 96, in <module>
    imputed_data, rmse = main(args)
  File "main_letter_spam.py", line 45, in main
    imputed_data_x = gain(miss_data_x, gain_parameters)
  File "/lxt/gain/GAIN/gain.py", line 113, in gain
    G_sample = generator(X, M)
  File "/lxt/gain/GAIN/gain.py", line 94, in generator
    inputs = tf.concat(values = [x, m], axis = 1)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/array_ops.py", line 1271, in concat
    return gen_array_ops.concat_v2(values=values, axis=axis, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 1217, in concat_v2
    "ConcatV2", values=values, axis=axis, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 800, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3479, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1961, in __init__

jsyoon0823 commented 3 years ago

Thanks for your interest in our paper.

It seems your dataset is too big, which is why you got an OOM (out-of-memory) error. In that case, you need to use batch training and inference: when you train and run inference, divide the entire dataset into batches so that less memory is used at any one time. That way you can avoid the OOM error; see the sketch below for the inference side.
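For the step that actually fails in your traceback (the single sess.run over the whole dataset at gain.py line 169), batched imputation could look roughly like the sketch below. This is only a sketch, not code from this repository: the names G_sample, X, M, norm_data_x, data_m, uniform_sampler and sess are assumed to match those in gain.py, and the batch size of 1024 is an arbitrary choice.

```python
import numpy as np

def impute_in_batches(sess, G_sample, X, M, norm_data_x, data_m,
                      uniform_sampler, batch_size=1024):
  """Run the generator on slices of the data so only one slice sits on the GPU."""
  no, dim = norm_data_x.shape
  chunks = []
  for start in range(0, no, batch_size):
    end = min(start + batch_size, no)
    M_mb = data_m[start:end, :]
    X_mb = norm_data_x[start:end, :]
    # Mirror the feed construction in gain.py: random noise in the missing entries.
    Z_mb = uniform_sampler(0, 0.01, end - start, dim)
    X_mb = M_mb * X_mb + (1 - M_mb) * Z_mb
    chunks.append(sess.run(G_sample, feed_dict={X: X_mb, M: M_mb}))
  return np.concatenate(chunks, axis=0)

# In place of: imputed_data = sess.run([G_sample], feed_dict = {X: X_mb, M: M_mb})[0]
# imputed_data = impute_in_batches(sess, G_sample, X, M, norm_data_x, data_m, uniform_sampler)
```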

You can easily find more details about batch training and inference on Google.
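Separately, if you want to follow the hint printed in the error message and see which tensors are allocated when the OOM happens, you can pass a RunOptions proto to sess.run. A minimal TF 1.x sketch, again assuming the gain.py names:

```python
import tensorflow as tf

# Ask TensorFlow to report live tensor allocations if an OOM occurs during this run.
run_options = tf.RunOptions(report_tensor_allocations_upon_oom=True)
imputed_data = sess.run(G_sample,
                        feed_dict={X: X_mb, M: M_mb},
                        options=run_options)
```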

Thanks!
