tensorflow / tensorflow

An Open Source Machine Learning Framework for Everyone
https://tensorflow.org
Apache License 2.0
186.63k stars 74.35k forks source link

ran out of memory trying to allocate #35264

Closed liuxingbaoyu closed 4 years ago

liuxingbaoyu commented 4 years ago

The script below uses more than 30 GB of memory. This happens with tensorflow, tensorflow-gpu, and tf-nightly.

Code:

import tensorflow as tf
from tensorflow import keras

(x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data()

x_train = x_train.reshape((-1,28,28,1))
x_test = x_test.reshape((-1,28,28,1))

y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

input_shape=(224, 224, 3)
inputs=tf.keras.layers.Input(shape=input_shape)

x = tf.keras.layers.Flatten()(inputs)
outputs = tf.keras.layers.Dense(10, activation='softmax')(x)
model = tf.keras.models.Model(inputs=inputs, outputs=outputs)

model.compile(optimizer=keras.optimizers.Adam(),
              loss="categorical_crossentropy",
              metrics=['accuracy'])

x_train=tf.image.resize(x_train,input_shape[:2])
x_train=tf.image.grayscale_to_rgb(x_train)

x_train=x_train[:128]
y_train=y_train[:128]
model.fit(x=x_train,y=y_train,batch_size=1)

2019-12-19 22:41:47.467474: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2019-12-19 22:41:52.813348: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2019-12-19 22:41:52.851093: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.755 pciBusID: 0000:41:00.0
2019-12-19 22:41:52.851257: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-12-19 22:41:52.851712: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-12-19 22:41:53.319561: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2019-12-19 22:41:53.323650: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.755 pciBusID: 0000:41:00.0
2019-12-19 22:41:53.323795: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-12-19 22:41:53.324400: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-12-19 22:41:53.989669: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-12-19 22:41:53.989780: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2019-12-19 22:41:53.989838: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2019-12-19 22:41:53.990709: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9530 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:41:00.0, compute capability: 7.5)
2019-12-19 22:41:54.080141: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 12042240000 exceeds 10% of system memory.
2019-12-19 22:42:15.504057: W tensorflow/core/common_runtime/bfc_allocator.cc:419] Allocator (GPU_0_bfc) ran out of memory trying to allocate 11.21GiB (rounded to 12042240000). Current allocation summary follows.
2019-12-19 22:42:15.504241: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (256): Total Chunks: 3, Chunks in use: 3. 768B allocated for chunks. 768B in use in bin. 48B client-requested in use in bin.
2019-12-19 22:42:15.504381: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (512): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-12-19 22:42:15.504525: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (1024): Total Chunks: 1, Chunks in use: 1. 1.3KiB allocated for chunks. 1.3KiB in use in bin. 1.0KiB client-requested in use in bin.
2019-12-19 22:42:15.504675: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (2048): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-12-19 22:42:15.504821: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (4096): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-12-19 22:42:15.504964: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (8192): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-12-19 22:42:15.505108: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (16384): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-12-19 22:42:15.505252: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (32768): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-12-19 22:42:15.505421: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (65536): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-12-19 22:42:15.505631: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (131072): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-12-19 22:42:15.505849: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (262144): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-12-19 22:42:15.506074: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (524288): Total Chunks: 1, Chunks in use: 0. 1022.0KiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-12-19 22:42:15.506468: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (1048576): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-12-19 22:42:15.506957: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (2097152): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-12-19 22:42:15.507273: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (4194304): Total Chunks: 1, Chunks in use: 1. 5.74MiB allocated for chunks. 5.74MiB in use in bin. 5.74MiB client-requested in use in bin.
2019-12-19 22:42:15.507582: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (8388608): Total Chunks: 3, Chunks in use: 0. 26.26MiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-12-19 22:42:15.507965: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (16777216): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-12-19 22:42:15.508284: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (33554432): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-12-19 22:42:15.508723: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (67108864): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-12-19 22:42:15.520336: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (134217728): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-12-19 22:42:15.520521: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (268435456): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-12-19 22:42:15.520749: I tensorflow/core/common_runtime/bfc_allocator.cc:885] Bin for 11.21GiB was 256.00MiB, Chunk State:
2019-12-19 22:42:15.521083: I tensorflow/core/common_runtime/bfc_allocator.cc:898] Next region of size 1048576
2019-12-19 22:42:15.521203: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 000000020FC00000 next 1 of size 1280
2019-12-19 22:42:15.521394: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 000000020FC00500 next 4 of size 256
2019-12-19 22:42:15.521591: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 000000020FC00600 next 7 of size 256
2019-12-19 22:42:15.521787: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 000000020FC00700 next 8 of size 256
2019-12-19 22:42:15.521961: I tensorflow/core/common_runtime/bfc_allocator.cc:905] Free at 000000020FC00800 next 18446744073709551615 of size 1046528
2019-12-19 22:42:15.522163: I tensorflow/core/common_runtime/bfc_allocator.cc:898] Next region of size 8388608
2019-12-19 22:42:15.522353: I tensorflow/core/common_runtime/bfc_allocator.cc:905] Free at 000000020FE00000 next 18446744073709551615 of size 8388608
2019-12-19 22:42:15.522590: I tensorflow/core/common_runtime/bfc_allocator.cc:898] Next region of size 8388608
2019-12-19 22:42:15.522763: I tensorflow/core/common_runtime/bfc_allocator.cc:905] Free at 0000000210600000 next 18446744073709551615 of size 8388608
2019-12-19 22:42:15.523051: I tensorflow/core/common_runtime/bfc_allocator.cc:898] Next region of size 16777216
2019-12-19 22:42:15.523236: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 0000000210E00000 next 6 of size 6021120
2019-12-19 22:42:15.523468: I tensorflow/core/common_runtime/bfc_allocator.cc:905] Free at 00000002113BE000 next 18446744073709551615 of size 10756096
2019-12-19 22:42:15.523719: I tensorflow/core/common_runtime/bfc_allocator.cc:914] Summary of in-use Chunks by size:
2019-12-19 22:42:15.523923: I tensorflow/core/common_runtime/bfc_allocator.cc:917] 3 Chunks of size 256 totalling 768B
2019-12-19 22:42:15.524051: I tensorflow/core/common_runtime/bfc_allocator.cc:917] 1 Chunks of size 1280 totalling 1.3KiB
2019-12-19 22:42:15.524201: I tensorflow/core/common_runtime/bfc_allocator.cc:917] 1 Chunks of size 6021120 totalling 5.74MiB
2019-12-19 22:42:15.524454: I tensorflow/core/common_runtime/bfc_allocator.cc:921] Sum Total of in-use chunks: 5.74MiB
2019-12-19 22:42:15.524644: I tensorflow/core/common_runtime/bfc_allocator.cc:923] total_region_allocatedbytes: 34603008 memorylimit: 9993660007 available bytes: 9959056999 curr_region_allocationbytes: 33554432
2019-12-19 22:42:15.524974: I tensorflow/core/common_runtime/bfc_allocator.cc:929] Stats: Limit: 9993660007 InUse: 6023168 MaxInUse: 22799872 NumAllocs: 20 MaxAllocSize: 8388608
2019-12-19 22:42:15.525331: W tensorflow/core/common_runtime/bfc_allocator.cc:424] *__**__

ymodak commented 4 years ago

You can try enabling GPU memory growth by putting the following snippet at the top of your code. If using TF 2.X:

import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(gpus[0], True)

For TF 1.X

import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)

liuxingbaoyu commented 4 years ago

Thanks. I tried tf.config.experimental.set_memory_growth(tf.config.experimental.list_physical_devices('GPU')[0], True), but it still gives an error, and the same problem also occurs with the non-GPU versions.

liuxingbaoyu commented 4 years ago

When I use the CPU version, it does not crash, but it fills up all of my memory (more than 30 GB) and hangs.

liuxingbaoyu commented 4 years ago

Sorry, it's because x_train = tf.image.resize(x_train, input_shape[:2]) followed by x_train = tf.image.grayscale_to_rgb(x_train) takes up too much memory. Thank you very much.
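The numbers in the log are consistent with this explanation. Below is a rough back-of-the-envelope check, assuming that tf.image.resize returns float32 (4 bytes per element) and that the resize runs on all 60,000 Fashion-MNIST training images before the [:128] slice, as in the original script. The helper name tensor_bytes is made up for illustration.

```python
from math import prod


def tensor_bytes(shape, bytes_per_element=4):
    """Size in bytes of a dense tensor; 4 bytes per element assumes float32."""
    return prod(shape) * bytes_per_element


# All 60,000 training images resized to 224x224, still one channel.
resized = tensor_bytes((60000, 224, 224, 1))
# The same images after grayscale_to_rgb: three channels.
rgb = tensor_bytes((60000, 224, 224, 3))
# Only the 128 samples that model.fit actually uses, in RGB.
subset = tensor_bytes((128, 224, 224, 3))
# The Dense layer's kernel: 224*224*3 inputs, 10 outputs.
dense_kernel = tensor_bytes((224 * 224 * 3, 10))

print(resized)         # 12042240000, the allocation size reported in the log
print(rgb / 2**30)     # roughly 33.6 GiB
print(subset / 2**20)  # 73.5 MiB
print(dense_kernel)    # 6021120, the single large in-use chunk in the log
```

The failed request is a single 11.21 GiB tensor, which is larger than the 9530 MB TensorFlow device reported in the log, so changing how GPU memory is reserved cannot make it fit. The subset that is actually trained on would need well under 100 MiB.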

dain5832 commented 4 years ago

I'm having the same error. How did you solve it?

liuxingbaoyu commented 4 years ago

Reduce the number of samples in memory.

dain5832 commented 4 years ago

Umm, can you show me the code?
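No code was posted in reply. A minimal sketch of what "reduce the number of samples in memory" could look like for the script in this issue, assuming TF 2.x: slice the raw 28x28 arrays first, then let tf.data resize the images batch by batch so the full resized array never exists. The zero-filled arrays are stand-ins for the real dataset, and the name preprocess and the batch size of 8 are illustrative choices, not something taken from the thread.

```python
import numpy as np
import tensorflow as tf

INPUT_SHAPE = (224, 224, 3)
NUM_SAMPLES = 128  # same subset size as the original script

# Stand-in arrays with the Fashion-MNIST layout (uint8 images, integer labels).
# Swap in keras.datasets.fashion_mnist.load_data() to use the real data.
x_train = np.zeros((1000, 28, 28), dtype=np.uint8)
y_train = np.zeros((1000,), dtype=np.int64)

# Slice BEFORE any resizing, while the images are still small.
x_small = x_train[:NUM_SAMPLES].reshape((-1, 28, 28, 1))
y_small = tf.keras.utils.to_categorical(y_train[:NUM_SAMPLES], 10)


def preprocess(image, label):
    """Resize one 28x28x1 image to 224x224 and expand it to three channels."""
    image = tf.image.resize(image, INPUT_SHAPE[:2])
    image = tf.image.grayscale_to_rgb(image)
    return image, label


# Images are resized lazily, one batch at a time, as the model consumes them.
dataset = (
    tf.data.Dataset.from_tensor_slices((x_small, y_small))
    .map(preprocess)
    .batch(8)
)

inputs = tf.keras.layers.Input(shape=INPUT_SHAPE)
x = tf.keras.layers.Flatten()(inputs)
outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
model = tf.keras.models.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

model.fit(dataset, epochs=1)
```

If only the first 128 samples are needed, simply moving the two [:128] slices above the tf.image.resize call in the original script should have a similar effect. The tf.data version also scales to the full training set, because only one batch of resized images is held at a time.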

GauravRajwada commented 4 years ago

No, enabling memory growth doesn't resolve it.

marciowb commented 3 years ago

I'm using TensorFlow 2.4.x. My notebook has an NVIDIA GeForce 920M (2 GB RAM). I tried set_memory_growth, but it didn't work. I also tried limiting memory to 1 GB, which didn't work either. Then I limited memory to 1.5 GB, and it worked.

import tensorflow as tf

def limitgpu(maxmem):
    """Cap the memory TensorFlow may allocate on each GPU (maxmem is in MB)."""
    gpus = tf.config.list_physical_devices('GPU')
    if gpus:
        # Restrict TensorFlow to only allocate a fraction of GPU memory
        try:
            for gpu in gpus:
                tf.config.experimental.set_virtual_device_configuration(gpu,
                        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=maxmem)])
        except RuntimeError as e:
            # Virtual devices must be set before GPUs have been initialized
            print(e)

# 1.5GB (memory_limit is given in MB)
limitgpu(1024+512)

tommydino93 commented 1 year ago

What worked for me (with TF 2.4) was changing the data loading of the tf.data.Dataset. Specifically, I switched from using from_tensor_slices to using from_generator. I am tackling semantic segmentation with volumes of shape 64x64x64. Here's some pseudo code:

import tensorflow as tf

input_volumes_list = [...]  # list containing the input volumes that have shape 64x64x64
input_masks_list = [...]  # list containing the corresponding segmentation masks also of shape 64x64x64

# define generator function
def generator_images_and_masks():
    for idx in range(len(input_volumes_list)):
        # extract one image and the corresponding mask
        img = input_volumes_list[idx]
        mask = input_masks_list[idx]

        # convert to TF tensors
        img_tensor = tf.convert_to_tensor(img, dtype=tf.float32)
        mask_tensor = tf.convert_to_tensor(mask, dtype=tf.float32)

        yield img_tensor, mask_tensor

# create dataset using generator function and specifying shapes and dtypes
dataset = tf.data.Dataset.from_generator(generator_images_and_masks,
                                         output_signature=(tf.TensorSpec(shape=(64, 64, 64), dtype=tf.float32),
                                                           tf.TensorSpec(shape=(64, 64, 64), dtype=tf.float32)))

Havi-muro commented 1 year ago

Is "input_volumes_list" a list of paths to your volumes, or have the volumes already been read into memory?
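The thread does not answer this question. If the volumes are too large to hold in memory together, one option is to keep only file paths in the list and read each volume inside the generator. The sketch below assumes each volume and mask is stored as a .npy file of shape 64x64x64. The names load_pairs, make_dataset, volume_paths and mask_paths are hypothetical and do not come from the comment above, and the zero-filled files exist only to make the example runnable.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def load_pairs(volume_paths, mask_paths):
    """Yield (volume, mask) pairs, reading one pair from disk at a time."""
    for volume_path, mask_path in zip(volume_paths, mask_paths):
        volume = np.load(volume_path).astype(np.float32)
        mask = np.load(mask_path).astype(np.float32)
        yield volume, mask


def make_dataset(volume_paths, mask_paths, batch_size=2):
    """Wrap the lazy loader in a tf.data.Dataset with fixed shapes and dtypes."""
    spec = tf.TensorSpec(shape=(64, 64, 64), dtype=tf.float32)
    return tf.data.Dataset.from_generator(
        lambda: load_pairs(volume_paths, mask_paths),
        output_signature=(spec, spec),
    ).batch(batch_size)


# Tiny demo with placeholder data written to a temporary directory.
tmp_dir = tempfile.mkdtemp()
volume_paths, mask_paths = [], []
for i in range(4):
    volume_file = os.path.join(tmp_dir, f"volume_{i}.npy")
    mask_file = os.path.join(tmp_dir, f"mask_{i}.npy")
    np.save(volume_file, np.zeros((64, 64, 64), dtype=np.float32))
    np.save(mask_file, np.zeros((64, 64, 64), dtype=np.float32))
    volume_paths.append(volume_file)
    mask_paths.append(mask_file)

dataset = make_dataset(volume_paths, mask_paths)
for volumes, masks in dataset.take(1):
    print(volumes.shape, masks.shape)
```

With this layout only the volumes of the current batch are in memory at any time, at the cost of re-reading the files from disk on every epoch.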