intel / intel-extension-for-tensorflow

Intel® Extension for TensorFlow*

Unable to use progressive memory growth #47

Closed: hydra324 closed this issue 9 months ago

hydra324 commented 1 year ago

Howdy,

I'm currently using an Intel Data Center Max 1100 GPU, and I want TensorFlow to grow GPU memory usage progressively by setting the environment variable TF_FORCE_GPU_ALLOW_GROWTH=true, as mentioned in the TensorFlow docs. However, when I check memory usage with sysmon, it still allocates almost all of the GPU memory to my process. I realize this configuration is platform specific, which is why I would like to know if there is a way to restrict GPU memory usage on Intel GPUs. Thank you!
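
For reference, a minimal sketch of the setup described above; the variable is set before TensorFlow is imported, and the actual model code is omitted:

```python
import os

# TF_FORCE_GPU_ALLOW_GROWTH must be set before TensorFlow initializes the device.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

import tensorflow as tf

# With ITEX installed, the Intel GPU is expected to show up in the device list.
print(tf.config.list_physical_devices())
```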

Regards, hydra324

YuningQiu commented 1 year ago

Hi, may I know which version of TensorFlow you are using, and whether you are using Intel Optimization for TensorFlow?

YuningQiu commented 1 year ago

Also, did you try tf.config.experimental.set_memory_growth(device, enable), documented at https://www.tensorflow.org/api_docs/python/tf/config/experimental/set_memory_growth?
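
For example, something along these lines (this assumes ITEX exposes the Intel GPU as an 'XPU' device; adjust the device type string if it appears differently on your system):

```python
import tensorflow as tf

# Assumption: ITEX registers the Intel GPU as an 'XPU' physical device.
for dev in tf.config.list_physical_devices('XPU'):
    # Must be called before the device is initialized by the TensorFlow runtime.
    tf.config.experimental.set_memory_growth(dev, True)
```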

guizili0 commented 1 year ago

@hydra324 please try to use "ITEX_LIMIT_MEMORY_SIZE_IN_MB"
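
A minimal sketch; the 4096 MB value is only illustrative, and the variable needs to be set before TensorFlow/ITEX initializes the device:

```python
import os

# Illustrative value: limit each device allocation to 4096 MB.
os.environ["ITEX_LIMIT_MEMORY_SIZE_IN_MB"] = "4096"

import tensorflow as tf  # import after setting the variable
```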

hydra324 commented 1 year ago

> @hydra324 please try to use "ITEX_LIMIT_MEMORY_SIZE_IN_MB"

Thank you! It works. Is there a similar env variable for letting memory usage grow as needed, instead of setting an upper bound?

guizili0 commented 1 year ago

This is not an upper bound; it just limits the size of each allocation. If you need more memory, it will allocate another block of the limit size, as long as enough memory is available.

hydra324 commented 1 year ago

Thank you, that answers my question about progressive allocation. Is there also a way to restrict total memory usage to a certain size?

guizili0 commented 1 year ago

@hydra324 we do not limit total memory usage; otherwise the workload would report OOM once memory usage reached the limit, and it still could not run.