nasaharvest / dora

Domain-agnostic Outlier Ranking Algorithms (DORA) - SMD cross-divisional use case demonstration of AI/ML
MIT License
10 stars · 3 forks

Add PAE ranking and image directory data loader #39

Closed · bdubayah closed this 3 years ago

bdubayah commented 3 years ago

Hi @hannah-rae and @stevenlujpl, the PAE ranking modules are now finalized and ready for review. I edited the top comment to include all the major changes. Let me know if there are any questions or changes needed!

stevenlujpl commented 3 years ago

@bdubayah Thanks for making the changes, and they all look great to me.

I still have some questions:

  1. If I run the PAE script, will the training process take place on CPU or GPU?
  2. This problem may already be resolved in tensorflow 2, but when I used tensorflow 1 a long time ago, it would hold on to all the available GPU memory on a machine (even if it didn't really need it). This behavior is annoying, especially on a shared machine. If this isn't resolved in tensorflow 2, can you please configure tensorflow to use only the memory it needs?
stevenlujpl commented 3 years ago

This is how I limited the GPU memory usage in tensorflow 1.x:

import tensorflow as tf

# Tell the TF 1.x session to allocate GPU memory incrementally as needed
# instead of reserving all available memory up front.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config, ...)
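
If it turns out tensorflow 2 still needs this, I believe the equivalent there is per-GPU memory growth. A rough sketch (untested against this branch):

import tensorflow as tf

# Enable memory growth for every visible GPU so tensorflow only allocates
# what it needs. This has to run before the GPUs are initialized
# (i.e. before the first GPU op).
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)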
stevenlujpl commented 3 years ago

Another option is to restrict the visible GPUs to a single GPU.
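
For example, something like this should work in tensorflow 2 (just a sketch; setting CUDA_VISIBLE_DEVICES also works if it is set before tensorflow initializes the GPUs):

import tensorflow as tf

# Restrict tensorflow to the first physical GPU; the other GPUs stay
# invisible to the process, so their memory is never touched.
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    tf.config.set_visible_devices(gpus[0], 'GPU')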

bdubayah commented 3 years ago

Good catch, I didn't know that tensorflow did that. It will run on GPUs if available. Maybe we could default to a single GPU with memory growth enabled, and then add config options to allow multiple GPUs and to enable/disable growth? Or an option to disable GPUs entirely?
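
Something along these lines is what I have in mind. Just a sketch, and the use_gpu / num_gpus / allow_growth option names are made up here; they would need to match however the DORA config ends up being structured:

import tensorflow as tf

def configure_gpus(use_gpu=True, num_gpus=1, allow_growth=True):
    # Hypothetical helper, called once before any tensorflow ops run.
    gpus = tf.config.list_physical_devices('GPU')

    if not use_gpu or not gpus:
        # Hide all GPUs so training falls back to the CPU.
        tf.config.set_visible_devices([], 'GPU')
        return

    # Default to a single GPU; optionally expose more.
    visible = gpus[:num_gpus]
    tf.config.set_visible_devices(visible, 'GPU')

    # Optionally allocate GPU memory incrementally instead of all at once.
    for gpu in visible:
        tf.config.experimental.set_memory_growth(gpu, allow_growth)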