quic / aimet

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
https://quic.github.io/aimet-pages/index.html

Error while using AIMET Visualization for Quantization for TensorFlow API #2542

Open sandeep1404 opened 1 year ago

sandeep1404 commented 1 year ago

Hi, I am trying to use the AIMET Visualization for Quantization TensorFlow API inside Docker. I followed the code from the API documentation at https://quic.github.io/aimet-pages/AimetDocs/api_docs/tensorflow_visualization_quantization.html, but when I call the visualizing_weight_ranges_for_single_layer() function I get the following error:

Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.8/dist-packages/aimet_common/utils.py", line 278, in start_bokeh_server
    server.run_until_shutdown()
  File "/usr/local/lib/python3.8/dist-packages/bokeh/server/server.py", line 184, in run_until_shutdown
    self._loop.start()
  File "/usr/local/lib/python3.8/dist-packages/tornado/platform/asyncio.py", line 195, in start
    self.asyncio_loop.run_forever()
  File "/usr/lib/python3.8/asyncio/base_events.py", line 560, in run_forever
    self._check_running()
  File "/usr/lib/python3.8/asyncio/base_events.py", line 552, in _check_running
    raise RuntimeError('This event loop is already running')
RuntimeError: This event loop is already running
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[11], line 1
----> 1 visualizing_weight_ranges_for_single_layer()

Cell In[9], line 16, in visualizing_weight_ranges_for_single_layer()
     14 visualization_url = start_bokeh_server_session(port=8989)
     15 # print(visualization_url)
---> 16 plotting_utils.visualize_weight_ranges_single_layer(sess=sess, layer=conv_op, visualization_url=visualization_url)

TypeError: visualize_weight_ranges_single_layer() got an unexpected keyword argument 'visualization_url'

My visualizing_weight_ranges_for_single_layer() function is as follows:

def visualizing_weight_ranges_for_single_layer():
    # load a model
    tf.keras.backend.clear_session()
    _ = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
    sess = tf.compat.v1.Session()

    with sess.as_default():
        sess.run(tf.global_variables_initializer())

        # Get a layer to visualize its weight ranges
        conv_op = sess.graph.get_operation_by_name('conv1_conv/Conv2D')
        #conv_op= 'conv1_conv/Conv2D'
        # Start a Bokeh server on port 8989
        visualization_url = start_bokeh_server_session(port=8989)
        # print(visualization_url)
        plotting_utils.visualize_weight_ranges_single_layer(sess=sess, layer=conv_op, visualization_url=visualization_url)
    sess.close()

I am using TensorFlow version 2.10.1. I am not sure how to correct this error. Could anyone kindly look into this issue and help me resolve it? Thank you in advance.
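The TypeError indicates that the installed visualize_weight_ranges_single_layer() does not accept a visualization_url keyword. One way to see which keywords a function accepts is inspect.signature from the standard library. The sketch below is only an illustration: visualize_stub is a hypothetical stand-in, not the real AIMET function, and its parameter names are an assumption based on the sess, layer, and results_dir keywords used later in this thread. In practice, accepts_keyword would be called on plotting_utils.visualize_weight_ranges_single_layer itself.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    # Accepted if the parameter is named explicitly and can be passed by keyword
    if name in params and params[name].kind in keyword_kinds:
        return True
    # Also accepted if the function takes arbitrary **kwargs
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-in for the AIMET function (assumed parameter names)
def visualize_stub(sess, layer, results_dir):
    return results_dir


print(accepts_keyword(visualize_stub, "visualization_url"))
print(accepts_keyword(visualize_stub, "results_dir"))
```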

quic-hitameht commented 1 year ago

Please make sure to disable eager execution before using the aimet_tensorflow.plotting_utils.visualize_weight_ranges_single_layer API.

import tensorflow as tf
tf.compat.v1.disable_eager_execution()

sandeep1404 commented 1 year ago

Hi @quic-hitameht, I had already disabled eager execution when I ran the code; the error above occurs with eager execution disabled. Could you please check that? Thanks in advance.
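For reference, the RuntimeError in the traceback is raised by asyncio itself, not by TensorFlow: run_forever() refuses to start an event loop that is already marked as running. The minimal sketch below reproduces the same message without AIMET, Bokeh, or TensorFlow. It illustrates the error only and is not necessarily how the condition arises inside start_bokeh_server.

```python
import asyncio


async def main():
    # Inside a coroutine, the event loop is already running
    loop = asyncio.get_running_loop()
    try:
        # Starting the same loop a second time is rejected by asyncio
        loop.run_forever()
    except RuntimeError as exc:
        return str(exc)
    return None


message = asyncio.run(main())
print(message)
```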

sandeep1404 commented 1 year ago

Hi @quic-hitameht, here is my code for visualization:

# TF specific imports
import tensorflow as tf
tf.compat.v1.disable_eager_execution()

from tensorflow.keras.applications.resnet50 import ResNet50

# Import for starting Bokeh Server
from aimet_common.utils import start_bokeh_server_session
from aimet_tensorflow import plotting_utils

tf.compat.v1.keras.backend.clear_session()
model_resnet = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
sess = tf.compat.v1.keras.backend.get_session()

with sess.as_default():
    # Get a layer to visualize its weight ranges
    conv_op = sess.graph.get_operation_by_name('conv1_conv/Conv2D')

    # Start a Bokeh server on port 8001
    visualization_url = start_bokeh_server_session(port=8001)
    print(visualization_url)
    results_dir='./aimet_vis'
    plotting_utils.visualize_weight_ranges_single_layer(sess=sess, layer=conv_op, results_dir=results_dir)
sess.close()

I get the following error and log output when I run the code:

2023-11-03 06:12:10.313710: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-11-03 06:12:10.313862: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-11-03 06:12:10.313922: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-11-03 06:12:10.314020: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-11-03 06:12:10.314116: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-11-03 06:12:10.314168: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1616] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1021 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3050 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6
INFO:bokeh.server.server:Starting Bokeh server version 1.2.0 (running on Tornado 6.3.3)
Process Process-4:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.8/dist-packages/aimet_common/utils.py", line 278, in start_bokeh_server
    server.run_until_shutdown()
  File "/usr/local/lib/python3.8/dist-packages/bokeh/server/server.py", line 184, in run_until_shutdown
    self._loop.start()
  File "/usr/local/lib/python3.8/dist-packages/tornado/platform/asyncio.py", line 195, in start
    self.asyncio_loop.run_forever()
  File "/usr/lib/python3.8/asyncio/base_events.py", line 560, in run_forever
    self._check_running()
  File "/usr/lib/python3.8/asyncio/base_events.py", line 552, in _check_running
    raise RuntimeError('This event loop is already running')
RuntimeError: This event loop is already running
INFO:bokeh.io.state:Session output file './aimet_vis/visualize_weight_ranges_single_layer.html' already exists, will be overwritten.
('http://localhost:8001/', <Process name='Process-4' pid=6131 parent=5892 started>)

When I open the link http://localhost:8001/, the browser says the site can't be reached. I am also getting "RuntimeError: This event loop is already running", as can be seen in the logs. Could you check what the issue is and suggest a solution? Thank you in advance.
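A simple way to confirm whether anything is actually listening on the port is to attempt a TCP connection from the same environment that started the server. The sketch below uses only the standard library; port_is_open is a hypothetical helper name, and the host and port are taken from the URL printed above. If it reports False right after start_bokeh_server_session() returns, the server process most likely exited, which would be consistent with the traceback in the logs.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts
        return False


# Host and port from the URL printed by the script in this thread
print(port_is_open("localhost", 8001))
```

When the code runs inside a Docker container, the port also has to be published to the host (for example with `docker run -p 8001:8001`) before a browser on the host can reach it, so a check run inside the container and one run on the host may give different results.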