tensorflow / autograph

Apache License 2.0

WARNING:tensorflow:Entity <function Function._initialize_uninitialized_variables.<locals>.initialize_variables at 0x000001CF6FFB6CA8> could not be transformed and will be executed as-is. #2

Open Victor-99 opened 5 years ago

Victor-99 commented 5 years ago

The attachments contain details of a warning that I encountered while working on a dataset. Please review it and, if it is a bug, fix it.

WARNING:tensorflow:Entity <function Function._initialize_uninitialized_variables.<locals>.initialize_variables at 0x000001CF6FFB6CA8> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: module 'gast' has no attribute 'Num'
WARNING: Entity <function Function._initialize_uninitialized_variables.<locals>.initialize_variables at 0x000001CF6FFB6CA8> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: module 'gast' has no attribute 'Num'

[Attached screenshots: Screenshot (21), Screenshot (22)]

Corentin-pro commented 5 years ago

Same here. I hope this is the reason TF2 is not working; otherwise I have no idea why the script fails on my RTX 2060 Super while it runs fine on other machines.

The script :

import math
import pickle
import os

import numpy as np
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import tensorflow as tf

def load_data(path: str):
    data = []
    labels = []
    for batch_index in range(1, 6):
        with open(os.path.join(path, 'data_batch_{}'.format(batch_index)), 'rb') as data_file:
            data_dict = pickle.load(data_file, encoding='bytes')
            data.append(np.reshape(data_dict[b'data'], (len(data_dict[b'data']), 3, 32, 32)))
            labels.append(data_dict[b'labels'])
    return np.concatenate(data), np.concatenate(labels)

def main():
    data_path = 'data/cifar-10-batches-py'
    data, labels = load_data(data_path)
    data = np.transpose(data, (0, 2, 3, 1))

    max_epoch = 2
    batch_size = 32

    image_input = tf.keras.Input(shape=(32, 32, 3))

    data_op = tf.cast(image_input, tf.float32) / 255.0
    network_output = tf.keras.layers.Conv2D(8, 5, strides=(2, 2), activation=tf.nn.relu)(data_op)
    network_output = tf.keras.layers.Conv2D(16, 3, strides=(2, 2), activation=tf.nn.relu)(network_output)
    network_output = tf.keras.layers.Flatten()(network_output)
    network_output = tf.keras.layers.Dense(120, activation=tf.nn.relu)(network_output)
    network_output = tf.keras.layers.Dense(60, activation=tf.nn.relu)(network_output)
    network_output = tf.keras.layers.Dense(10)(network_output)

    model = tf.keras.Model(inputs=image_input, outputs=network_output)

    model.compile(
        optimizer=tf.optimizers.Adam(learning_rate=0.001),
        loss=tf.losses.SparseCategoricalCrossentropy())

    history = model.fit(data, labels, batch_size=batch_size, epochs=max_epoch)

if __name__ == '__main__':
    main()

The output :

Train on 50000 samples
Epoch 1/2
WARNING:tensorflow:Entity <function Function._initialize_uninitialized_variables.<locals>.initialize_variables at 0x7f28df337c10> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Bad argument number for keyword: 1, expecting 2
   32/50000 [..............................] - ETA: 33:42Traceback (most recent call last):
  File "tf2_cifar.py", line 49, in <module>
    main()
  File "tf2_cifar.py", line 45, in main
    history = model.fit(data, labels, batch_size=batch_size, epochs=max_epoch)
  File "/usr/lib/python3.8/site-packages/tensorflow_core/python/keras/engine/training.py", line 709, in fit
    return func.fit(
  File "/usr/lib/python3.8/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 313, in fit
    training_result = run_one_epoch(
  File "/usr/lib/python3.8/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 123, in run_one_epoch
    batch_outs = execution_function(iterator)
  File "/usr/lib/python3.8/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 86, in execution_function
    distributed_function(input_fn))
  File "/usr/lib/python3.8/site-packages/tensorflow_core/python/eager/def_function.py", line 457, in __call__
    result = self._call(*args, **kwds)
  File "/usr/lib/python3.8/site-packages/tensorflow_core/python/eager/def_function.py", line 520, in _call
    return self._stateless_fn(*args, **kwds)
  File "/usr/lib/python3.8/site-packages/tensorflow_core/python/eager/function.py", line 1823, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/usr/lib/python3.8/site-packages/tensorflow_core/python/eager/function.py", line 1137, in _filtered_call
    return self._call_flat(
  File "/usr/lib/python3.8/site-packages/tensorflow_core/python/eager/function.py", line 1223, in _call_flat
    flat_outputs = forward_function.call(
  File "/usr/lib/python3.8/site-packages/tensorflow_core/python/eager/function.py", line 506, in call
    outputs = execute.execute(
  File "/usr/lib/python3.8/site-packages/tensorflow_core/python/eager/execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
         [[node model/conv2d/Conv2D (defined at /usr/lib/python3.8/site-packages/tensorflow_core/python/framework/ops.py:1751) ]] [Op:__inference_distributed_function_950]

Function call stack:
distributed_function

I reinstalled the driver, CUDA, and cuDNN, but nothing changed (TensorFlow 2.0.0).

DragosUr commented 5 years ago

Same here. Did anyone find an answer?

mdanatg commented 4 years ago

Sorry for the delayed response! These may be two bugs.

The first error that @Victor-99 reported is the same as #1 and can be fixed by running pip install --user gast==0.2.2. Newer versions of TF should fix it.
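To tell whether the installed gast is the likely culprit, a quick check along these lines may help. It is only a sketch: the helper names `gast_supports_legacy_nodes` and `describe_gast` are made up for illustration, and the version lookup assumes the package may or may not expose `__version__`.

```python
import importlib


def gast_supports_legacy_nodes(gast_module):
    """Return True if the module still exposes the legacy `Num` node class."""
    return hasattr(gast_module, "Num")


def describe_gast():
    """Report the installed gast version, if any, and whether `gast.Num` exists."""
    try:
        gast = importlib.import_module("gast")
    except ImportError:
        return "gast is not installed"
    # Assumption: not every release necessarily exposes __version__.
    version = getattr(gast, "__version__", "unknown")
    status = "has" if gast_supports_legacy_nodes(gast) else "lacks"
    return "gast {} {} gast.Num".format(version, status)


if __name__ == "__main__":
    print(describe_gast())
```

If the report says that `gast.Num` is missing, that matches the "module 'gast' has no attribute 'Num'" cause in the warning, and pinning gast as described above is the thing to try.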

I'm not sure if the error from @Corentin-pro is the same. I'm investigating a similar issue in https://github.com/tensorflow/tensorflow/issues/34433. If downgrading gast doesn't fix it, could you re-run the snippet with this added to the top: tf.autograph.set_verbosity(3, True)? It should give us additional clues to resolve it.
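For reference, one way to turn on the extra AutoGraph logging before running a repro script is sketched below. The wrapper name `enable_autograph_debug` is hypothetical; it only combines the `AUTOGRAPH_VERBOSITY` environment variable mentioned in the warning with the `tf.autograph.set_verbosity` call, and it assumes TensorFlow may not be importable in every environment.

```python
import os


def enable_autograph_debug(level=3):
    """Raise AutoGraph verbosity; return True if TensorFlow was configured."""
    # The warning text itself suggests this environment variable.
    os.environ["AUTOGRAPH_VERBOSITY"] = str(level)
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed here; only the environment variable is set.
        return False
    # The second argument also echoes the AutoGraph logs to stdout.
    tf.autograph.set_verbosity(level, True)
    return True


if __name__ == "__main__":
    enable_autograph_debug(3)
    # ... build and fit the model after this point ...
```

Calling it at the very top of the script, before any model code runs, should make the full conversion log appear in the output that gets attached to the report.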

ajaykumarmizzou commented 4 years ago

[Attached screenshots: output bug, code bug]

Is this a bug in TensorFlow 2.0?

mdanatg commented 4 years ago

@Capriciousman in your case, there seem to be two issues -

  1. The warning you see indicates a bug, but since you use only built-in Keras components, it should be safe to ignore. I did a quick investigation, but it doesn't seem to reproduce at head. What version of TF were you using?

  2. Your training seems to generate NaNs. That is more likely due to an incorrect configuration in your model. For instance, the output layer has only one unit, but you need 10, because fashion_mnist is a multi-class dataset. In fact, when I try to run your code it gives me an error about that (Received a label value of 9 which is outside the valid range of [0, 1)). Setting units to 10 trains the model correctly. I suspect the error message that I'm seeing was only added in a more recent version, which would explain why you didn't get one.
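The range check behind that error message can be illustrated without TensorFlow. The sketch below is not the Keras implementation; `out_of_range_labels` is a hypothetical helper and the sample labels are made up. It only shows why integer labels 0 to 9 cannot be matched against an output layer with a single unit.

```python
def out_of_range_labels(labels, num_units):
    """Return the sorted distinct labels that fall outside [0, num_units)."""
    return sorted({int(label) for label in labels
                   if not 0 <= int(label) < num_units})


# Hypothetical sample of integer class labels in the range 0..9.
sample_labels = [9, 0, 0, 3, 0, 2, 7, 2, 5, 5]

# With a single output unit, only label 0 is valid.
print(out_of_range_labels(sample_labels, 1))   # [2, 3, 5, 7, 9]

# With 10 output units, every label is valid.
print(out_of_range_labels(sample_labels, 10))  # []
```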

ajaykumarmizzou commented 4 years ago

@mdanatg I am using the latest version, TF 2.0. Ignoring the warning would not lead to a solution. Yes, it is generating NaNs; there is an issue in the optimizer configuration. I have corrected the number of classes. I tried the same code in Google Colab and it works fine there, so this seems to be an issue in my local TF/Keras/optimizer configuration.

#######code########

import tensorflow as tf
print(tf.__version__)

mnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels), (test_images, test_labels) = mnist.load_data()
import matplotlib.pyplot as plt
plt.imshow(training_images[0])
print(training_labels[0])
print(training_images[0])
training_images = training_images / 255.0
test_images = test_images / 255.0
model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(128, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])
model.compile(optimizer = tf.compat.v1.keras.optimizers.Adam,  #Error is here
              loss = 'sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(training_images, training_labels, epochs=5)
model.evaluate(test_images, test_labels)

akhfzl commented 11 months ago

Hi all, is there a solution for this bug? It still isn't fixed for me.