Closed mgendarme closed 5 years ago
Hi, that's a big change of versions from TF 1.10 to TF 1.13... One major thing that changed is that we're using `tf.keras` by default now instead of `keras`.
I'd start by seeing what happens when you add `use_implementation("keras")` right after `library(keras)`.
Hi Sigrid,
Trying what you said I run into this problem:
> LossFactory <- function(CLASS){
+ if(CLASS == 1){
+ model <- model %>%
+ compile(loss = dice_coef_loss_bce, #dice_coef_loss_bce, #dice_coef_loss_bce_multiClasses, #k_categorical_crossentropy,
+ optimizer = "adam", # used instead of the classical stochastic gradient descent procedure
+ metrics = custom_metric("dice_coef_loss_for_bce", dice_coef_loss_for_bce)#, # 'accuracy'
+ )
+ } else if(CLASS == 2) {
+ model <- model %>%
+ compile(loss = dice_coef_loss_bce_2Classes, #dice_coef_loss_bce, #dice_coef_loss_bce_multiClasses, #k_categorical_crossentropy,
+ optimizer = "adam", # used instead of the classical stochastic gradient descent procedure
+ metrics = custom_metric("dice_coef_loss_for_bce", dice_coef_loss_for_bce)#, # 'accuracy'
+ )
+ } else if(CLASS == 3){
+ model <- model %>%
+ compile(loss = dice_coef_loss_bce_3Classes, #dice_coef_loss_bce, #dice_coef_loss_bce_multiClasses, #k_categorical_crossentropy,
+ optimizer = "adam", # used instead of the classical stochastic gradient descent procedure
+ metrics = custom_metric("dice_coef_loss_for_bce", dice_coef_loss_for_bce)#, # 'accuracy'
+ )
+ }
+ }
> LossFactory(CLASS = CLASS)
Error in UseMethod("compile") :
no applicable method for 'compile' applied to an object of class "c('tensorflow.python.keras.engine.training.Model', 'tensorflow.python.keras.engine.network.Network', 'tensorflow.python.keras.engine.base_layer.Layer', 'tensorflow.python.training.checkpointable.base.CheckpointableBase', 'python.builtin.object')"
Is it possible that the k_categorical_crossentropy from tf.keras and keras are retrieving such different results? If yes is there a way to just call k_categorical_crossentropy from keras and not tf.keras like we would with e.g keras::functionofchoice?
This might help: Source code of cross entropy from Keras:
def categorical_crossentropy(output, target, from_logits=False):
    """Categorical crossentropy between an output tensor and a target tensor.
    # Arguments
        output: A tensor resulting from a softmax
            (unless `from_logits` is True, in which
            case `output` is expected to be the logits).
        target: A tensor of the same shape as `output`.
        from_logits: Boolean, whether `output` is the
            result of a softmax, or is a tensor of logits.
    # Returns
        Output tensor.
    """
    # Note: tf.nn.softmax_cross_entropy_with_logits
    # expects logits, Keras expects probabilities.
    if not from_logits:
        # scale preds so that the class probas of each sample sum to 1
        output /= tf.reduce_sum(output,
                                reduction_indices=len(output.get_shape()) - 1,
                                keep_dims=True)
        # manual computation of crossentropy
        epsilon = _to_tensor(_EPSILON, output.dtype.base_dtype)
        output = tf.clip_by_value(output, epsilon, 1. - epsilon)
        return - tf.reduce_sum(target * tf.log(output),
                               reduction_indices=len(output.get_shape()) - 1)
    else:
        return tf.nn.softmax_cross_entropy_with_logits(labels=target,
                                                       logits=output)
From tf.keras:
class CategoricalCrossentropy(Loss):
  """Computes categorical cross entropy loss between the `y_true` and `y_pred`.
  Usage:
  python
  cce = tf.keras.losses.CategoricalCrossentropy()
  loss = cce(
    [[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]],
    [[.9, .05, .05], [.5, .89, .6], [.05, .01, .94]])
  print('Loss: ', loss.numpy())  # Loss: 0.3239
  Usage with tf.keras API:
  python
  model = keras.models.Model(inputs, outputs)
  model.compile('sgd', loss=tf.keras.losses.CategoricalCrossentropy())
  Args:
    from_logits: Whether `output` is expected to be a logits tensor. By default,
      we consider that `output` encodes a probability distribution.
    label_smoothing: If greater than `0` then smooth the labels. This option is
      currently not supported when `y_pred` is a sparse input (not one-hot).
    reduction: Type of `tf.losses.Reduction` to apply to loss. Default value is
      `SUM_OVER_BATCH_SIZE`.
    name: Optional name for the op.
  """
  def __init__(self,
               from_logits=False,
               label_smoothing=0,
               reduction=losses_impl.ReductionV2.SUM_OVER_BATCH_SIZE,
               name=None):
    super(CategoricalCrossentropy, self).__init__(
        reduction=reduction, name=name)
    self.from_logits = from_logits
    self.label_smoothing = label_smoothing
  def call(self, y_true, y_pred):
    """Invokes the `CategoricalCrossentropy` instance.
    Args:
      y_true: Ground truth values.
      y_pred: The predicted values.
    Returns:
      Categorical cross entropy losses.
    """
    y_pred = ops.convert_to_tensor(y_pred)
    y_true = ops.convert_to_tensor(y_true)
    is_sparse = y_pred.shape != y_true.shape
    if is_sparse:
      return sparse_categorical_crossentropy(
          y_true, y_pred, from_logits=self.from_logits)
    else:
      y_true = math_ops.cast(y_true, y_pred.dtype)
      if self.label_smoothing > 0:
        num_classes = math_ops.cast(array_ops.shape(y_true)[1], y_pred.dtype)
        smooth_positives = 1.0 - self.label_smoothing
        smooth_negatives = self.label_smoothing / num_classes
        y_true = y_true * smooth_positives + smooth_negatives
      return categorical_crossentropy(
          y_true, y_pred, from_logits=self.from_logits)
This second issue is fixed in the GitHub version of Keras. Can you install keras with `devtools::install_github("rstudio/keras")` and try again?
Hi Daniel, That's exactly how I installed keras.
I get the error only when running first:
library(keras)
use_implementation("keras")
From the source code from tf.keras and keras what I understand is that keras does return scaled predictions:
scale preds so that the class probas of each sample sum to 1
whereas tf.keras doesn't do that.
Is there an easy way to achieve this ?
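One way to probe the rescaling question is to recompute the docstring example quoted above by hand. The following is a minimal pure-Python sketch, not the library code: the function name, the `normalize` switch, and the `EPSILON` value are assumptions made for illustration. It computes the per-sample crossentropy with and without the "sum to 1" rescaling, then averages over the batch as `SUM_OVER_BATCH_SIZE` is described to do.

```python
import math

EPSILON = 1e-7  # assumed clipping constant, chosen for illustration


def categorical_crossentropy(target, output, normalize=True):
    """Per-sample crossentropy for rows of class probabilities."""
    losses = []
    for t_row, o_row in zip(target, output):
        if normalize:
            # scale preds so that the class probas of each sample sum to 1
            total = sum(o_row)
            o_row = [o / total for o in o_row]
        # clip to avoid log(0)
        o_row = [min(max(o, EPSILON), 1.0 - EPSILON) for o in o_row]
        losses.append(-sum(t * math.log(o) for t, o in zip(t_row, o_row)))
    return losses


# Inputs copied from the tf.keras docstring example.
y_true = [[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]]
y_pred = [[.9, .05, .05], [.5, .89, .6], [.05, .01, .94]]

scaled = categorical_crossentropy(y_true, y_pred, normalize=True)
unscaled = categorical_crossentropy(y_true, y_pred, normalize=False)

# Batch mean: sum of per-sample losses divided by the batch size.
mean_scaled = sum(scaled) / len(scaled)
mean_unscaled = sum(unscaled) / len(unscaled)
print(round(mean_scaled, 4), round(mean_unscaled, 4))
```

In this sketch only the rescaled variant lands near the 0.3239 reported in the docstring, which suggests the tf.keras path rescales as well. Rows that already sum to 1 (as a softmax output would) give the same loss either way.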
@dfalbel I tried reinstalling keras as you mentioned (though that's how I did it originally) and nothing changed.
Here is another example of the same problem that wasn't answered.
Any clue how to solve this?
Thanks in advance
Hm, you use only `k_categorical_crossentropy` directly in your code, right? (As opposed to that Cross Validated issue, which contrasts the function `k_categorical_crossentropy` with using the string `'categorical_crossentropy'` in a `compile` statement)?
I'm asking because I see recent changes in the corresponding loss class (https://github.com/tensorflow/tensorflow/blob/6612da89516247503f03ef76e974b51a434fb52e/tensorflow/python/keras/losses.py#L328)
def __init__(self,
             from_logits=False,
             label_smoothing=0,
             reduction=losses_impl.ReductionV2.SUM_OVER_BATCH_SIZE,
             name=None):
but the implementations of the functions look more or less identical to me between `tf.keras` and `keras`, as of today:
In any case, this should just be a matter of scaling, and not alter the behavior too much. Looking at your loss curves above, it pretty much looks like there were changes in the optimizer behavior and now, your learning rate is too high. I'd try a lower learning rate, and / or learning rate decay (I'd experiment with different values and see what happens).
> Hm, you use only k_categorical_crossentropy directly in your code right?

Yes, it looks like this:
model <- model %>% compile(
  loss = dice_coef_loss_bce_3Classes,
  optimizer = "adam",
  metrics = custom_metric("dice_coef_loss_for_bce", dice_coef_loss_for_bce)
)
with
dice_coef_loss_bce_3Classes <- function(y_true, y_pred, l_b_c = L_B_C, w_class_1 = W_CLASS_1, w_class_2 = W_CLASS_2, w_class_3 = W_CLASS_3){
  k_categorical_crossentropy(y_true, y_pred) * l_b_c +
    dice_coef_loss_for_bce(y_true[,,,1], y_pred[,,,1]) * w_class_1 +
    dice_coef_loss_for_bce(y_true[,,,2], y_pred[,,,2]) * w_class_2 +
    dice_coef_loss_for_bce(y_true[,,,3], y_pred[,,,3]) * w_class_3
}
attr(dice_coef_loss_bce_3Classes, "py_function_name") <- "dice_coef_loss_bce_3Classes"
The weights used for testing were:
l_b_c = .7
w_class_1 = .1
w_class_2 = .1
w_class_3 = .1
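To make the structure of this weighted loss easier to follow, here is a rough pure-Python sketch of the same idea on plain lists. It rests on several assumptions: the thread does not show `dice_coef_loss_for_bce`, so `dice_loss` below is a hypothetical smoothed "1 - Dice" stand-in, the `smooth` constant is invented, the crossentropy term is averaged over pixels for simplicity, and class indices are 0-based rather than R's 1-based.

```python
import math


def dice_loss(y_true, y_pred, smooth=1.0):
    """Hypothetical stand-in: 1 - smoothed Dice coefficient on flat lists."""
    intersection = sum(t * p for t, p in zip(y_true, y_pred))
    dice = (2.0 * intersection + smooth) / (sum(y_true) + sum(y_pred) + smooth)
    return 1.0 - dice


def crossentropy(y_true, y_pred, eps=1e-7):
    """Mean categorical crossentropy over pixels (rows of class probabilities)."""
    per_pixel = [
        -sum(t * math.log(min(max(p, eps), 1.0 - eps)) for t, p in zip(t_row, p_row))
        for t_row, p_row in zip(y_true, y_pred)
    ]
    return sum(per_pixel) / len(per_pixel)


def combined_loss(y_true, y_pred, l_b_c=0.7, class_weights=(0.1, 0.1, 0.1)):
    """Weighted sum of the crossentropy term and one Dice term per class."""
    loss = l_b_c * crossentropy(y_true, y_pred)
    for k, w in enumerate(class_weights):
        true_k = [row[k] for row in y_true]
        pred_k = [row[k] for row in y_pred]
        loss += w * dice_loss(true_k, pred_k)
    return loss


# Toy one-hot "pixels", invented for illustration only.
y_true = [[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]]
y_good = [[.9, .05, .05], [.05, .9, .05], [.05, .05, .9]]
y_bad = [[.2, .7, .1], [.1, .8, .1], [.1, .7, .2]]

perfect = combined_loss(y_true, y_true)
good = combined_loss(y_true, y_good)
bad = combined_loss(y_true, y_bad)
print(perfect, good, bad)
```

With the weights from the thread (0.7 plus three times 0.1), the crossentropy term dominates, so a change in how that term is scaled would shift the balance between it and the Dice terms.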
>In any case, this should just be a matter of scaling, and not alter the behavior too much. Looking at your loss curves above, it pretty much looks like there were changes in the optimizer behavior and now, your learning rate is too high. I'd try a lower learning rate, and / or learning rate decay (I'd experiment with different values and see what happens).
I will try to play with the learning rate and the learning rate decay. Thanks for the advice. The only thing that I do not understand is that I did not change the parameters of the optimizer (in fact I used solely `optimizer = "adam"` as opposed to `optimizer_adam()`).
Is it possible that the default settings of the optimizer have changed with the upgrade from keras to the tf.keras backend? The two graphs I showed come from the same model, run with the "old" and "new" versions of keras/tensorflow. What I mean is that my model was already optimized to some extent and provided good results, which were completely lost after the update. I did not have to tweak the learning rate before. Any thoughts on that?
Thanks for your help.
> Is it possible that the default settings of the optimizer have changed with the upgrade from keras to the tf.keras backend?
Are you using the current release (1.13) or the nightly build? If 1.13, I would have thought the behavior should not have changed; however, in the nightly, it seems like the v2 versions of the optimizers are now being used by default: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/optimizer_v2/adam.py
By the way, if you are not using the nightly, it might be interesting to test the behavior with it....:
install_tensorflow(version="nightly")
> Are you using the current release (1.13) or the nightly build? If 1.13 I would have thought the behavior should not have changed ; however in the nightly, it seems like the v2 versions of the optimizers are now being used by default: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/optimizer_v2/adam.py
This is not very reassuring, as I am not using the nightly version, which leaves me without an explanation for why the model performs so differently before and after the update.
Maybe something I should mention is that I installed keras like this:
conda create -y --name r-tensorflow tensorflow-gpu python=3.6.8
It was the only way to have keras run on the GPU for me (solution posted by @dfalbel at some point)
You could test the nightly using
conda create -y --name r-tensorflow tf-nightly-gpu python=3.6.8
Otherwise I can only suggest trying learning rate decay and/or a lower learning rate overall...
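For reference, learning rate decay in its simplest time-based form can be sketched as below. This is only an illustration of the schedule, assuming the formula lr_t = lr_0 / (1 + decay * t), which is how the classic Keras optimizers' `decay` argument behaves as far as I know. The function name and the numeric values are hypothetical.

```python
def decayed_learning_rate(initial_lr, decay, iteration):
    """Time-based decay: lr_t = lr_0 / (1 + decay * t)."""
    return initial_lr / (1.0 + decay * iteration)


# Hypothetical values chosen only for illustration.
initial_lr = 1e-3
decay = 1e-4
schedule = [decayed_learning_rate(initial_lr, decay, t) for t in (0, 1000, 10000)]
print(schedule)
```

With these values the learning rate halves after 10000 iterations. A larger `decay` shrinks the step size sooner, which may help if the loss oscillates late in training.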
I tried to upgrade tensorflow to the nightly version within R and got this error:
install_tensorflow(version="nightly-gpu")
## Package Plan ##
environment location: /home/gendarme/anaconda3/envs/r-tensorflow
The following packages will be REMOVED:
ca-certificates-2019.1.23-0
certifi-2019.3.9-py36_0
libedit-3.1.20181209-hc058e9b_0
libffi-3.2.1-hd88cf55_4
libgcc-ng-8.2.0-hdf63c60_1
libstdcxx-ng-8.2.0-hdf63c60_1
ncurses-6.1-he6710b0_1
openssl-1.1.1b-h7b6447c_1
pip-19.1.1-py36_0
python-3.6.8-h0371630_0
readline-7.0-h7b6447c_5
setuptools-41.0.1-py36_0
sqlite-3.28.0-h7b6447c_0
tk-8.6.8-hbc83047_0
wheel-0.33.4-py36_0
xz-5.2.4-h14c3975_4
zlib-1.2.11-h7b6447c_3
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
Remove all packages in environment /home/gendarme/anaconda3/envs/r-tensorflow:
Creating r-tensorflow conda environment for TensorFlow installation...
Collecting package metadata: ...working... done
Solving environment: ...working... done
## Package Plan ##
environment location: /home/gendarme/anaconda3/envs/r-tensorflow
added / updated specs:
- python=3.6
The following NEW packages will be INSTALLED:
ca-certificates pkgs/main/linux-64::ca-certificates-2019.1.23-0
certifi pkgs/main/linux-64::certifi-2019.3.9-py36_0
libedit pkgs/main/linux-64::libedit-3.1.20181209-hc058e9b_0
libffi pkgs/main/linux-64::libffi-3.2.1-hd88cf55_4
libgcc-ng pkgs/main/linux-64::libgcc-ng-8.2.0-hdf63c60_1
libstdcxx-ng pkgs/main/linux-64::libstdcxx-ng-8.2.0-hdf63c60_1
ncurses pkgs/main/linux-64::ncurses-6.1-he6710b0_1
openssl pkgs/main/linux-64::openssl-1.1.1b-h7b6447c_1
pip pkgs/main/linux-64::pip-19.1.1-py36_0
python pkgs/main/linux-64::python-3.6.8-h0371630_0
readline pkgs/main/linux-64::readline-7.0-h7b6447c_5
setuptools pkgs/main/linux-64::setuptools-41.0.1-py36_0
sqlite pkgs/main/linux-64::sqlite-3.28.0-h7b6447c_0
tk pkgs/main/linux-64::tk-8.6.8-hbc83047_0
wheel pkgs/main/linux-64::wheel-0.33.4-py36_0
xz pkgs/main/linux-64::xz-5.2.4-h14c3975_4
zlib pkgs/main/linux-64::zlib-1.2.11-h7b6447c_3
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
#
# To activate this environment, use:
# > conda activate r-tensorflow
#
# To deactivate an active environment, use:
# > conda deactivate
#
Installing TensorFlow...
Collecting tensorflow-gpu==nightly from https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-nightly-cp36-cp36m-linux_x86_64.whl
ERROR: HTTP error 404 while getting https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-nightly-cp36-cp36m-linux_x86_64.whl
ERROR: Could not install requirement tensorflow-gpu==nightly from https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-nightly-cp36-cp36m-linux_x86_64.whl because of error 404 Client Error: Not Found for url: https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-nightly-cp36-cp36m-linux_x86_64.whl
ERROR: Could not install requirement tensorflow-gpu==nightly from https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-nightly-cp36-cp36m-linux_x86_64.whl because of HTTP error 404 Client Error: Not Found for url: https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-nightly-cp36-cp36m-linux_x86_64.whl for URL https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-nightly-cp36-cp36m-linux_x86_64.whl
Error: Error 1 occurred installing packages into conda environment r-tensorflow
Also, when trying to install it with conda, I get this:
conda create -y --name r-tensorflow tf-nightly-gpu python=3.6.8
Collecting package metadata: done
Solving environment: failed
PackagesNotFoundError: The following packages are not available from current channels:
- tf-nightly-gpu
Current channels:
- https://repo.anaconda.com/pkgs/main/linux-64
- https://repo.anaconda.com/pkgs/main/noarch
- https://repo.anaconda.com/pkgs/free/linux-64
- https://repo.anaconda.com/pkgs/free/noarch
- https://repo.anaconda.com/pkgs/r/linux-64
- https://repo.anaconda.com/pkgs/r/noarch
To search for alternate channels that may provide the conda package you're
looking for, navigate to
https://anaconda.org
and use the search bar at the top of the page.
In the meantime I am experimenting with the learning rate and the learning rate decay. It seems my model is constantly "over"-predicting my second class for some reason (though that was not an issue before). So no solution on the horizon yet.
Is there anything we can do to help here? Otherwise I'd go on and close the issue.
Thanks for your help and sorry for the delay in my response.
So far it seems that by tweaking the parameters of the optimizer I am getting results closer to before, but not identical. It also seems to be extremely sensitive, and I have not yet found a good way to identify the best parameters. From my testing, only a very limited range of values gives a proper result.
On a side note, I am still seeing an over-prediction of my second class in most cases that I cannot really get rid of (though classes 1 and 3 are predicted very accurately). In case you have any thoughts on this I would be happy to read them.
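One simple way to quantify such over-prediction is to compare how often each class is predicted with how often it occurs in the ground truth. The sketch below is pure Python on toy data invented for illustration (not taken from the thread). The helper names are hypothetical and class indices are 0-based.

```python
from collections import Counter


def class_fractions(labels, num_classes):
    """Fraction of entries assigned to each class index (0-based)."""
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(k, 0) / total for k in range(num_classes)]


def argmax(row):
    """Index of the largest value in a row of class probabilities."""
    return max(range(len(row)), key=lambda k: row[k])


# Toy data, invented for illustration only.
true_labels = [0, 0, 1, 2, 2, 2]
probabilities = [
    [0.8, 0.1, 0.1],
    [0.3, 0.6, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.5, 0.4],
    [0.1, 0.2, 0.7],
    [0.0, 0.1, 0.9],
]
predicted_labels = [argmax(row) for row in probabilities]

true_frac = class_fractions(true_labels, 3)
pred_frac = class_fractions(predicted_labels, 3)

# A ratio above 1 means the class is predicted more often than it occurs.
ratio = [p / t for p, t in zip(pred_frac, true_frac)]
print(ratio)
```

Tracking this ratio per class across runs could show whether a change in learning rate or loss weights actually reduces the bias toward the second class. The division assumes every class occurs at least once in the ground truth.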
Dear all, At the end of last year I was playing with a U-Net (the example from my last post). My model produced loss curves like this:
I have updated R and all the libraries, my session now looks like this:
The behaviour of the loss over iterations looks like this:
The code did not change; the only differences are that I upgraded R and now run it on a local GPU instead of a CPU. (I also tried the same model with the CPU installation and it wasn't better.)
Any idea where this is coming from? I am suspecting a change in the k_categorical_crossentropy function but I could be totally wrong.
In case it would help here is my loss function:
Many thanks in advance for the help of the community!