rstudio / keras3

R Interface to Keras
https://keras3.posit.co/

Inconsistent results between two versions of keras (maybe tensorflow) #774

Closed mgendarme closed 5 years ago

mgendarme commented 5 years ago

Dear all, at the end of last year I was playing with a U-Net (see the example in my last post). My model produced loss curves like this: Expl_R351

Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.1 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=de_DE.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=de_DE.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=de_DE.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] bindrcpp_0.2.2   doMC_1.3.5       iterators_1.0.10 foreach_1.4.4    reticulate_1.10  EBImage_4.24.0   forcats_0.3.0    stringr_1.3.1    dplyr_0.7.7     
[10] purrr_0.2.5      readr_1.1.1      tidyr_0.8.2      tibble_1.4.2     ggplot2_3.1.0    tidyverse_1.2.1  keras_2.2.0.9001

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0          locfit_1.5-9.1      lubridate_1.7.4     lattice_0.20-38     fftwtools_0.9-8     png_0.1-7           assertthat_0.2.0   
 [8] zeallot_0.1.0       digest_0.6.18       R6_2.3.0            tiff_0.1-5          cellranger_1.1.0    plyr_1.8.4          backports_1.1.2    
[15] httr_1.3.1          pillar_1.3.0        tfruns_1.4          rlang_0.3.0.1       lazyeval_0.2.1      readxl_1.1.0        rstudioapi_0.8     
[22] whisker_0.3-2       Matrix_1.2-15       htmlwidgets_1.3     RCurl_1.95-4.11     munsell_0.5.0       broom_0.5.0         compiler_3.5.1     
[29] modelr_0.1.2        pkgconfig_2.0.2     BiocGenerics_0.28.0 base64enc_0.1-3     tensorflow_1.10     htmltools_0.3.6     tidyselect_0.2.5   
[36] codetools_0.2-15    crayon_1.3.4        withr_2.1.2         bitops_1.0-6        grid_3.5.1          nlme_3.1-137        jsonlite_1.6       
[43] gtable_0.2.0        magrittr_1.5        scales_1.0.0        cli_1.0.1           stringi_1.2.4       xml2_1.2.0          generics_0.0.1     
[50] tools_3.5.1         glue_1.3.0          hms_0.4.2           jpeg_0.1-8          abind_1.4-5         yaml_2.2.0          colorspace_1.3-2   

I have updated R and all the libraries, my session now looks like this:

Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=de_DE.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] doMC_1.3.5         iterators_1.0.10   foreach_1.4.4      magick_2.0        
 [5] reticulate_1.12    EBImage_4.26.0     forcats_0.4.0      stringr_1.4.0     
 [9] dplyr_0.8.0.1      purrr_0.3.2        readr_1.3.1        tidyr_0.8.3       
[13] tibble_2.1.1       ggplot2_3.1.1      tidyverse_1.2.1    keras_2.2.4.1.9001

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.1          locfit_1.5-9.1      lubridate_1.7.4    
 [4] lattice_0.20-38     fftwtools_0.9-8     png_0.1-7          
 [7] assertthat_0.2.1    zeallot_0.1.0       digest_0.6.18      
[10] R6_2.4.0            tiff_0.1-5          cellranger_1.1.0   
[13] plyr_1.8.4          backports_1.1.4     httr_1.4.0         
[16] pillar_1.3.1        tfruns_1.4          rlang_0.3.4        
[19] lazyeval_0.2.2      readxl_1.3.1        rstudioapi_0.10    
[22] whisker_0.3-2       Matrix_1.2-17       htmlwidgets_1.3    
[25] RCurl_1.95-4.12     munsell_0.5.0       broom_0.5.2        
[28] compiler_3.6.0      modelr_0.1.4        pkgconfig_2.0.2    
[31] BiocGenerics_0.30.0 base64enc_0.1-3     tensorflow_1.13.1  
[34] htmltools_0.3.6     tidyselect_0.2.5    codetools_0.2-16   
[37] crayon_1.3.4        withr_2.1.2         bitops_1.0-6       
[40] grid_3.6.0          nlme_3.1-139        jsonlite_1.6       
[43] gtable_0.3.0        magrittr_1.5        scales_1.0.0       
[46] cli_1.1.0           stringi_1.4.3       xml2_1.2.0         
[49] generics_0.0.2      tools_3.6.0         glue_1.3.1         
[52] hms_0.4.2           jpeg_0.1-8          abind_1.4-5        
[55] yaml_2.2.0          colorspace_1.4-1    rvest_0.3.3        
[58] haven_2.1.0

The loss over iterations now looks like this: Expl_R360

The code did not change; I only upgraded R (and the libraries) and now run it on a local GPU instead of a CPU. (I also tried the same model with the CPU installation and it wasn't any better.)

Any idea where this is coming from? I suspect a change in the k_categorical_crossentropy function, but I could be totally wrong.

In case it would help here is my loss function:

## Loss function

## DICE COEFFICIENT
dice_coef <- function(y_true, y_pred, smooth = 1.0) {
  y_true_f <- k_flatten(y_true)
  y_pred_f <- k_flatten(y_pred)
  intersection <- k_sum(y_true_f * y_pred_f)
  #(2 * intersection + smooth) / (k_sum(y_true_f) + k_sum(y_pred_f) + smooth) 
  k_mean((2 * intersection + smooth) / (k_sum(y_true_f) + k_sum(y_pred_f) + smooth)) # for use in combination with bce
}
attr(dice_coef, "py_function_name") <- "dice_coef"

dice_coef_loss_for_bce <- function(y_true, y_pred){
  1 - dice_coef(y_true, y_pred)
}
attr(dice_coef_loss_for_bce, "py_function_name") <- "dice_coef_loss_for_bce"

dice_coef_loss_bce_3Classes <- function(y_true, y_pred, l_b_c = L_B_C, w_class_1 = W_CLASS_1, w_class_2 = W_CLASS_2, w_class_3 = W_CLASS_3){
  k_categorical_crossentropy(y_true, y_pred) * l_b_c + 
    dice_coef_loss_for_bce(y_true[,,,1], y_pred[,,,1]) * w_class_1 +
    dice_coef_loss_for_bce(y_true[,,,2], y_pred[,,,2]) * w_class_2 +
    dice_coef_loss_for_bce(y_true[,,,3], y_pred[,,,3]) * w_class_3
}
attr(dice_coef_loss_bce_3Classes, "py_function_name") <- "dice_coef_loss_bce_3Classes"
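
To make the arithmetic of the Dice term above easier to check by hand, here is a minimal plain-Python sketch of the same formula. It is not the R/Keras code from the post: the helper `flatten` and the toy masks are hypothetical, and plain lists stand in for tensors.

```python
def flatten(nested):
    """Flatten arbitrarily nested lists of numbers into one flat list."""
    if isinstance(nested, (int, float)):
        return [float(nested)]
    flat = []
    for item in nested:
        flat.extend(flatten(item))
    return flat


def dice_coef(y_true, y_pred, smooth=1.0):
    """Soft Dice coefficient: (2 * |A.B| + s) / (|A| + |B| + s)."""
    t = flatten(y_true)
    p = flatten(y_pred)
    intersection = sum(a * b for a, b in zip(t, p))
    return (2.0 * intersection + smooth) / (sum(t) + sum(p) + smooth)


def dice_loss(y_true, y_pred):
    """Loss form used in the post: 1 - Dice coefficient."""
    return 1.0 - dice_coef(y_true, y_pred)


# Toy single-channel masks (hypothetical data, not from the thread).
mask = [[1.0, 0.0], [1.0, 0.0]]
perfect = [[1.0, 0.0], [1.0, 0.0]]
empty = [[0.0, 0.0], [0.0, 0.0]]

print(dice_loss(mask, perfect))  # perfect overlap gives the smallest loss
print(dice_loss(mask, empty))    # no overlap gives a larger loss
print(dice_coef(empty, empty))   # smooth keeps the ratio defined here
```

Note that the R code indexes channels from 1 (`y_true[,,,1]`), so a Python port would address the same channel as index 0.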

Many thanks in advance for the help of the community!

skeydan commented 5 years ago

Hi, that's a big change of versions from TF 1.10 to TF 1.13... One major thing that changed is that we're using tf.keras by default now instead of keras.

I'd start by seeing what happens when you add

use_implementation("keras")

right after library(keras).

mgendarme commented 5 years ago

Hi Sigrid,

Trying what you said I run into this problem:

> LossFactory <- function(CLASS){
+   if(CLASS == 1){
+     model <- model %>%
+       compile(loss = dice_coef_loss_bce, #dice_coef_loss_bce,  #dice_coef_loss_bce_multiClasses, #k_categorical_crossentropy,
+               optimizer = "adam", # used instead of the classical stochastic gradient descent procedure
+               metrics = custom_metric("dice_coef_loss_for_bce", dice_coef_loss_for_bce)#, # 'accuracy'
+       )
+   } else if(CLASS == 2) {
+     model <- model %>%
+       compile(loss = dice_coef_loss_bce_2Classes, #dice_coef_loss_bce,  #dice_coef_loss_bce_multiClasses, #k_categorical_crossentropy,
+               optimizer = "adam", # used instead of the classical stochastic gradient descent procedure
+               metrics = custom_metric("dice_coef_loss_for_bce", dice_coef_loss_for_bce)#, # 'accuracy'
+       )
+   } else if(CLASS == 3){
+     model <- model %>%
+       compile(loss = dice_coef_loss_bce_3Classes, #dice_coef_loss_bce,  #dice_coef_loss_bce_multiClasses, #k_categorical_crossentropy,
+               optimizer = "adam", # used instead of the classical stochastic gradient descent procedure
+               metrics = custom_metric("dice_coef_loss_for_bce", dice_coef_loss_for_bce)#, # 'accuracy'
+       )
+   } 
+ }
> LossFactory(CLASS = CLASS)
 Error in UseMethod("compile") : 
  no applicable method for 'compile' applied to an object of class "c('tensorflow.python.keras.engine.training.Model', 'tensorflow.python.keras.engine.network.Network', 'tensorflow.python.keras.engine.base_layer.Layer', 'tensorflow.python.training.checkpointable.base.CheckpointableBase', 'python.builtin.object')" 

Is it possible that k_categorical_crossentropy from tf.keras and from keras return such different results? If so, is there a way to call k_categorical_crossentropy from keras rather than tf.keras, like we would with e.g. keras::functionofchoice?

mgendarme commented 5 years ago

This might help: Source code of cross entropy from Keras:

def categorical_crossentropy(output, target, from_logits=False):
    """Categorical crossentropy between an output tensor and a target tensor.
    # Arguments
        output: A tensor resulting from a softmax
            (unless `from_logits` is True, in which
            case `output` is expected to be the logits).
        target: A tensor of the same shape as `output`.
        from_logits: Boolean, whether `output` is the
            result of a softmax, or is a tensor of logits.
    # Returns
        Output tensor.
    """
    # Note: tf.nn.softmax_cross_entropy_with_logits
    # expects logits, Keras expects probabilities.
    if not from_logits:
        # scale preds so that the class probas of each sample sum to 1
        output /= tf.reduce_sum(output,
                                reduction_indices=len(output.get_shape()) - 1,
                                keep_dims=True)
        # manual computation of crossentropy
        epsilon = _to_tensor(_EPSILON, output.dtype.base_dtype)
        output = tf.clip_by_value(output, epsilon, 1. - epsilon)
        return - tf.reduce_sum(target * tf.log(output),
                               reduction_indices=len(output.get_shape()) - 1)
    else:
        return tf.nn.softmax_cross_entropy_with_logits(labels=target,
                                                       logits=output)

From tf.keras:

class CategoricalCrossentropy(Loss):
  """Computes categorical cross entropy loss between the `y_true` and `y_pred`.
  Usage:
 python
  cce = tf.keras.losses.CategoricalCrossentropy()
  loss = cce(
    [[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]],
    [[.9, .05, .05], [.5, .89, .6], [.05, .01, .94]])
  print('Loss: ', loss.numpy())  # Loss: 0.3239

  Usage with tf.keras API:
 python
  model = keras.models.Model(inputs, outputs)
  model.compile('sgd', loss=tf.keras.losses.CategoricalCrossentropy())

  Args:
    from_logits: Whether `output` is expected to be a logits tensor. By default,
      we consider that `output` encodes a probability distribution.
    label_smoothing: If greater than `0` then smooth the labels. This option is
      currently not supported when `y_pred` is a sparse input (not one-hot).
    reduction: Type of `tf.losses.Reduction` to apply to loss. Default value is
      `SUM_OVER_BATCH_SIZE`.
    name: Optional name for the op.
  """

  def __init__(self,
               from_logits=False,
               label_smoothing=0,
               reduction=losses_impl.ReductionV2.SUM_OVER_BATCH_SIZE,
               name=None):
    super(CategoricalCrossentropy, self).__init__(
        reduction=reduction, name=name)
    self.from_logits = from_logits
    self.label_smoothing = label_smoothing

  def call(self, y_true, y_pred):
    """Invokes the `CategoricalCrossentropy` instance.
    Args:
      y_true: Ground truth values.
      y_pred: The predicted values.
    Returns:
      Categorical cross entropy losses.
    """
    y_pred = ops.convert_to_tensor(y_pred)
    y_true = ops.convert_to_tensor(y_true)
    is_sparse = y_pred.shape != y_true.shape

    if is_sparse:
      return sparse_categorical_crossentropy(
          y_true, y_pred, from_logits=self.from_logits)
    else:
      y_true = math_ops.cast(y_true, y_pred.dtype)
      if self.label_smoothing > 0:
        num_classes = math_ops.cast(array_ops.shape(y_true)[1], y_pred.dtype)
        smooth_positives = 1.0 - self.label_smoothing
        smooth_negatives = self.label_smoothing / num_classes
        y_true = y_true * smooth_positives + smooth_negatives

      return categorical_crossentropy(
          y_true, y_pred, from_logits=self.from_logits)

dfalbel commented 5 years ago

This second issue is fixed in the github version of Keras. Can you install keras with devtools::install_github("rstudio/keras") and try again?

mgendarme commented 5 years ago

Hi Daniel, That's exactly how I installed keras.

I get the error only when running first:

library(keras)
use_implementation("keras")

mgendarme commented 5 years ago

From the source code of tf.keras and keras, what I understand is that keras returns scaled predictions (`scale preds so that the class probas of each sample sum to 1`) whereas tf.keras doesn't do that. Is there an easy way to achieve this?
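
For reference, the scaling step from the Keras snippet quoted earlier in this thread can be reproduced outside TensorFlow. The following is a minimal plain-Python sketch for a single sample, not the library code. The function name, the argument order (target first, as in the R call `k_categorical_crossentropy(y_true, y_pred)`) and the `EPSILON` value of 1e-7 are assumptions made for illustration.

```python
import math

EPSILON = 1e-7  # assumed clipping constant for this sketch


def categorical_crossentropy(target, output, eps=EPSILON):
    """Cross-entropy for one sample, rescaling `output` to sum to 1 first."""
    total = sum(output)
    scaled = [o / total for o in output]                      # rescale step
    clipped = [min(max(s, eps), 1.0 - eps) for s in scaled]   # avoid log(0)
    return -sum(t * math.log(c) for t, c in zip(target, clipped))


target = [0.0, 1.0, 0.0]        # one-hot label (hypothetical)
softmax_like = [0.2, 0.7, 0.1]  # already sums to 1
unnormalised = [0.4, 1.4, 0.2]  # same proportions, sums to 2

loss_a = categorical_crossentropy(target, softmax_like)
loss_b = categorical_crossentropy(target, unnormalised)
print(loss_a, loss_b)  # both are close to -log(0.7)
```

If the last layer is already a softmax, the division by the sum changes almost nothing, so this step alone would only matter for outputs that are not normalised.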

@dfalbel I tried again to reinstall keras as you mentioned (though that's how I did it the first time) and nothing changed.

mgendarme commented 5 years ago

Here is another example of the same problem that wasn't answered.

Any clue how to solve this?

Thanks in advance

skeydan commented 5 years ago

Hm, you use only k_categorical_crossentropy directly in your code right? (As opposed to that crossvalidated issue which contrasts the function k_categorical_crossentropy with using the string 'categorical_crossentropy' in a compile statement)?

I'm asking because I see recent changes in the corresponding loss class (https://github.com/tensorflow/tensorflow/blob/6612da89516247503f03ef76e974b51a434fb52e/tensorflow/python/keras/losses.py#L328)

  def __init__(self,
               from_logits=False,
               label_smoothing=0,
               reduction=losses_impl.ReductionV2.SUM_OVER_BATCH_SIZE,
               name=None):

but the implementations of the functions look more or less identical to me between tf.keras and keras, as of today:

https://github.com/keras-team/keras/blob/9d33a024e3893ec2a4a15601261f44725c6715d1/keras/backend/tensorflow_backend.py#L3527

https://github.com/tensorflow/tensorflow/blob/6612da89516247503f03ef76e974b51a434fb52e/tensorflow/python/keras/backend.py#L3837

In any case, this should just be a matter of scaling, and not alter the behavior too much. Looking at your loss curves above, it pretty much looks like there were changes in the optimizer behavior and now, your learning rate is too high. I'd try a lower learning rate, and / or learning rate decay (I'd experiment with different values and see what happens).
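
To give a feel for what "a lower learning rate and/or learning rate decay" means numerically, here is a small plain-Python sketch of time-based decay, where the step size shrinks as 1 / (1 + decay * iteration). This is assumed to mirror the `decay` argument of the older Keras optimizers; the starting value and decay factor below are hypothetical and would need to be tuned for the actual model.

```python
def decayed_lr(initial_lr, decay, iteration):
    """Time-based decay: the learning rate shrinks as 1 / (1 + decay * iteration)."""
    return initial_lr / (1.0 + decay * iteration)


initial_lr = 1e-4  # hypothetical: ten times lower than Adam's usual default of 1e-3
decay = 1e-3       # hypothetical decay factor

# Learning rate at the start, after 1000 and after 9000 weight updates.
schedule = [(it, decayed_lr(initial_lr, decay, it)) for it in (0, 1000, 9000)]
for iteration, lr in schedule:
    print(iteration, lr)
```

With these values the learning rate is halved after 1000 updates and reduced to roughly a tenth after 9000, which is the kind of behaviour to compare across a few candidate settings.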

mgendarme commented 5 years ago

> Hm, you use only k_categorical_crossentropy directly in your code right?

Yes, it looks like this:

model <- model %>%
  compile(
    loss = dice_coef_loss_bce_3Classes,
    optimizer = "adam",
    metrics = custom_metric("dice_coef_loss_for_bce", dice_coef_loss_for_bce)
  )

with


dice_coef_loss_bce_3Classes <- function(y_true, y_pred,
                                        l_b_c = L_B_C,
                                        w_class_1 = W_CLASS_1,
                                        w_class_2 = W_CLASS_2,
                                        w_class_3 = W_CLASS_3){
  k_categorical_crossentropy(y_true, y_pred) * l_b_c +
    dice_coef_loss_for_bce(y_true[,,,1], y_pred[,,,1]) * w_class_1 +
    dice_coef_loss_for_bce(y_true[,,,2], y_pred[,,,2]) * w_class_2 +
    dice_coef_loss_for_bce(y_true[,,,3], y_pred[,,,3]) * w_class_3
}
attr(dice_coef_loss_bce_3Classes, "py_function_name") <- "dice_coef_loss_bce_3Classes"

The weights used for testing were:

l_b_c = .7
w_class_1 = .1
w_class_2 = .1
w_class_3 = .1

> In any case, this should just be a matter of scaling, and not alter the behavior too much. Looking at your loss curves above, it pretty much looks like there were changes in the optimizer behavior and now, your learning rate is too high. I'd try a lower learning rate, and / or learning rate decay (I'd experiment with different values and see what happens).

I will try to play with the learning rate and the learning rate decay. Thanks for the advice. The only thing I do not understand is that I did not change the parameters of the optimizer (in fact I only used `optimizer = "adam"` as opposed to `optimizer_adam()`).

Is it possible that the default settings of the optimizer have changed with the upgrade from keras to the tf.keras backend? The two graphs I showed come from the same model, run with the "old" and the "new" versions of keras/tensorflow. What I mean is that my model was already optimized to some extent and gave good results, which were completely lost after the update. I did not have to tweak the learning rate before. Any thoughts on that?

Thanks for your help.

skeydan commented 5 years ago

> Is it possible that the default settings of the optimizer have changed with the upgrade from keras to the tf.keras backend?

Are you using the current release (1.13) or the nightly build? If 1.13 I would have thought the behavior should not have changed ; however in the nightly, it seems like the v2 versions of the optimizers are now being used by default: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/optimizer_v2/adam.py

By the way, if you are not using the nightly, it might be interesting to test the behavior with it....: install_tensorflow(version="nightly")

mgendarme commented 5 years ago

> Are you using the current release (1.13) or the nightly build? If 1.13 I would have thought the behavior should not have changed ; however in the nightly, it seems like the v2 versions of the optimizers are now being used by default: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/optimizer_v2/adam.py

This is not very reassuring, as I am not using the nightly version, which leaves me without an explanation for why the model performs so differently before and after the update.

Maybe something I should mention is that I installed keras like this:

conda create -y --name r-tensorflow tensorflow-gpu python=3.6.8

It was the only way to have keras run on the GPU for me (solution posted by @dfalbel at some point)

skeydan commented 5 years ago

You could test the nightly using

conda create -y --name r-tensorflow tf-nightly-gpu python=3.6.8

Otherwise I can only suggest trying learning rate decay and/or a lower learning rate overall...

mgendarme commented 5 years ago

I tried to upgrade tensorflow to the nightly version within R and got this error:

install_tensorflow(version="nightly-gpu")
## Package Plan ##
 environment location: /home/gendarme/anaconda3/envs/r-tensorflow
The following packages will be REMOVED:
  ca-certificates-2019.1.23-0
  certifi-2019.3.9-py36_0
  libedit-3.1.20181209-hc058e9b_0
  libffi-3.2.1-hd88cf55_4
  libgcc-ng-8.2.0-hdf63c60_1
  libstdcxx-ng-8.2.0-hdf63c60_1
  ncurses-6.1-he6710b0_1
  openssl-1.1.1b-h7b6447c_1
  pip-19.1.1-py36_0
  python-3.6.8-h0371630_0
  readline-7.0-h7b6447c_5
  setuptools-41.0.1-py36_0
  sqlite-3.28.0-h7b6447c_0
  tk-8.6.8-hbc83047_0
  wheel-0.33.4-py36_0
  xz-5.2.4-h14c3975_4
  zlib-1.2.11-h7b6447c_3

Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done

Remove all packages in environment /home/gendarme/anaconda3/envs/r-tensorflow:

Creating r-tensorflow conda environment for TensorFlow installation...
Collecting package metadata: ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: /home/gendarme/anaconda3/envs/r-tensorflow

  added / updated specs:
    - python=3.6

The following NEW packages will be INSTALLED:

  ca-certificates    pkgs/main/linux-64::ca-certificates-2019.1.23-0
  certifi            pkgs/main/linux-64::certifi-2019.3.9-py36_0
  libedit            pkgs/main/linux-64::libedit-3.1.20181209-hc058e9b_0
  libffi             pkgs/main/linux-64::libffi-3.2.1-hd88cf55_4
  libgcc-ng          pkgs/main/linux-64::libgcc-ng-8.2.0-hdf63c60_1
  libstdcxx-ng       pkgs/main/linux-64::libstdcxx-ng-8.2.0-hdf63c60_1
  ncurses            pkgs/main/linux-64::ncurses-6.1-he6710b0_1
  openssl            pkgs/main/linux-64::openssl-1.1.1b-h7b6447c_1
  pip                pkgs/main/linux-64::pip-19.1.1-py36_0
  python             pkgs/main/linux-64::python-3.6.8-h0371630_0
  readline           pkgs/main/linux-64::readline-7.0-h7b6447c_5
  setuptools         pkgs/main/linux-64::setuptools-41.0.1-py36_0
  sqlite             pkgs/main/linux-64::sqlite-3.28.0-h7b6447c_0
  tk                 pkgs/main/linux-64::tk-8.6.8-hbc83047_0
  wheel              pkgs/main/linux-64::wheel-0.33.4-py36_0
  xz                 pkgs/main/linux-64::xz-5.2.4-h14c3975_4
  zlib               pkgs/main/linux-64::zlib-1.2.11-h7b6447c_3

Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
#
# To activate this environment, use:
# > conda activate r-tensorflow
#
# To deactivate an active environment, use:
# > conda deactivate
#

Installing TensorFlow...
Collecting tensorflow-gpu==nightly from https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-nightly-cp36-cp36m-linux_x86_64.whl
  ERROR: HTTP error 404 while getting https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-nightly-cp36-cp36m-linux_x86_64.whl
  ERROR: Could not install requirement tensorflow-gpu==nightly from https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-nightly-cp36-cp36m-linux_x86_64.whl because of error 404 Client Error: Not Found for url: https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-nightly-cp36-cp36m-linux_x86_64.whl
ERROR: Could not install requirement tensorflow-gpu==nightly from https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-nightly-cp36-cp36m-linux_x86_64.whl because of HTTP error 404 Client Error: Not Found for url: https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-nightly-cp36-cp36m-linux_x86_64.whl for URL https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-nightly-cp36-cp36m-linux_x86_64.whl
Error: Error 1 occurred installing packages into conda environment r-tensorflow

Also, when trying to install it with conda I get this:

conda create -y --name r-tensorflow tf-nightly-gpu python=3.6.8
Collecting package metadata: done
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

  - tf-nightly-gpu

Current channels:

  - https://repo.anaconda.com/pkgs/main/linux-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/free/linux-64
  - https://repo.anaconda.com/pkgs/free/noarch
  - https://repo.anaconda.com/pkgs/r/linux-64
  - https://repo.anaconda.com/pkgs/r/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.

In the meantime I am experimenting with the learning rate and the learning rate decay. It seems my model is consistently "over"-predicting my second class for some reason (though that was not an issue before). So no solution on the horizon yet.

skeydan commented 5 years ago

Is there anything we can do to help here? Otherwise I'd go on and close the issue.

mgendarme commented 5 years ago

Thanks for your help and sorry for the delay in my response.

So far it seems that by tweaking the parameters of the optimizer I am getting results closer to before, but not identical. The model also seems to be extremely sensitive, and I have not found a good way to identify the best parameters so far. From my testing, only a very limited range of values gives a proper result.

On a side note, I still have an over-prediction of my second class in most cases that I cannot really get rid of (though classes 1 and 3 are predicted very accurately). If you have any thoughts on this I would be happy to read them.
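
One way to put a number on the over-prediction described above is to compare, per class, the share of pixels in the ground truth with the share in the predicted label map. The plain-Python sketch below assumes the 3-channel network output has already been reduced to integer labels 1 to 3 (for example by taking the most probable class per pixel); the function name and the toy label maps are hypothetical.

```python
from collections import Counter


def class_fractions(label_map, num_classes):
    """Fraction of pixels assigned to each class (labels 1..num_classes)."""
    flat = [label for row in label_map for label in row]
    counts = Counter(flat)
    return [counts[c] / len(flat) for c in range(1, num_classes + 1)]


# Toy 2x3 label maps (hypothetical data, not from the thread).
true_labels = [[1, 1, 2], [3, 3, 1]]
pred_labels = [[1, 2, 2], [3, 2, 1]]

true_frac = class_fractions(true_labels, 3)
pred_frac = class_fractions(pred_labels, 3)
for c, (t, p) in enumerate(zip(true_frac, pred_frac), start=1):
    print(f"class {c}: true {t:.2f}, predicted {p:.2f}")
```

A predicted share that is consistently well above the true share for class 2 would confirm the bias and give a single figure to track while changing the optimizer settings or the class weights.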