NLeSC / mcfly

A deep learning tool for time series classification and regression
Apache License 2.0

Update default metric name #214

Closed: dafnevk closed this issue 4 years ago

dafnevk commented 4 years ago

When the metric is not specified, calling find_architectures raises:

ValueError: Invalid metric: "accuracy" is not among the metrics the models was compiled with (['categorical_accuracy'])

dafnevk commented 4 years ago

When we upgrade to tensorflow 2.0, we will no longer need to do the conversion with _get_metric_name, see: https://www.tensorflow.org/guide/migrate
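To illustrate the kind of name conversion discussed here, below is a minimal sketch of resolving a requested metric name against the names a model reports. It is not mcfly's actual `_get_metric_name`; the helper `resolve_metric_name` and the alias tuple are hypothetical and exist only for this example.

```python
# Hypothetical helper for illustration; NOT mcfly's _get_metric_name.
# Assumption: these three spellings should be treated as the same metric.
ACCURACY_ALIASES = ("accuracy", "acc", "categorical_accuracy")


def resolve_metric_name(requested, available):
    """Return the entry of `available` that corresponds to `requested`."""
    # An exact match always wins.
    if requested in available:
        return requested
    # Otherwise accept any accuracy alias that the model actually reports.
    if requested in ACCURACY_ALIASES:
        for name in available:
            if name in ACCURACY_ALIASES:
                return name
    raise ValueError(
        'Invalid metric: "{}" is not among the available metrics ({})'.format(
            requested, list(available)))
```

With this sketch, `resolve_metric_name("accuracy", ["categorical_accuracy"])` resolves to the reported name instead of raising, while an empty list of available names still raises a `ValueError`.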

florian-huber commented 4 years ago

@dafnevk, this seems to be solved now, right?

saeed349 commented 4 years ago

Hi, I am getting this error with tensorflow==2.2.0rc2.

florian-huber commented 4 years ago

In response to our blog post, I again received a report from someone who got this error: "ValueError: Invalid metric: “accuracy” is not among the metrics the models was compiled with ([])".

clives commented 4 years ago

Hi, I get the same error, "ValueError: Invalid metric: “accuracy” is not among the metrics the models was compiled with ([])", with tensorflow2_p36 on AWS (using the AWS Deep Learning AMI (Ubuntu 18.04)). With tensorflow_p36, also on AWS, it works fine. By the way, nice library, really helpful!

davala commented 4 years ago

I am evaluating McFly for a new project, nice work on this tool set! I wanted to note that I saw this error last week on my first attempts to run the tutorial on a clean install of CentOS 8 with Anaconda 3, using both TF 2.1.0 and 2.0.0. After several failures I set up a clean Ubuntu 20.04 VM and retraced my steps. It works perfectly with no errors on the Ubuntu box. I would be glad to provide any details if they would help. Thank you!

jmrichardson commented 4 years ago

Hi, I am getting the same error. Here is my code, where I specify "accuracy" as the metric:

        import os
        import mcfly.modelgen
        from mcfly.find_architecture import train_models_on_samples

        # Generate candidate models, compiled with the requested metric.
        self.num_models = 2
        self.num_classes = self.yenc_train.shape[1]
        self.metric = 'accuracy'
        self.models = mcfly.modelgen.generate_models(
            self.Xseg_train.shape,
            number_of_classes=self.num_classes,
            number_of_models=self.num_models,
            metrics=[self.metric])

        # Train each model on a subset and store the comparison results.
        resultpath = os.path.join('.', 'models')
        os.makedirs(resultpath, exist_ok=True)
        outputfile = os.path.join(resultpath, 'modelcomparison.json')
        histories, val_accuracies, val_losses = train_models_on_samples(
            self.Xseg_train, self.yenc_train,
            self.Xseg_val, self.yenc_val,
            self.models, nr_epochs=20,
            subset_size=300,
            early_stopping_patience=5,
            verbose=True,
            outputfile=outputfile,
            metric=self.metric)
        print('Details of the training process were stored in', outputfile)

Keras==2.3.1 tensorflow==2.2.0 mcfly==3.0.0 Windows 10

@davala, what did you do to fix it?

Thanks for any help.

svenvanderburg commented 4 years ago

I also had the issue described by @jmrichardson, @florian-huber, and @clives: calling find_architectures results in "ValueError: Invalid metric: “accuracy” is not among the metrics the models was compiled with ([])". This differs from what @dafnevk originally reported (note the empty list), but people seem to be reporting it under this issue.

It seems to be related to tensorflow==2.2.0: I tested older versions (2.1.0 and 2.0.0) and they work. So @jmrichardson, you can fix it by installing an older version of tensorflow. @dafnevk, maybe we should open a separate issue and temporarily solve it by restricting tensorflow<=2.1.0?
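As a rough illustration of the temporary restriction suggested above, the following sketch checks a TensorFlow version string against an upper bound before training. The function name `tensorflow_version_is_supported` is hypothetical, and the check only compares major and minor numbers, so it accepts any 2.1.x release, which is slightly looser than a literal `tensorflow<=2.1.0` pin.

```python
def tensorflow_version_is_supported(version, max_major_minor=(2, 1)):
    """Return True if `version` (e.g. "2.1.0") is at most `max_major_minor`.

    Only the first two components are compared, so suffixes such as
    "rc2" in "2.2.0rc2" are ignored.
    """
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    return major_minor <= max_major_minor


# Typical use, assuming TensorFlow is installed:
#   import tensorflow as tf
#   if not tensorflow_version_is_supported(tf.__version__):
#       raise RuntimeError("Please install tensorflow<=2.1.0")
```

The versions reported in this thread as working (2.0.0 and 2.1.0) pass this check, while 2.2.0 and 2.2.0rc2 do not.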

cwmeijer commented 4 years ago

Try to get rid of the support for similar metric names (acc and accuracy) and make the code simpler. Only support TF2.

Also check the tutorial and see if the model there needs retraining with new TF versions.

dafnevk commented 4 years ago

The problem is indeed a change made in tensorflow 2.2, see: https://github.com/tensorflow/tensorflow/issues/37714
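The details of the linked issue are not reproduced here. Assuming the change is that the compiled metric names are not populated until a model has been trained or evaluated (which would match the empty list in the errors above), one possible workaround is to look the metric up in the training history instead. The sketch below works on a plain dict shaped like a Keras `History.history` object, where validation values are stored under a `val_` prefix; the helper `validation_scores` and the example values are hypothetical.

```python
def validation_scores(history, metric):
    """Return the per-epoch validation values of `metric` from a history dict."""
    key = "val_" + metric
    if key not in history:
        available = sorted(k for k in history if k.startswith("val_"))
        raise ValueError(
            'Metric "{}" was not recorded; available: {}'.format(metric, available))
    return history[key]


# Hand-written example data, standing in for the dict Keras returns after fit().
example_history = {
    "loss": [0.9, 0.6],
    "accuracy": [0.5, 0.7],
    "val_loss": [1.0, 0.8],
    "val_accuracy": [0.4, 0.6],
}
best_val_accuracy = max(validation_scores(example_history, "accuracy"))
```

Because the lookup happens after training, it does not depend on what the model reports right after compilation.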

florian-huber commented 4 years ago

This should now be mostly solved by PR #240 (it still gives errors on Python 3.5, though).