cerndb / dist-keras

Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
http://joerihermans.com/work/distributed-keras/
GNU General Public License v3.0

'SequentialWorker' object has no attribute 'add_history' #69

Open dbl001 opened 6 years ago

dbl001 commented 6 years ago

My training seems to hang when using SingleTrainer, for example:

trainer = SingleTrainer(keras_model=model, worker_optimizer=optimizer,
                        loss=loss, features_col="features_normalized",
                        label_col="label_output", num_epoch=5, batch_size=32)
trained_model = trainer.train(training_set)
'SequentialWorker' object has no attribute 'add_history'
[Stage 16:>                                                         (0 + 1) / 1]

This happens on OS X 10.11.6:

David-Laxers-MacBook-Pro:~ davidlaxer$ conda list tensorflow
# packages in environment at /Users/davidlaxer/anaconda:
#
# Name                    Version                   Build  Channel
tensorflow                1.8.0                     <pip>
tensorflow-tensorboard    0.1.8                     <pip>
David-Laxers-MacBook-Pro:~ davidlaxer$ conda list keras
# packages in environment at /Users/davidlaxer/anaconda:
#
# Name                    Version                   Build  Channel
dist-keras                0.2.1                     <pip>
Keras                     2.1.6                     <pip>
Keras-Applications        1.0.1                     <pip>
Keras-Preprocessing       1.0.1                     <pip>
plaidml-keras             0.0.0.dev0                <pip>
David-Laxers-MacBook-Pro:~ davidlaxer$ 

It also happens on Ubuntu 14.04 LTS:

(spacy) ubuntu@ip-10-0-1-112:~/dist-keras$ conda list tensorflow
# packages in environment at /home/ubuntu/anaconda/envs/spacy:
#
# Name                    Version                   Build  Channel
tensorflow                1.3.0                         0  
tensorflow-base           1.3.0            py36h5293eaa_1  
tensorflow-gpu            1.3.0                         0  
tensorflow-gpu-base       1.3.0           py36cuda8.0cudnn6.0_1  
tensorflow-tensorboard    0.1.5                    py36_0  
(spacy) ubuntu@ip-10-0-1-112:~/dist-keras$ conda list keras
# packages in environment at /home/ubuntu/anaconda/envs/spacy:
#
# Name                    Version                   Build  Channel
keras                     2.1.5                    py36_0  
(spacy) ubuntu@ip-10-0-1-112:~/dist-keras$ 

SparkContext (from the Spark UI):

Version: v2.4.0-SNAPSHOT
Master: local[*]
AppName: myAppName

[Two screenshots attached, taken 2018-06-18 at 11:25 am]

Any ideas?

dbl001 commented 6 years ago

I added this method to the SequentialWorker class in 'distkeras/workers.py':

    # Requires `import time` at the top of distkeras/workers.py.
    def add_history(self, h):
        """Appends the specified history data."""
        d = {}
        d['history'] = h
        d['timestamp'] = time.time()
        self.training_history.append(d)

SequentialWorker seems to work now.
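
For anyone who wants to see what the added method records without setting up Spark, here is a minimal, self-contained sketch. WorkerSketch is a hypothetical stand-in class, not the real distkeras SequentialWorker, and it assumes the worker keeps a plain list called training_history. The values passed in are illustrative placeholders, not real training output.

```python
import time


class WorkerSketch:
    """Hypothetical stand-in for a worker; NOT the real distkeras class."""

    def __init__(self):
        # Assumption: the worker stores its history records in a plain list.
        self.training_history = []

    def add_history(self, h):
        """Appends the specified history data together with a timestamp."""
        d = {}
        d['history'] = h
        d['timestamp'] = time.time()
        self.training_history.append(d)


worker = WorkerSketch()
# Placeholder values standing in for whatever the training step returns.
worker.add_history([0.69, 0.85])
worker.add_history([0.52, 0.91])

# The check that fails in the original report: the attribute must exist.
has_method = hasattr(worker, 'add_history')
record_count = len(worker.training_history)
```

Each call appends one dict with a 'history' entry and a 'timestamp' entry, so two calls leave two records in training_history. Whether the real worker expects exactly this record shape elsewhere in the library is something I have not verified.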