ShifuML / shifu

An end-to-end machine learning and data mining framework on Hadoop
https://github.com/ShifuML/shifu/wiki
Apache License 2.0

Batch Normalization Failure in Eval #682

Open zhangpengshan opened 4 years ago

zhangpengshan commented 4 years ago

When batch normalization is enabled in the Keras model, eval fails with the following exception:

at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)

2019-11-06 00:07:16,519 ERROR [main] ml.shifu.shifu.tensorflow.TensorflowModel: Error in model inference: java.lang.IllegalArgumentException: You must feed a value for placeholder tensor 'bn/keras_learning_phase' with dtype bool
    [[Node: bn/keras_learning_phase = Placeholder[_output_shapes=[], dtype=DT_BOOL, shape=, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
    at org.tensorflow.Session.run(Native Method)
    at org.tensorflow.Session.access$100(Session.java:48)
    at org.tensorflow.Session$Runner.runHelper(Session.java:298)
    at org.tensorflow.Session$Runner.run(Session.java:248)
    at ml.shifu.shifu.tensorflow.TensorflowModel.compute(TensorflowModel.java:86)
    at ml.shifu.shifu.core.GenericModel.compute(GenericModel.java:47)
    at ml.shifu.shifu.core.Scorer$7.call(Scorer.java:397)
    at ml.shifu.shifu.core.Scorer$7.call(Scorer.java:394)
    at ml.shifu.shifu.core.Scorer.scoreNsData(Scorer.java:404)
    at ml.shifu.shifu.core.Scorer.scoreNsData(Scorer.java:223)
    at ml.shifu.shifu.core.ModelRunner.computeNsData(ModelRunner.java:190)
    at ml.shifu.shifu.udf.EvalScoreUDF.exec(EvalScoreUDF.java:305)
    at ml.shifu.shifu.udf.EvalScoreUDF.exec(EvalScoreUDF.java:56)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:323)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextTuple(POUserFunc.java:362)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:361)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:383)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:303)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNextTuple(POFilter.java:91)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInforma

zhangpengshan commented 4 years ago

To reproduce, change the layer definition at:

https://github.com/ShifuML/shifu/blob/develop/src/main/python/distributed_tf_keras.py#L65

to

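    # Dense layer without a built-in activation, so normalization can run first.
    layer = keras.layers.Dense(
        num_hidden_nodes[i],
        name='hidden_layer_' + str(i + 1),
        kernel_regularizer=kernel_regularizer,
        kernel_initializer=kernel_initializer)(previous_layer)
    # The layer name 'bn' matches the 'bn/' prefix of the placeholder in the error.
    layer = keras.layers.BatchNormalization(name='bn')(layer)
    layer = keras.layers.Activation(acti)(layer)
    layer = keras.layers.Dropout(0.2)(layer)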

This enables batch normalization.
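
For context on why the exported graph contains a `keras_learning_phase` placeholder at all: batch normalization behaves differently in training and in inference, so the graph needs a boolean input that selects the mode. The sketch below is a toy pure-Python illustration of that switch, not the Keras implementation. The class name `BatchNorm1D` is hypothetical, and the learned scale and shift are left out (treated as 1 and 0). The `momentum` and `epsilon` defaults mirror the Keras layer defaults.

```python
import math


class BatchNorm1D:
    """Toy batch normalization over a list of scalars (illustration only)."""

    def __init__(self, momentum=0.99, epsilon=1e-3):
        self.momentum = momentum
        self.epsilon = epsilon
        # Running statistics, used only when training is False.
        self.moving_mean = 0.0
        self.moving_var = 1.0

    def __call__(self, batch, training):
        if training:
            # Training mode: normalize with the statistics of the current batch.
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            # Update the running statistics as an exponential moving average.
            m = self.momentum
            self.moving_mean = m * self.moving_mean + (1 - m) * mean
            self.moving_var = m * self.moving_var + (1 - m) * var
        else:
            # Inference mode: use the stored running statistics, update nothing.
            mean, var = self.moving_mean, self.moving_var
        scale = math.sqrt(var + self.epsilon)
        return [(x - mean) / scale for x in batch]


if __name__ == "__main__":
    bn = BatchNorm1D()
    data = [1.0, 3.0]
    print("training :", bn(data, training=True))
    print("inference:", bn(data, training=False))
```

The same input gives different outputs depending on the flag, which is why the runtime refuses to run the graph when the flag has no value. In eval the flag should be false, either fed at run time or fixed before the model is exported (for example with `keras.backend.set_learning_phase(0)` in TensorFlow 1.x Keras). Neither option has been verified against Shifu here.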