Closed leoninekev closed 4 years ago
Can you please share the setup.py file you used to create the package?
Yes, here's my setup.py code:
from setuptools import setup, find_packages
NAME = 'test_code'
VERSION = '0.1'
REQUIRED_PACKAGES = ['keras', 'h5py']
setup(
    name=NAME,
    version=VERSION,
    packages=find_packages(),
    install_requires=REQUIRED_PACKAGES,
    scripts=['predictor.py', 'roi_helpers.py', 'resnet.py', 'FixedBatchNormalization.py', 'RoiPoolingConv.py'])
1) You should get stderr logging. Make sure to set onlinePredictionLogging = True and onlinePredictionConsoleLogging = True when you create the model.
2) On the Python setup, my suspicion is that you need to rebuild the package under Python 3.5 so that it picks up the right dependencies for version 3.5.
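A minimal sketch of how a predictor module could write its own diagnostics to stderr, so that they show up once console logging is enabled. It uses only Python's standard logging module. The helper names make_stderr_logger and run_logged are hypothetical and are not part of any AI Platform API.

```python
import logging
import sys


def make_stderr_logger(name="predictor", stream=None):
    """Return a logger that writes to stderr (or to `stream`, for testing)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep records out of the root logger
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def run_logged(logger, func, *args, **kwargs):
    """Call func and log the full traceback before re-raising any failure."""
    try:
        return func(*args, **kwargs)
    except Exception:
        logger.exception("Prediction step failed")
        raise
```

Wrapping the body of a predict method with something like run_logged would, under these assumptions, put the real traceback into the stderr log instead of only a generic failure message.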
I re-created the model resource with the --enable-console-logging flag this time, and now I'm getting stderr logs like the ones during training job submission.
But version creation with --python-version 3.5 is again interrupted, after successful package collection and installation, with this log:
Failed to load model: Unexpected error when loading the model: Shape must be rank 1 but is rank 0 for 'bn_conv1/Reshape_4' (op: 'Reshape') with input shapes: [1,1,1,64], []. (Error code: 0)
Whereas omitting --python-version 3.5 creates a version like before, but the last few log lines output:
I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
I recall Cloud Shell printing the same message when I ran my package locally. What does that mean?
@andrewferlitsch can you take a look at this Keras issue.
The latter part, the version creation failure, is resolved by changing the TF runtime version from 1.13 to 1.5.
But the former error, the prediction failure, persists despite following the custom prediction routine documentation. Any help on that?
So, to dodge this for a while, I've tried running my application with Flask as an alternative. It runs well, and now I'm planning to host it on a production server as a Flask API, maybe on Google App Engine? Would App Engine be recommended over Google AI Platform (the one I was using previously) in this case? What are the downsides?
The error message (shape must be rank 1 but is rank 0) means that the layer expected a 1D vector, but got a scalar value. I googled the exact error message and found several references in Japanese and one in English. The layer 'bn_conv1/Reshape_4' matches the Keras Faster R-CNN model. The one English answer to this problem I could find was dated April 19, 2019:
I had this same error. I seem to have gotten the program to start learning by editing the keras source. In file keras/backend/tensorflow_backend.py I found 4 reshape functions near each other, one of which was involved in the error. I changed the second argument of each of these from (-1) to [(-1)]. This allowed the program to run. Unfortunately, this is a dangerous change since I don’t actually know everything that will be affected.
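To see why that edit changes anything: in Python, (-1) is just the integer -1, because parentheses without a comma do not make a tuple, while [(-1)] is a one-element list. A shape argument built from the first is a scalar (rank 0), and one built from the second is a vector (rank 1). A small sketch in plain Python, where rank is a hypothetical helper that counts nesting levels and is not a Keras or TensorFlow function:

```python
def rank(value):
    """Count list/tuple nesting levels: 0 for a scalar, 1 for a flat sequence."""
    depth = 0
    while isinstance(value, (list, tuple)):
        depth += 1
        if not value:  # empty sequence: nothing further to descend into
            break
        value = value[0]
    return depth


print(rank((-1)))    # 0: parentheses alone leave the int -1
print(rank([(-1)]))  # 1: a one-element list
print(rank((-1,)))   # 1: a trailing comma makes a tuple
```

This only illustrates the scalar-versus-vector distinction named in the error message; it does not reproduce the TensorFlow Reshape op itself.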
This is another answer, translated from Japanese, from a posting dated Dec 10, 2018:
When calling Faster R-CNN's shared_layers = nn.nn_base(img_input, trainable=True), this error is raised:
InvalidArgumentError: Shape must be rank 1 but is rank 0 for 'bn_conv1_1/Reshape_4' (op: 'Reshape') with input shapes: [1,1,1,64], [].
After reviewing, it was found to be a problem with BatchNormalization. The following code reports a similar error.
from keras.layers import BatchNormalization, Input
x = Input(shape=(1, 2, 2))
BatchNormalization(axis=1)(x)
Error: InvalidArgumentError: Shape must be rank 1 but is rank 0 for 'batch_normalization_1/cond/Reshape_4' (op: 'Reshape') with input shapes: [1,1,1,1], [].
There is no problem with the CPU version of keras 2.2.0; the problem is with the GPU version of keras. So the keras version is downgraded to 2.1.6:
pip3 uninstall keras
pip3 install keras==2.1.6 -i http://pypi.douban.com/simple --trusted-host pypi.douban.com
After that, no error is reported.
Author: small white north Source: CSDN Original: https://blog.csdn.net/weixin_40755306/article/details/84944008 Copyright statement: This article is the original article of the blogger, please attach the blog post link!
Yes, I added exact version numbers for the dependencies in my setup.py file, where I used keras==2.2.0. That mitigated the versioning error I was getting lately.
But going by the Stackdriver logs, that earlier anomalous error, "error": "Prediction failed: unknown error.",
is preceded by this error:
Prediction failed: predict() got an unexpected keyword argument 'stats'
I made a few modifications to the predict method in the MyPredictor class above to process a b64-encoded image sent in the JSON request:
def predict(self, instances):
    inputs = base64.b64decode(instances['image_bytes']['b64'])
    inputs = scipy.misc.imread(io.BytesIO(inputs))
    inputs = inputs[..., ::-1]
    [bboxes, probs, ratio] = self.preprocess(inputs)
    results = self.postprocess(bboxes, probs, ratio)
    return results
From the Cloud SDK, I'm requesting a prediction from that model version as:
with open('3.jpg', 'rb') as image:
    img = base64.b64encode(image.read())
instances = {'image_bytes': {'b64': base64.b64encode(img).decode()}}
name = 'projects/{}/models/{}/versions/{}'.format(PROJECT_ID, MODEL_NAME, VERSION_NAME)
response = service.projects().predict(name=name, body={'instances': instances}).execute()
It outputs:
>>> response
{"error": "Prediction failed: unknown error."}
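Two details of the request snippet above may be worth checking, though this thread does not establish that either one causes the failure. The image is base64-encoded twice (once into img, once more when building instances) while the predict method decodes only once, and instances is sent as a single dict although the prediction API describes instances as a list. A minimal sketch of a single-encoding round trip, using only the standard library; build_request_body and decode_instance are hypothetical helper names, and whether the service decodes {"b64": ...} values itself before a custom predictor sees them is an assumption left open here:

```python
import base64
import json


def build_request_body(image_bytes):
    """Encode the raw bytes once and wrap them as a one-element instances list."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return {"instances": [{"image_bytes": {"b64": encoded}}]}


def decode_instance(instance):
    """Reverse the single encoding step on the predictor side."""
    return base64.b64decode(instance["image_bytes"]["b64"])


raw = b"\xff\xd8\xff example bytes"  # stand-in for the contents of an image file
body = build_request_body(raw)
json.dumps(body)  # the body is JSON-serializable
assert decode_instance(body["instances"][0]) == raw

# Encoding twice and decoding once does not give the original bytes back.
twice = base64.b64encode(base64.b64encode(raw))
assert base64.b64decode(twice) != raw
```

With a list-shaped instances value, a predictor would iterate over instances rather than index it with a string key.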
But nowhere did I notice or pass a keyword argument 'stats', neither in the prediction request nor in the MyPredictor class.
Is there something I'm missing here?
The extended logs of the above error follow:
{
insertId: "5d231514000b3916750b34e9"
logName: "projects/project-281612/logs/ml.googleapis.com%2Fprimary.stderr"
receiveTimestamp: "2019-07-08T10:04:04.865116250Z"
resource: {
labels: {
model_id: "Mod_050519"
project_id: "project-281612"
region: ""
version_id: "v5_a"
}
type: "cloudml_model_version"
}
textPayload: "(07/08/2019 10:04:04 AM Prediction failed: predict() got an unexpected keyword argument 'stats'"
timestamp: "2019-07-08T10:04:04.735510Z"
}
Please take a look?
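One possible reading of this log: the serving framework calls predict() with extra keyword arguments (here stats), and the predict(self, instances) signature shown earlier does not accept them. As far as I can tell, the custom prediction routine sample linked in this thread declares predict(self, instances, **kwargs). The sketch below reproduces the effect in plain Python; StrictPredictor, TolerantPredictor and call_like_service are hypothetical names, and call_like_service only imitates a caller that passes stats, it is not the real service code.

```python
class StrictPredictor:
    def predict(self, instances):
        # No **kwargs: any extra keyword argument raises TypeError.
        return list(instances)


class TolerantPredictor:
    def predict(self, instances, **kwargs):
        # Extra keyword arguments such as 'stats' are accepted and ignored.
        return list(instances)


def call_like_service(predictor, instances):
    """Hypothetical stand-in for a caller that passes a 'stats' keyword."""
    return predictor.predict(instances, stats={})


print(call_like_service(TolerantPredictor(), [1, 2, 3]))

try:
    call_like_service(StrictPredictor(), [1, 2, 3])
except TypeError as err:
    print("TypeError:", err)
```

If this reading is right, adding **kwargs to the predict signature in MyPredictor would be the smallest change to try.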
+1 we too are blocked by this bug.
@leoninekev @Dana-Farber Did you try this recommendation from a blog poster who had a similar problem:
There is no problem on the CPU version of keras 2.2.0, there is a problem with the GPU version of keras. So the keras version is reduced to 2.1.6:
pip3 uninstall keras
pip3 install keras==2.1.6 -i http://pypi.douban.com/simple --trusted-host pypi.douban.com
There are two bugs: one dealing with Keras and one with GCP version creation. The GCP version creation is our issue since we aren’t using Keras.
@dizcology reassigning per Dana-Farber comment that this is not Keras but GCP issue.
Hey Yu-Han can you take a look at this? Thanks!
@leoninekev Apologies for following this up so late - are you still experiencing the issues as mentioned above?
Hey @dizcology have you heard from the user?
No updates. Closing this for now.
@leoninekev @Dana-Farber please reopen this thread if you are still experiencing the issues.
Still have the same issue in 2023. Can someone help, please?
I'm implementing a Keras model for object detection. Training it in ml-engine successfully produced a model_weights.hdf5 file. To get online predictions for test images, I'm following the custom prediction routine suggested in the GC ai-platform documentation here https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/ml_engine/custom-prediction-routines/tensorflow-predictor.py to serve the model and its artifact code in the cloud for prediction.
For this I modified the MyPredictor class in the predictor.py module as follows:
When creating a version with the default Python 2.7, a version is created successfully, but when tested with a numpy array converted to a list in JSON format, it throws.
It outputs a list of bboxes and labels when run locally.
Also, on attempting to create another version with the --python-version flag set to 3.5, it now fails even to create a version, with an error:
I'd really appreciate any help or corrections to work around this and serve online predictions using this Keras weights file.