Syn-McJ / TFClassify-Unity

An example of using Tensorflow with Unity for image classification and object detection.

TFException: Expects arg[0] to be string but float is provided #22

Closed · AurekSkyclimber closed this issue 5 years ago

AurekSkyclimber commented 5 years ago

I'm trying to integrate my own trained model. It works well in Python, so I'm not sure why it doesn't work with TensorFlowSharp. Your included models work great, so it has to be something on my end.

Full error message is:

TFException: Expects arg[0] to be string but float is provided
TensorFlow.TFStatus.CheckMaybeRaise (TensorFlow.TFStatus incomingStatus, System.Boolean last) (at <6ed6db22f8874deba74ffe3e566039be>:0)
TensorFlow.TFSession.Run (TensorFlow.TFOutput[] inputs, TensorFlow.TFTensor[] inputValues, TensorFlow.TFOutput[] outputs, TensorFlow.TFOperation[] targetOpers, TensorFlow.TFBuffer runMetadata, TensorFlow.TFBuffer runOptions, TensorFlow.TFStatus status) (at <6ed6db22f8874deba74ffe3e566039be>:0)
TensorFlow.TFSession+Runner.Run (TensorFlow.TFStatus status) (at <6ed6db22f8874deba74ffe3e566039be>:0)
TFClassify.Classifier+cAnonStorey0.<>m0 () (at Assets/Scripts/Classifier.cs:48)

The Unity plugin is version 0.3 and TensorFlow is version 1.4, but Python rather than TensorFlowSharp. Is Python okay, or do I need to train using TensorFlowSharp? I'm training using the Google Inception model, as required for Classify. My training code is the one found here. Should I use the official Inception code found here instead? If so, which version did you use? INPUT_NAME ended up being "DecodeJpeg/contents" and OUTPUT_NAME is "final_result".

Thank you for providing this excellent resource for Unity and Tensorflow!

Syn-McJ commented 5 years ago

Hi @AurekSkyclimber,

It seems like you're trying to do object detection, no? In that case, you need to use the Detect scene, not Classify.

A model trained with Python is absolutely fine. I'm using code from the TF repo for both classify and detect training, choosing the 1.4 tag when training for the 0.3 plugin.

I'll try to train a model based on the repo you linked and check it.

AurekSkyclimber commented 5 years ago

Thank you so much for attempting to train on the model we are trying to use. Our end goal is to recognize a variety of hand gestures, so we really only need to recognize the one most likely option at each instant. Classify (and by extension, Inception) is probably the better option in this case.

I just realized I linked to the Detection model when I said Inception. My apologies... Here are some better links:

- Model we trained with
- Model it is apparently ripped off from
- Google blog post and official code related to the Inception model which that model is supposedly based off of

I hope this helps. I am currently working to understand how the official Inception model works since this is what Classify is using. With any luck I'll be able to train the model soonish.

By the way, if I grab the 1.4 tag of the models repo, the research folder has been purposefully removed. How can I download the 1.4 version of Inception?

(Edit: One last question: Should I train with the original Inception model or with the Slim classification model? There are different approaches to training with each.)

Syn-McJ commented 5 years ago

@AurekSkyclimber, I think you're complicating the task a bit. You don't really need deep knowledge of how the Inception model works (unless you're curious about it), and you don't even need to train it from scratch. You have a neatly organized dataset, so you can take a pre-trained CNN model and simply replace the top layers with ones trained on your dataset. It's called transfer learning, I believe. Because the pre-trained model has already learned everything about images, you just need to tune it to your specific data.

Google has a bunch of those models trained on the ImageNet dataset, and they have a simple script for retraining them for your purpose: here it is for the 1.4 version. Usage is something like this:

python retrain.py --image_dir ./dataset --architecture inception_v3

Here is a good blog post on how to use the retraining script; the only difference is that you'll need to specify the inception_v3 architecture instead of mobilenet, and then specify the correct input/output names when labeling images.
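
As an aside, if you're not sure which input/output node names your retrained graph ended up with, a quick way to check (just a rough sketch; the graph filename below is a placeholder) is to load the GraphDef in Python and print the node names:

import tensorflow as tf

# Load the retrained graph (placeholder path) and list its nodes.
graph_def = tf.GraphDef()
with tf.gfile.GFile("retrained_graph.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

# The first nodes usually include the input, the last ones the output (e.g. "final_result").
nodes = list(graph_def.node)
for node in nodes[:5] + nodes[-5:]:
    print(node.name, node.op)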

I'll update you when I finish training and checking it; you can try Google's script in the meantime.

Syn-McJ commented 5 years ago

Hi @AurekSkyclimber,

It seems like you're using the wrong input name. The right one would be "Mul". I trained a model using the repo you linked, and the "Mul" input name works fine with both the label_image script from the tensorflow repo and my Unity example.
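
If it helps, checking a retrained inception_v3 graph with the label_image script should look roughly like this (flag names from memory for the 1.4 examples; the graph, labels and image paths are placeholders):

python label_image.py --graph=retrained_graph.pb --labels=retrained_labels.txt --image=test.jpg --input_layer=Mul --output_layer=final_result --input_height=299 --input_width=299 --input_mean=128 --input_std=128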

However, there is another problem with the Unity example:

TFException: No OpKernel was registered to support Op 'DecodeJpeg' with these attrs.

This is a common problem with the Inception model. Mobile platforms do not support the DecodeJpeg op, which makes an Inception model that contains this operation unsuitable for use on mobile devices.

The solution would be either to try to train the Inception model without this operation (which, unfortunately, I'm not sure how to do, but you can probably google something) or to train another type of model - mobilenet, for example. Mobilenet is suited for use on mobile platforms, so you won't have any trouble with it.
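
If you want to experiment with the first option, one thing that might be worth trying (just a sketch, I haven't verified it on this exact graph) is the strip_unused tool from the TensorFlow repo, which cuts the graph down to a chosen input node ("Mul" here), so the DecodeJpeg preprocessing ops get dropped:

python -m tensorflow.python.tools.strip_unused --input_graph=retrained_graph.pb --input_binary=true --output_graph=stripped_graph.pb --output_binary=true --input_node_names=Mul --output_node_names=final_result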

AurekSkyclimber commented 5 years ago

Thank you for spending so much time helping someone new to Tensorflow out! I had no idea that a more straightforward retrain script existed. Weirdly enough, the Classify and Detect examples don't provide any info on how to train other than the original complicated models. At some point I'll want to learn how the model actually works, but this is definitely more convenient for training purposes. Thank you so much!

Syn-McJ commented 5 years ago

Glad I could help. Btw, the repository you used to train your model uses the exact same transfer-learning method; it is almost identical to Google's retraining script.

In any case, both methods will have the "Mul" input name and both will use the DecodeJpeg operation, which makes the Inception model unusable on mobile platforms. I'd recommend training a Mobilenet model first to prototype your app, and then figuring out how to train Inception without this operation if you need higher accuracy.
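
For the Mobilenet route, the same retraining script should work with a different architecture flag, something like the line below (the available architecture strings are listed in the script itself; if I remember correctly, the input name of a retrained Mobilenet graph is "input" rather than "Mul"):

python retrain.py --image_dir ./dataset --architecture mobilenet_1.0_224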

I'll close this issue, feel free to reopen if you have more questions.