
DOODS - Dedicated Open Object Detection Service
MIT License

CPU supports instruction this binary was not compiled to use #21

Closed MYeager1967 closed 4 years ago

MYeager1967 commented 4 years ago

I'm trying to install this on a Synology NAS. As long as I use the tflite engine, things are golden. Once I try to switch to tensorflow, I get this message, and then when something does attempt an image scan it takes forever and consumes tons of memory. Do I need to compile my own tensorflow binary or something?

snowzach commented 4 years ago

Most likely. Try using the docker tag noavx and see if that works.

MYeager1967 commented 4 years ago

Running the dockerfile now. Will let you know how it goes. Getting a lot of warnings but I've seen that in compiles before...

snowzach commented 4 years ago

You could also try building it... It may take a LONG time, and I'm not sure a NAS has enough resources to build tensorflow. There's a branch called rebuild, and if you build using the base Dockerfile there, it should be optimized for whatever features your NAS CPU has.

MYeager1967 commented 4 years ago

Personally, I couldn't care less how long it takes. What kind of resources does it need? This thing is lightning fast at scanning images, and if tflite were better at identifying objects, I'd leave it as is. I may eventually wind up with a NUC running some of this stuff, but I'm trying to make do until I see a need.

MYeager1967 commented 4 years ago

I'm going to ask a stupid question that I should have asked a long time ago. When you said to try the noavx tag, was that an option when launching the container? I'm several hours into running the dockerfile that I hope will generate an image that works, but I'm going to kick myself if a simple tag added to the docker run command would have accomplished the same thing...

snowzach commented 4 years ago

noavx is one of the tags you can pick... Building your own image will result in an image optimized for your specific processor. It's probably the best way to go. Using noavx might have worked... might not have, either.
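For reference, the tag goes in the image name rather than on the command line — something like this, assuming the image is published as `snowzach/doods` on Docker Hub:

```shell
# The tag selects a prebuilt variant; it is part of the image name,
# not a flag on docker run.
docker pull snowzach/doods:noavx
docker run -it -p 8080:8080 snowzach/doods:noavx
```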

MYeager1967 commented 4 years ago

Running the dockerfile left me a container that apparently did nothing. How do you use the noavx tag? I tried --noavx and it kicked it back. What was supposed to happen when that dockerfile finished?

snowzach commented 4 years ago

What command did you use to build the container?

You should be able to tag it:

```shell
docker build -t doods:nas .
```

and then run it with:

```shell
docker run -it -p 8080:8080 doods:nas
```

MYeager1967 commented 4 years ago

I simply navigated to the directory and used `sudo docker build .` Will try again today with the tag, although I'm having issues getting it to take. Really want to have a good, reliable version of this as it kicks ass...

OK, have it running again. Command used was `sudo docker build . -t doods:nas`

snowzach commented 4 years ago

Cool! If you're good, I'll close this ticket.

MYeager1967 commented 4 years ago

I wish they had a way to contact you without opening a ticket. Anyhow, when the build is done, am I still going to have to map out to the config and tensorflow data, or is it being built into the container? If you leave this open for now, I can still contact you with questions and/or issues. If you need to close it, is there another way to contact you?

Where would I find that rebuild branch in case I want to try that route? Edit - Look under "branches", dummy.... :-)

My NAS is "Apollo Lake" architecture and is x64. It's an Intel chip (Atom). What dockerfile would you suggest to get the best chance of success? Also, is it possible to build the full tensorflow models right into the container? If I'm building it anyway, why not have it ready to go, right?

snowzach commented 4 years ago

If you use just the Dockerfile, that should build the best image for your nas. You can edit the file to add more models if you like.

MYeager1967 commented 4 years ago

I guess I should have asked the question differently. What is already in the build as is? I'll look at it and see if I can figure it out, but I'm not that good with actually building docker images from the ground up. The processor in my NAS doesn't support AVX, so I'm hoping the noavx build will solve the problem. After that, I'll see if I can figure out the script to get the proper models and labels file. I'm assuming at this point that the tflite models are still part of the build and I'll need external models to use tensorflow.

I see the rebuild branch has a standard Dockerfile and a Dockerfile.noavx, so I'm guessing I'd still want the noavx version. I'd hate to spend hours compiling this monster only to find out it was built with AVX even though my processor doesn't support it. Maybe I should just spring for that sexy i7 NUC and get it over with... :-)

MYeager1967 commented 4 years ago

Ok. Started a rebuild using the plain Dockerfile found under the rebuild tree. The noavx version still gave me the warning about the CPU supporting instructions the binary wasn't compiled for. Also got 404 errors in the log when I tried to run it. What fun!

MYeager1967 commented 4 years ago

```
Step 38/50 : RUN make
 ---> Running in 22cdd494443e
make: *** No targets specified and no makefile found.  Stop.
The command '/bin/sh -c make' returned a non-zero code: 2
Mike@DS418Play:/volume1/docker/test$
```

So close, yet so far.... Stopped right here. The hard part is done and there's likely no chance in hell of salvaging it at this point. Didn't realize I was going to need the makefile. Needless to say, I'm not a docker master....

I grabbed the makefile and restarted the process. It appears as though it's all cached somewhere, and it went fairly quickly up until it got to the makefile. Then it tells me this:

```
Step 38/50 : RUN make
 ---> Running in c95ba8ec1401
fatal: not a git repository (or any of the parent directories): .git
go list -m: not using modules
GO111MODULE=off go get github.com/gogo/protobuf/proto
go get github.com/gogo/protobuf/protoc-gen-gogoslick
go get github.com/grpc-ecosystem/grpc-gateway/protoc-gen-grpc-gateway
go get github.com/grpc-ecosystem/grpc-gateway/protoc-gen-swagger
make: *** No rule to make target 'server/rpc/version.pb.gw.go', needed by 'doods'.  Stop.
The command '/bin/sh -c make' returned a non-zero code: 2
```

It's killing me man!!! I'm going to try this again, but at 7+ hours per shot, I'm waiting for any input you may have as to why it died...

MYeager1967 commented 4 years ago

Any idea why this isn't compiling? I have the dockerfile and the makefile in the directory that I'm trying to build it from. What other files do I need there? It's apparently choking on the server/rpc/version.pb.gw.go part...

MYeager1967 commented 4 years ago

I figured out how to get the image to build. Off to the next possible roadblock...

snowzach commented 4 years ago

Awesome! Sorry man, my personal life is insane right now and I haven't had any time to work on this. I want to make it much easier for people to build so they can have the custom instructions. Was there anything wrong with the files that I can repair, or was it just some other issue?

MYeager1967 commented 4 years ago

Well, I finally downloaded the zip and unzipped the entire thing to a folder. After that, it just ran. Still not certain it's going to work, as I posted in the other issue concerning the 404s, but it compiled. I figured there might have been an issue with the build since it used cache for the final build after all those screwed-up attempts, so last night I built it again with the --no-cache flag. It's sitting there waiting for me to run it now... Best I can tell, the build issue was mine, as recopying the files and setting it loose worked perfectly.
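For anyone hitting the same make errors: the Dockerfile's `RUN make` step needs the whole source tree in the build context, not just the Dockerfile and Makefile. A sketch of the fix, assuming git is available on the NAS (downloading and unzipping the branch accomplishes the same thing):

```shell
# Get the complete rebuild branch so make can find everything it needs,
# then build from the repository root.
git clone -b rebuild https://github.com/snowzach/doods.git
cd doods
sudo docker build . -t doods:nas
```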

hmmz77 commented 4 years ago

I've been following this as I'm trying to compile it on my Synology NAS as well... do you mind sharing what commands you used to get it to build correctly? I've copied the rebuild branch to the NAS, how did you run the Dockerfile?

MYeager1967 commented 4 years ago

Give me a bit and I'll post it. What NAS are you using?


hmmz77 commented 4 years ago

Thanks for the reply. Using a DS918+. It's currently compiling using the command you gave above, `sudo docker build . -t doods:nas`. I SSHed into the rebuild folder on the NAS, which I copied from GitHub, then ran that. I hope that's the right way...

MYeager1967 commented 4 years ago

That's exactly what I did. You have about 9 hours before it's done. Your system uses the same processor as mine, so if yours runs, I'd like to get a copy of it. I'd offer you a copy of mine, but it only seems to work with the tflite model at the moment. Gives 404 errors when trying to run the tensorflow detector. You're welcome to it anyhow if you'd like...

hmmz77 commented 4 years ago

Thanks for doing the detective work and writing up what you did. I'll let you know how the compile goes. Only about 2 hours in so far. Thanks for the offer, I'd like tensorflow to work too as the accuracy of tflite isn't good enough. Happy to share if it does!

MYeager1967 commented 4 years ago

Really hoping Snowzach will pop in and say "oh yeah, I know how to fix that", but he's gotten busy with something in the real world and we simply have to hope that everything is alright there. Until then, we plug along and hope to find something that works. The alternative for me is an i5 NUC to move stuff onto. I know I could probably get away with something a little lower powered, but if I'm buying hardware...

While you're waiting, you can always open another SSH session and run the script to get the models. I wound up doing it copy and paste one line at a time because I couldn't get the script to execute, but your mileage may vary...

MYeager1967 commented 4 years ago

hmmz77, you have a dropbox? I figured out my issue with the 404's and am now chasing down how to get it to choose the detector I want... Progress is being made (I think). Still can't get it to run the tensorflow model....

Started another build with the SSE4.1 and SSE4.2 flags set. I always get a warning that the binary wasn't built for them and it always defaults to the tflite model. I'm going to figure this out as I don't want to spend the money on another piece of hardware right this moment. It's a principle thing....

hmmz77 commented 4 years ago

The good news... it compiled without any errors and works exactly the same as the prebuilt noavx build.

The bad news... it's exactly the same as the prebuilt version... it still comes up with the error about not supporting SSE etc. Exact same speed.

Could you let me know how you set it to build with those flags?

Your 404 error is fairly simple though... I know because I was banging my head against it for a few days. It's probably because the config in HA is using the cut-and-paste config which says "detector: default". If you also look at the doods config in docker, there are names for each detector. The TFlite one is called "default". You just need to change either the docker config name to "default" or the HA config detector to whatever you called the other models in the docker config.
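In other words, HA's `detector:` value has to equal one of the `name:` values in the doods config — a minimal sketch (key names and paths illustrative, not a complete config):

```yaml
# doods side: the detector list
doods:
  detectors:
    - name: default      # HA's "detector: default" selects this entry
      type: tflite
      modelFile: models/coco_ssd_mobilenet_v1_1.0_quant.tflite
```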

hmmz77 commented 4 years ago

For anybody watching at home.. I used these models and timed roughly how long they took to process. These are the ones I've tested to work on the NAS.

| Model | Time |
| --- | --- |
| ssd_mobilenet_v1_coco_2018_01_28.pb | 0.402 sec |
| coco_ssd_mobilenet_v1_1.0_quant.tflite | 0.297 sec |
| faster_rcnn_inception_v2_coco_2018_01_28.pb | 8.5 sec |
| ssd_inception_v2_coco_2018_01_28.pb | 1.2 sec |

MYeager1967 commented 4 years ago

I have it compiled. Still can't get the tensorflow detector to run, but it is faster. If you've got a dropbox, I'll send it to you. I no longer get the SSE4.1/SSE4.2 warnings though... :-)

hmmz77 commented 4 years ago

I'd really like if you could please send it.

Here's my dropbox.

Let me know how you go with the config.

MYeager1967 commented 4 years ago

Actually, I just pushed it to the Docker Hub. If you search the registry real quick, you should see it. It's myeager/doods-apollo-lake:nas or something close. Let me know if you see it. WAY quicker than dropbox. Still working on trying to get it to use the tensorflow model. Comes right up and doesn't give me any of the crap from before, but still defaults to the tflite model. Says it loaded the tensorflow detector though...

hmmz77 commented 4 years ago

Thank you! I'll give that a try.

Maybe I didn't explain properly before. The 404 error happens because of a config mismatch.

HA uses the config "detector: default" to query the docker for "name: default".

On the example config the tensorflow model is called "name: tensorflow", so you need to set "detector: tensorflow" in HA. Alternatively, you can just modify the docker config to something like this:

```yaml
- name: default
  type: tensorflow
  modelFile: models/faster_rcnn_inception_v2_coco_2018_01_28.pb
  labelFile: models/coco_labels1.txt
  width: 224
  height: 224
  numThreads: 4
  numConcurrent: 1
```

Remember to download the labels from https://raw.githubusercontent.com/amikelive/coco-labels/master/coco-labels-2014_2017.txt and rename the file to coco_labels1.txt

hmmz77 commented 4 years ago

Okay conclusion....

No more SSE errors, practically the same speed.

One of those cases where the effort is not really worth the reward. :(

MYeager1967 commented 4 years ago

I already had the coco_labels1.txt file in my models directory. I made the changes above and no joy. Now I get a message about exceeding 10% of system memory. Please tell me your luck is better and you can get me through this!

hmmz77 commented 4 years ago

That error is normal... it's a big fat model. Takes like 16 seconds to process the first time, then 8 seconds on the second attempt.

Try using ssd_mobilenet_v1_coco_2018_01_28.pb. Still a tensorflow model. Can get it from the model zoo. https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md

EDIT: Forgot to mention you need to rename the file frozen_inference_graph.pb to the model name. That's the tensorflow file.
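The rename step looks something like this — a sketch using a placeholder file, since the real frozen_inference_graph.pb comes out of the model zoo tarball:

```shell
# The model zoo tarball extracts to a directory containing
# frozen_inference_graph.pb; doods wants the .pb named to match the
# modelFile entry in its config, so move and rename it accordingly.
MODEL=ssd_mobilenet_v1_coco_2018_01_28
mkdir -p "$MODEL" models
touch "$MODEL/frozen_inference_graph.pb"   # placeholder for the extracted graph
mv "$MODEL/frozen_inference_graph.pb" "models/$MODEL.pb"
```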

MYeager1967 commented 4 years ago

Gotcha. Now it runs, but it's SLOW! Going after the mobile coco files. Now the question is, are the mobilenet files anywhere near as good as the regular models? The accuracy has to be better than the tflite models so it's not a total waste.... I'm really not using it for much more than spotting people anyhow.

With the mobilenet file, it's about the same speed as the tflite detector was. If the accuracy is better, I'll call it a win (and a learning experience). I've learned quite a bit about docker over the past few days. I'll forget most of it and have to look it back up, but that's normal. I don't use it that much...

Just saw the post about the times. I'm taking right at a second to process an image (1080p from video) with the mobilenet model you told me to get. I'm guessing I have a bit more going on on my machine than you do. I'll be happy enough with that if the object identification is accurate. I can't take a trash can getting 68% confidence of being a person....

hmmz77 commented 4 years ago

Well, it's always going to be a balance... there are quite a few models which I found just didn't run at all. If you find one that strikes a better balance, let me know. I'm currently using ssd_mobilenet_v1_coco_2018_01_28.pb; that takes about 0.4 sec to run for me. Only 1 docker container though, with 12 GB of RAM.

I switched to this one after tflite identified a pot plant as a person. It works well during the day, but at night you'll have to adjust the confidence levels from where they are right now.

Just trial and error really...

MYeager1967 commented 4 years ago

I'll play with it a bit. I have HA and motionEye containers running along with this one. Have a few others but they take almost no resource. That's the reason I'll eventually move some of this to a NUC. Small, powerful and low power requirements. My NAS also runs an Emby server as an application. That shouldn't be using much at the moment though. I credited you on the docker hub page for helping me get to the end of this journey. I might have found someone else that could push me over the line, but you were there to do it... :-)

hmmz77 commented 4 years ago

More grunt will definitely be appreciated... might be better just to get the Coral stick if all the extra you need is faster tensorflow and everything else works fine.

Haha no worries. You saved me hours of compiling and digging how to's!

Oh, and massive thanks to snowzach for DOODS and for making this whole adventure possible!

snowzach commented 4 years ago

Hey guys, it's really a tradeoff. The Coral stick will only help you with tensorflow lite models (which are generally the mobilenet models). The mobilenet models only work with smaller-resolution images. DOODS will resize the image down to, say, 300x300 before running it through mobilenet, at least on the tflite detector. It may not need to when using mobilenet with tensorflow. Inception uses the full-resolution image, which is why it takes a lot longer, but you get a lot more accuracy as well. I'm still thinking about how this could be optimized to still use the full inception model while staying usable and fast.

hmmz77 commented 4 years ago

Thank you for the reply!

Now it makes sense why there is such a huge difference in processing time.

Is there any way to choose how doods will resize the feed? Ideally I'd like to get the processing time between 0.5 - 1 sec with an inception model.

snowzach commented 4 years ago

DOODS only resizes images if the model requires a specific size. Mobilenet requires 300x300; inception doesn't require a size. If you want to resize, you'll need to do it before it gets to DOODS. The way these models usually work, I'm honestly not sure if it will make things faster or not. If I ever get any time, maybe I can add an option to do that. Another couple weeks and I should have more time to check it out.
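If you do want to shrink frames before they reach DOODS, one option is to do it on the client side — a sketch assuming ffmpeg is available (filenames and target size are illustrative):

```shell
# Downscale a snapshot to 480 px height (width scaled to keep the aspect
# ratio even) before posting it to DOODS.
ffmpeg -i snapshot.jpg -vf scale=-2:480 snapshot_small.jpg
```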

snowzach commented 4 years ago

I added #23 to remind me to do it

hmmz77 commented 4 years ago

I just tested ssd_inception_v2_coco_2018_01_28.pb with the lower resolution feed of my camera vs the normal high quality feed. Same processing time. Thanks for spending time on this. Amazing work.

MYeager1967 commented 4 years ago

I'm getting WAY better accuracy out of the ssd_mobilenet_v1_coco_2018_01_28 file than I got out of the default tflite detector. That said, I'm quite happy at this point as the processing time still hovers at about 0.8 to 1.1 seconds and I'm feeding it a 1080p image. Now to tweak the confidence just a bit. It's identified people very well so far, but the neighbor's car across the street has a tire that shows up as a person (very low confidence though). My doods-nas image has a few pulls already as well. I'm glad I put it out there if it helps others. The processing time difference between the optimized nas version (SSE4.1/4.2) and the original nas version is about 3 seconds with this model. It's a jump!

snowzach commented 4 years ago

That's great! Honestly, I haven't messed with the mobilenet model on full Tensorflow much. Perhaps @hmmz77 should try that one and see if it works for him.

MYeager1967 commented 4 years ago

He's the one that told me to try it. Without his help, I'd still be wondering why I couldn't get the tensorflow detector to work. When I tried to use it, I got 404 errors or warnings about using too much memory. He set me straight on what was going on and why, plus how to fix it.

Is there something that limits how many objects are detected in a frame? I had four people in the picture but it only identified one...