psmoveservice / psmove-ue4

Plugin for using PSMove as input into Unreal Engine 4. Currently obsolete. Please use PSMoveService.
GNU General Public License v2.0
59 stars 28 forks source link

Windows 7x64 Slow Orientation Tracking #18

Open Ixstala opened 8 years ago

Ixstala commented 8 years ago

Hi, Great work getting all of this together and working rather easily on Windows. I've got your Psmove libraries working in my graphics framework and the tracker seems to be working fine (even great) and reports >400FPS tracking a single move.

The problem arises when I try to get orientation estimates using psmove_fusion_get_projection_matrix & psmove_fusion_get_modelview_matrix; the orientation updates very slowly, ~1 or more seconds of lag. This affects psmove_get_accelerometer_frame as well.

I checked the raw accelerometer, magnetometer, & Gyro data, it is streaming fast, so I'm not sure why the orientation estimate is so slow. The orientation tracking works perfectly with your as built binaries. I've run the magnetometer calibration too and both files are in my AppData directory.

Any ideas?

If I can't get it working I'll probably just do my own implementation of the sensor fusion bit, although I would like to avoid that - it's so close to working!

Thanks!

brendanwalker commented 8 years ago

@rovarma Some updates on my testing

The fork you linked to appears to contain the vanilla version of PS3EYEDriver (ie. without my changes). Is that the version you ran the usb monitor against?

Ah oops, I meant to link to my "rovarma-optimization" branch that includes your changes:

https://github.com/brendanwalker/PS3EYEDriver/tree/rovarma-optimization

This version has the updateDevices call commented out in the update loop of the test app as well as your other changes I brought over:

https://github.com/brendanwalker/PS3EYEDriver/blob/rovarma-optimization/sdl/main.cpp#L114

This was the version that I used to get those graphs from. I also finally got around to recording me running my build of ps3eye_sdl:

https://youtu.be/2cVb4bHNKuQ

In addition to watching me be a spaz, you can see there are a few minor hitches in the video, but overall it holds at a solid 60fps, with that sleep() commented out:

https://github.com/brendanwalker/PS3EYEDriver/blob/rovarma-optimization/sdl/main.cpp#L206

Independent of this, I tried incorporating your ps3eye changes into my branch of psmoveapi used for psmove-unity5. The test_tracker and test_opengl tools in that fork are definately snappier than they were before. I then incorporated that new version of psmoveapi into a branch of psmove-unity5 to test and it's looking good there too:

https://youtu.be/qI_lyUydsbc

On a side note, I noticed that if I closed the ps3eye_sdl test I was getting a consistent double-delete crash in the Semaphore::Destroy() call here:

https://github.com/brendanwalker/PS3EYEDriver/blob/rovarma-optimization/src/ps3eye.cpp#L141

I added an check to make sure that it wouldn't crash if you call that method twice, but I'm not sure if that is the right place to make that check. It seemed to fix the crash for me though.

rovarma commented 8 years ago

Hmm. Thanks for the info. That's really weird to me.

If you're using my version, I really wouldn't expect the sleep in the main loop to impact the bandwidth of the camera at all. The entire point of my change is that the data transfer from the camera is decoupled from the rate at which frames are actually consumed.

With that sleep there, what I would expect is that you'd potentially present at < 60 FPS, but that the driver would continue grabbing at 60 FPS, so maintaining a steady 36 MB/s. That's clearly not what's happening for you though.

I do notice that the SDL example is also polling the frames from a separate thread, which is then polled again by the mainthread. Perhaps there's some weird interaction going on there? I'll try and have a look at your branch tonight, see if I can reproduce anything.

About the frame corruption: Does the frame corruption appear only in the SDL test app? Or does it also appear in, say, test_tracker? If it's only in the SDL app, I would imagine it's simply due to the fact that the code is not thread safe.

There is a thread memcpy'ing data from the driver into the ctx->buffers data structure, while the main thread is at the same time continuously memcpy'ing data from ctx->buffers into the framebuffer. This is all completely unsynchronized, so very possible for the main thread to present a half-memcpy'd buffer/other random garbage (for example), which would look like corrupted frames.

I would recommend removing that separate polling thread in the SDL app: simply get the frames directly from the driver (you don't even need the various yuv422_buffer_t structures anymore) and present those. If you make this change, I wonder if the USB bandwidth drop is still present when using SDL_Delay(10). Can you test this (if you're still awake)?

Glad to hear that the FPS is otherwise nice and steady at 60 FPS though!

rovarma commented 8 years ago

BTW @cboulay @brendanwalker I had a think about the code structure of the new driver some more and I think that with the change that the producer thread will no longer block if the buffer is full, I can simplify the change quite a bit, keeping it very close to the original driver API. I'll try and test this tonight in a new branch.

rovarma commented 8 years ago

@cboulay Just tried including < thread > in ps3eye.cpp again. These are the errors I'm seeing:

C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include\ratio(122): error C2065: 'INTMAX_MAX' : undeclared identifier C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include\ratio(133) : see reference to class template instantiation 'std::ratio<_Nx,_Dx>' being compiled C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include\ratio(124): error C2065: 'INTMAX_MAX' : undeclared identifier C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include\ratio(44): error C2065: 'INTMAX_MAX' : undeclared identifier C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include\ratio(217) : see reference to class template instantiation 'std::_Safe_mult<0x01,0x01>' being compiled C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include\ratio(36): error C2338: integer arithmetic overflow C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include\ratio(44) : see reference to class template instantiation 'std::_Safe_multX<0x01,0x01,false>' being compiled C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include\ratio(44): error C2039: 'value' : is not a member of 'std::_Safe_multX<0x01,0x01,false>' C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include\ratio(44): error C2065: 'value' : undeclared identifier C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include\ratio(44): error C2057: expected constant expression C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include\ratio(44): error C2039: 'value' : is not a member of 'std::_Safe_multX<0x01,0x0989680,false>' C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include\ratio(219): error C2975: '_Nx' : invalid template argument for 'std::ratio', expected compile-time constant expression C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include\ratio(116) : see declaration of '_Nx' C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include\ratio(219): error C2975: '_Dx' : invalid template argument for 'std::ratio', expected compile-time constant expression C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include\ratio(117) : see declaration of '_Dx'

Including < thread > in a different solution/project works just fine, however, so it has to be something specific to either ps3eye.cpp or to the psmoveapi project configuration. I'll try and see if I can get this to work so I can ditch the homegrown threading primitives.

In addition, as I mentioned in my post above, I think with the change of FrameQueue::Enqueue no longer blocking if the buffer is full, I can get rid of the thread-per-camera approach and just run the event loop in a single thread. I need to test this performance-wise (especially in a multi-camera scenario), but if it works, it should simplify the change quite a lot; the API would remain largely the same. That, in turn, will probably help with upstream adoption.

I hope to be able to test this over the weekend. I'd recommend holding off on merging my current optimization branch into any of your projects until I get this part sorted; the resulting code/change is likely to be much smaller and cleaner.

@brendanwalker I tried out your fork of ps3eye_sdl (with my optimizations) and I cannot reproduce the USB bandwidth drops when putting the SDL_Delay(10) in. However, I do get some occasional frame corruption. Turning off the thread polling the camera in main.cpp and just polling the camera in the mainloop directly makes this go away for me.

The fixed main loop looks like this:

while (ctx.running) {
    while (SDL_PollEvent(&e)) {
        if (e.type == SDL_QUIT) {
            ctx.running = false;
        }
    }

    // TODO: Proper thread signalling to wait for next available buffer
    //SDL_Delay(10);

    uint8_t *new_pixels = ctx.eye->getFrame();

    SDL_LockTexture(video_tex, NULL, &video_tex_pixels, &pitch);
    memcpy(video_tex_pixels, new_pixels, ctx.eye->getRowBytes() * ctx.eye->getHeight());
    SDL_UnlockTexture(video_tex);

    free(new_pixels);

    SDL_RenderCopy(renderer, video_tex, NULL, NULL);
    SDL_RenderPresent(renderer);
}

Can you try and see if this also fixes the corruption you're seeing? I'm also curious if the USB bandwidth drops still happen if you switch to this main loop and put back the SDL_Delay.

brendanwalker commented 8 years ago

@rovarma I updated my version of ps3eye_sdl to rip out the thread polling in main.cpp and just call ctx.eye->getFrame() directly as you suggested. That completely fixes the frame corruption I was seeing. I also added back in the SDL_Delay(10) and it doesn't see to be a problem anymore. I should note that between now and the last time I tested this I added a new USB 3.0 expansion card that the PS3 eye is now connected to. Not sure if this should make a difference or not (I was plugged into another USB 3.0 port on my MOBO previously). In any event, everything seems to be running well now. These changes have been checked into my branch

rovarma commented 8 years ago

@brendanwalker Cool, that's good to hear. I'm currently working on simplifying the change, should be done soon.

BTW, you probably want to free(new_pixels); in the loop somewhere; you're leaking 36 MB/s now ;)

rovarma commented 8 years ago

@cboulay @brendanwalker I've finished the clean version of my changes. It's here.

Can you guys do another test to see if performance is still solid for you? @cboulay Can you try building this with OSX and/or Linux again to see if anything is missing?

If everything still works okay and builds on other platforms, I'll make a pull request to inspirit's repo for all of this.

cboulay commented 8 years ago

Looks great. I won't have time to try it until tonight (about 11 hrs from now).

brendanwalker commented 8 years ago

@rovarma I just merged your latest changes from your "optimization-clean" branch. The only differences I have now is that I left in the FPS counter in main.cpp and my msvc project files (since I don't have Mingw installed). My forks of ps3eye_sdl and psmove api built and ran great. I also updated my "rovarma-optimization" branch of psmove-unity5 and everything runs great there too.

cboulay commented 8 years ago

I knew I shouldn't have promised a time. I was asked to sub for my old ultimate team. I'll do this when I get back.

cboulay commented 8 years ago

@rovarma The ps3eyedriver.h/.cpp files do not exist in the inspirit upstream. Those files were created by thp to give a C API. While it's not a bad idea for inspirit to provide a C API, so it might get merged upstream, I think there might be more acceptance it if the filenames were changed. i.e., ps3eye_capi.h/.cpp or something similar. This was a point of confusion for me when I first started with ps3eyedriver. I think both repos will be better off with this change but it's not a major issue.

+1 for adding the FPS counter

I had to modify the psmoveapi/src/CMakeLists.txt so Xcode would support C++11, necessary for <atomic> and <threading>.

With those changes, the camera worked well for me. I tested ps3eye_sdl and test_opengl. I don't have a PSMove with me so I can't comment on the effect on tracking.

rovarma commented 8 years ago

Thanks for testing.

I'll add the FPS counter in and make the required changes for C++11 support on XCode tonight, then see about opening a pull request to inspirit/thp.

rovarma commented 8 years ago

@cboulay @brendanwalker I've added the FPS counter to the SDL app + renamed the ps3eyedriver.h/cpp here. Also modified psmoveapi/src/CMakeLists.txt as @cboulay said to make it build under XCode.

I've opened PRs in all relevant places for all of this (https://github.com/inspirit/PS3EYEDriver/pull/21, https://github.com/thp/PS3EYEDriver/pull/15, https://github.com/thp/psmoveapi/pull/200).

Thanks for testing and helping out.

brendanwalker commented 8 years ago

@rovarma Thanks for implementing this optimization! It's been nice using the snappier camera tracking here at work. And it's allowed me to can rip out my janky implementation of the CLEye driver support I had in psmove-unity5 (since the ps3eye driver is now delivering frames just as fast).

rovarma commented 8 years ago

No problem! Glad it works for you.

KRSteinke commented 8 years ago

Awesome work rovarma! Your work is greatly appreciated! Even by us lurkers.

brendanwalker commented 8 years ago

@rovarma @cboulay FYI I just found one small issue with the latest camera driver that only manifests if you attempt to close and re-open a camera in the same process. In all our test apps we've never do this (open camera on start-up, close on shut-down). However in both psmove-ue4 and in psmove-unity5 we setup and teardown the tracker context whenever play is hit on the game. The USBMgr singleton has a exit_signaled flag used to tell the camera thread to exit. Since USBMgr stays around for the lifetime of the process, the exit_signaled flag wasn't getting reset after the camera thread finished shutting down. The next time you try to start up the camera thread it exits right away and then the FrameQueue::Dequeue() function will block forever waiting for a new frame that will never come. I made a fix in my fork of PS3EYEDriver here that resets the flag.

rovarma commented 8 years ago

@brendanwalker Ah, good catch. Thanks for the fix. Can you open a PR for that to inspirit's repo, or shall I?

brendanwalker commented 8 years ago

I can open a PR tonight. I've been meaning to bring my fork of PS3EyeDriver in line with inspirits anyway. EDIT: PR submitted to inspirit/PS3EYEDriver:master: https://github.com/inspirit/PS3EYEDriver/pull/22