RosettaDrone / rosettadrone

MAVlink and H.264 Video for DJI drones
BSD 3-Clause "New" or "Revised" License
306 stars 116 forks source link

Video lost when we swap between map and video in main screen (Applies to "WORK" branch) #47

Closed The1only closed 1 year ago

The1only commented 3 years ago

We now got the google map working, and can set the street map of the satellite image, however when we swap back and forth between the small map and video (by hitting the small tile), we lose video.

m4xw commented 3 years ago

fixed on my fork. Issue was the surface being killed so i created one via EGL and use the setOutputSurface (android 23+) api of mediacodec to swap it around when surfaceDestroyed to off-screen and later swap it back. That keeps the decoder alive and happy. Will find its way into the PR soon.

Fwiw with some minor tweaks, should work for all drones. The transcoder -> decoding stuff isnt ideal but fixes a shitton of garbo frame issues that appear, the transcoded feed is then duped and raw transmitted as h264 to GC, so all that together with re-issued keyframes from the transcoder, allows the GC stream to function at the same time as the rosetta video feed and it can connect later and actually have a keyframe available.

kripper commented 1 year ago

See new vars useOutputSurface and useCustomDecoder. The video code is a mess and must be cleaned and separated from MainActivity. The new Plugin interface could be used to keep the code easier to maintain. The video code should also be remerged with the latest version of the DJI Video Sample App. This version includes Rosetta's RTP streaming code: https://github.com/kripper/DJI-Android-VideoStreamDecodingSample

m4xw commented 1 year ago

A mess is a understatement, bunch of friday nights went into rewriting that on my end

kripper commented 1 year ago

I'm sure every line of code costed many hours of research and testing.

m4xw commented 1 year ago

Mostly the video stuff, but also mostly because DJI code was super cursed for over a dozen versions of my S21, so i was essential hunting ghosts to get whatever implementation doesnt choke the hw decoder in kernel :P

kripper commented 1 year ago

Yeah, I ended up sending YUV frames via TCP. RTMP was too slow and RTP has issues with the NALU splitting on the Mini (I believe).

The latest video sample app from DJI does things we are not doing on Rosetta. For example, they add a keyframe to the queue which is obtained by a magic resource ID that is different for every drone model.

Can you please check if it works with your mini 2? https://github.com/kripper/DJI-Android-VideoStreamDecodingSample

Winter is only good for hacking at home on stuff like the video issues.

Regarding the code, we need to:

My video sample app fork repo with minimal changes from Rosetta should finally replace all the video related code we have in Rosetta once it works.

In general, we should keep original package names to avoid doing changes of code that is maintained upstream.

m4xw commented 1 year ago

For example, they add a keyframe to the queue which is obtained by a magic resource ID that is different for every drone model.

Its just a prepared keyframe for the drones, their reference impl absolutely did nott work (on my phone) and hence i needed to do a roundabout

m4xw commented 1 year ago

Basically you need to transcode to recieve proper periodic keyframes

kripper commented 1 year ago

Please note that I never merged your pull request and I also rewrote the PID motion routines recently (and removed the incomplete cruising mode which was the only code I started to merge from your fork), so all your effort and time you worked hacking on the video is only preserved at the moment in your own experience and knowledge, but not in Rosetta's codebase.

It would be nice if you could somehow share the valuable results of your hard work (FWIW) via a pull request based on master.

m4xw commented 1 year ago

Cruising mode was not incomplete, the only thing missing was flying proper arcs on early wp handoff. Any change in behaviour to that will likely not lead to the desired result, didnt look at your changes tho. Regarding PID you essentially can't change it. The algos don't change and thus the values dont change if u feed it the same data. If you remove PID or change their individual tuning then the drone will fly like shit (no smooth lines) and it will always under/overshoot things

You arent testing it under real conditions, are you? :P

kripper commented 1 year ago

Btw, i added an internal simulator to test and debug PID code You can now fly a virtual drone in QGC without a real drone. It is enabled when your enter test mode (5 clicks on the logo)

m4xw commented 1 year ago

Btw, i added an internal simulator to test and debug PID code You can now fly a virtual drone in QGC without a real drone. It is enabled when your enter test mode (5 clicks on the logo)

Yea that doesnt work, simulator also simulates wind speed, drag coefficient among many other things (the result does not carry over)

m4xw commented 1 year ago

Also without proper PID, even in your sim, it might not show that, but it will constantly stop-counteract-stop-resume when its closing in on the target values if they arent finely tuned. The values will constantly be too high or too low and it needs precision to walk that edge

kripper commented 1 year ago

I didn't remove the PIDs of course. Just the code using the PID for cruising mode that I found on your pull request.

m4xw commented 1 year ago

I didn't remove the PIDs of course. Just the code using the PID for cruising mode that I found on your pull request.

Oh thats mandatory. If you set integral (iirc) not to zero it WILL counteract, thus removing the whole purpose of cruise mode.

Its supposed to stop SHORT of the target, let inertia do the rest and then continue to the next waypoint.

Yours will stick for a while on target until it confirms the GPS target, thats approx 3-5sec overhead per waypoint, quickly sums up!

m4xw commented 1 year ago

To do it with PID's and have the same behaviour you would need to calculate a arc from the pos, the target and the new waypoint and fly that and hijack the motion complete logic

m4xw commented 1 year ago

image

m4xw commented 1 year ago

Also, relevant for this issue (#47), did you also fix when its running in the background? I had to move it to a EGL offscreen buffer so QGC stream works when switching to the QGC app (rosetta in bg)

kripper commented 1 year ago

Yeah, during my initial "big merge of all Rosetta repos, forks and branches", I actually used the radius provided in the MAVLink message. I just reverted back to the original motion algorithms when I rewrote the motion code recently, because the drone was descending with no reason (maybe because of flying too fast while rotating the yaw at the same time). I'm not sure, but I also didn't want to break things. But the fly-in-radius and slow-down features can be added again.

Apropos, I believe the slow down should be soft. Maybe the derivative component can be used for that instead of using a "if(distance < constant) => switchToCruisingMode()". When ascending I noticed the drone reduced velocity violently on the last inches.

kripper commented 1 year ago

Also, relevant for this issue (#47), did you also fix when its running in the background? I had to move it to a EGL offscreen buffer so QGC stream works when switching to the QGC app (rosetta in bg)

I fixed some related issues (for example, there was a crash when MediaCodec was being used at the same time), but the code changed a lot and must be tested again.

kripper commented 1 year ago

Regarding video, I won't continue maintaining or working on video related code until 1) the DJI sample app + Rosetta's RTP streaming code [1] works for me (DJI Mini SE) and 2) we replace all the video code in Rosetta for this (upstream maintained) code.

[1] : https://github.com/kripper/DJI-Android-VideoStreamDecodingSample

m4xw commented 1 year ago

When ascending I noticed the drone reduced velocity violently on the last inches.

That happens when the integral is too big

kripper commented 1 year ago

Or because I was changing the PID components when close to the target. BTW, do you think we should reset the PID under some circumstances?

m4xw commented 1 year ago

reset is to prevent a overfit on the data (wind may change etc), so should be done periodically, the sweet spot i cant tell u.

m4xw commented 1 year ago

also you absolutely have to reset when changing the pid values

kripper commented 1 year ago

I believe we should reset explicitly the PID every time we change significantly the setpoint (target value). For example, when we receive a new GOTO command or when we reached a WP.

Anyway it seems that in MiniPID.getOutput() the accumualted error is automatically reset when it is too big.

m4xw commented 1 year ago

I believe we should reset explicitly the PID every time we change significantly the setpoint (target value). For example, when we receive a new GOTO command or when we reached a WP.

Anyway it seems that in MiniPID.getOutput() the accumualted error is automatically reset when it is too big.

I should remind you, this is a system where human lives could be on the line. By the time the error accumulated it may be too late already, generally a good reset point is between wp's, but it then also might show a boundary-jumping behaviour on the first signal inputs. PID isnt smart, it goes one direction until the signal doesnt go the way it wants to. So at first it may send output to go in the wrong direction and correct it next iter

kripper commented 1 year ago

That is the problem. If the WP are rounded and continous (connected, non-stop) and if we reset the PID in the middle we will have the boundary jumping behaviour. Currently we are not resetting at all.

MiniPID.getOutput() is exectued each 50 ms and resets this way:

        // Figure out what we're doing with the error.
        if (minOutput != maxOutput && !bounded(output, minOutput, maxOutput)) {
            errorSum = error;
            // reset the error sum to a sane level
            // Setting to current error ensures a smooth transition when the P term
            // decreases enough for the I term to start acting upon the controller
            // From that point the I term will build up as would be expected
        } else if (outputRampRate != 0 && !bounded(output, lastOutput - outputRampRate, lastOutput + outputRampRate)) {
            errorSum = error;
        } else if (maxIOutput != 0) {
            errorSum = constrain(errorSum + error, -maxError, maxError);
            // In addition to output limiting directly, we also want to prevent I term
            // buildup, so restrict the error directly
        } else {
            errorSum += error;
        }
m4xw commented 1 year ago

ramp it up? Slight overshoots are normal pid behaviour tho, you just need to minimize on the start. Ofc this costs latency