godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License
89.11k stars 20.2k forks source link

Inaccurate timing behavior on LineageOS #31837

Closed xuvatilavv closed 1 year ago

xuvatilavv commented 5 years ago

Godot version: b17b51d and 3.1.1 Stable

OS/device including version: LineageOS 16 on Essential PH-1 (mata)

Issue description: Exported .apk runs with a very fast time rate on my phone, including Timers and incremental behavior called from _physics_process. A simple example with a Timer:

What happened: A Timer started with a wait_time of 1.0 seconds will emit its timeout signal 12 or more times per second if left with the default process_mode of Timer.TIMER_PROCESS_IDLE. If process_mode is set to Timer.TIMER_PROCESS_PHYSICS, the timeout signal will be emitted closer to 6 times per second. What was expected: A Timer started with a timeout of 1.0 seconds will emit its "timeout" signal once every second.

Note that the issue occurs when calling methods from _physics_process and with any other wait_time too, Timer just provides a convenient way to demonstrate the problem with exact expected values.

Additionally, I checked get_physics_process_delta_time() and get_process_delta_time() whenever this prints. Physics delta was reported as a consistent 0.016667 but the frame time was reported as varying very large positive and negative numbers.

Steps to reproduce: Create a scene, add a Node2D and attach a script. In that script, add a Timer, connect the timeout signal to a method that prints OS.get_system_time_secs(), and add the timer as a child. Call start(1.0) on the timer. Export to phone and observe that the system time is printed multiple times for each second.

Minimal reproduction project: bugtest.zip

The attached project is as described above, using a Timer as reference. It also includes a RichTextLabel display to show the reported information directly on the phone screen.

lawnjelly commented 5 years ago

Interesting. Without having the hardware in questions, I'd guess it's most likely a problem with duff values coming from the OS timer system calls. I don't think there is a check for negative deltas currently, and using unsigned math in timers can give crazy results. I was aware of this as a potential issue.

Having recently worked on the timing code, I would suggest the best place for this check might be in the OS functions, rather than in main.cpp or main_timer_sync.cpp.

Probably all calls to get_ticks_usec() should call a wrapper in the OS base class which stores the previous time, and prevents the function returning a usec earlier than that reported by a previous call. I'll have a see what I can do.

xuvatilavv commented 5 years ago

I've compiled the commit you've made and tested it on my device with the same project. I've still got the problem, but one of two things will happen. Either I get the same accelerated time as above but now all the frame times read as -139812 (before the number would be different about every second), or I get no response at all (the app loads but never prints, whether from _init, _ready, _process, or _physics_process).

lawnjelly commented 5 years ago

I've compiled the commit you've made and tested it on my device with the same project. I've still got the problem, but one of two things will happen. Either I get the same accelerated time as above but now all the frame times read as -139812 (before the number would be different about every second), or I get no response at all (the app loads but never prints, whether from _init, _ready, _process, or _physics_process).

That's actually very useful info. It suggests your OS might be giving some wildly out of sync results on different calls to the OS time function, in which case I might modify the PR to use the alternative method I mentioned there (resyncing when it encounters a negative delta). I was hoping that with multicores slightly out of sync it would quickly 'catch up' but that may not be happening on your system (in the case where you get no response at all).

It is very probably your timers are giving some very buggy values, but we should be able to deal with them! :smile: Don't panic I'll tweak the PR tomorrow. I may try and figure out some way of getting some figures so we can see exactly what is happening on your system.

lawnjelly commented 5 years ago

Another possible source of error - I've just noticed that OS_Unix::get_ticks_raw_usec() isn't checking the return value of clock_gettime. It could be failing, maybe that assumption that it will work is incorrect:

uint64_t OS_Unix::get_ticks_raw_usec() const {

#if defined(__APPLE__)
    uint64_t longtime = mach_absolute_time() * _clock_scale;
#else
    // Unchecked return. Static analyzers might complain.
    // If _setup_clock() succeeded, we assume clock_gettime() works.
    struct timespec tv_now = { 0, 0 };
    clock_gettime(GODOT_CLOCK, &tv_now);
    uint64_t longtime = ((uint64_t)tv_now.tv_nsec / 1000L) + (uint64_t)tv_now.tv_sec * 1000000L;
#endif
    longtime -= _clock_start;

    return longtime;
}
xuvatilavv commented 5 years ago

Great, thanks! Let me know if I can help provide more information on the issue.

If it makes it any different, it's not just something that occurs on Timers. Here's a minimal project where I count up to 1.0 with delta from _physics_process() and it has the same issue: timingbug2.zip The delta within that method and the one returned from get_physics_process_delta_time() both report as 0.16667 still.

For now I'm using Engine.time_scale = 0.2 as a workaround, and it works well enough that I can keep making my project.

lawnjelly commented 5 years ago

The physics reported time will always be the same because it is based on the physics fps in the settings, it is really just a symptom of the frame delta timings being haywire (even if the frame timing is wildly huge, a max of 8 physics ticks will run on a frame).

However this is all good it's made me notice even more potential bugs in the existing system. So far:

  1. On each subsequent call to get_ticks_usec() time may move backward giving overflow due to unsigned math (current - prev gives a very large number with unsigned math).
  2. In unix time the subsequent calls to get_ticks_usec() are based on a clock which may be in the future, again unsigned math overflow
  3. clock_gettime may fail .. there was no accounting for this
  4. There may be bugs in the API func giving spurious values, maybe our robust timer should deal with this possibility if possible?

I'm going to try to account for these as best we can, and will try and make some more debugging info in the PR so you can try out and we can try and pin down what is happening on your system. It will probably need adb logcat output to get the debugging messages (or remote debug from the Godot IDE if this works for you, I've never got this working).

lawnjelly commented 5 years ago

Just to keep this issue up to date:

After extensive testing trying to rule out problems on Godot side we were running out of ideas, clock_gettime seems to be reporting time significantly in the past and in the future in a cyclical fashion with both MONOTONIC_RAW and MONOTONIC (rather than the small differences normally expected from e.g. running on a different core). Although I can't rule out a bug in something on our side, we can't rule out the possibility that there could be a Lineage OS bug in clock_gettime() on this particular hardware, notable that the cores are clocked to different frequencies:

https://www.gsmarena.com/_essential_ph_1-8710.php

CPU | Octa-core (4x2.45 GHz Kryo & 4x1.9 GHz Kryo)

Faless commented 5 years ago

It's worth noticing that OS.get_system_time_secs is NOT monotonic and it is expected to be subject to NTP adjustment. OS.get_ticks_msec on the other end, should be monotonic.

akien-mga commented 3 years ago

https://github.com/godotengine/godot/pull/31863#issuecomment-537865475 hints that this might be a bug of the specific LineageOS ROM used.

Can you still reproduce the issue in 3.2.3-stable @xuvatilavv, and can anyone else reproduce the same issue on Android?

xuvatilavv commented 3 years ago

Had to dig out the old phone😄I replaced it a while back.

Testing on Godot 3.2.3-stable and LineageOS 16.0-20200413-NIGHTLY-mata, I've still got the same issue as before. The next update for that device is to LineageOS 17.1 and requires a full reflash, but if it'll help I can test on that version as well when I've got the time to set it up.

mhatz-2 commented 3 years ago

Hello there,

I've encountered a similar issue as @xuvatilavv a few days ago (my original question on Godot Q&A).

Godot version: 3.2.3 Linux x64 stable standard version 3.2.4 beta 2 Linux x64

OS/device target LineageOS 17.1 on tablet Samsung Galaxy A10 (SM-T580)

Issue description: When exporting project to my device, animations and timers get processed at crazy times. A log of delta values from _process() method shows values like : -139812 -139812 ... 1957340 1957340 ... -2236964 -2236964 ... 31317468 31317468 ... -35791396 -35791396 ...

I didn't found any solution for now nor I can say if it's a Godot or a LineageOS issue (or a sum of both). Tell me if I can help in any way, I'll do my best.

akien-mga commented 3 years ago

Since the two reports are from LineageOS and none from Android / other vendored AOSP build, I guess the bug lies in the LineageOS ROM. Maybe we can still work it around but as much as I like LineageOS myself, this doesn't make it very high priority since it only affects a fraction of users.

mhatz-2 commented 3 years ago

Do you have some hints about what to look for on the LineageOS side? What could Godot need to work properly?

lawnjelly commented 3 years ago

You could try this commit that I made to look at fixing it originally: https://github.com/lawnjelly/godot/commit/829322246761ee055d3cb303f58a5fe1bbb1dc84

However it may end up being more complex to fix depending on how bad the outputs are from LineageOS.

steowens commented 3 years ago

`

var FIRE_RATE = 4

func _physics_process(delta):
    # do other stuff
    if fire and cooldown <= 0:
         fire_bullet()
         cooldown = 1/FIRE_RATE
    elif cooldown > 0:
     cooldown -= delta

When I run in Godot Debugger the gun fires at about 4 times per second, however exporting the code to a linux project and running, the gun fires like a vulcan cannon, WAY faster than 4 times per second. No idea why.

var FIRE_RATE = 4

func _physics_process(delta):
    # do other stuff
    if fire and cooldown <= 0:
         fire_bullet()
         cooldown = 1.0 / FIRE_RATE
    elif cooldown > 0:
     cooldown -= delta

` However changing the 1 to a 1.0 seems to fix the issue. Edited: Slaps forhead, because math. Integers round down on linux and apparently the type conversions come out different in the compiled code than when run in the debugger. In the debugger 1/4 = 0.25, but in linux 1/4 = 0.

Calinou commented 3 years ago

@steowens Is this solved if you use var FIRE_RATE: int = 4 or var FIRE_RATE := 4? See https://github.com/godotengine/godot/issues/42224.

Calinou commented 1 year ago

@xuvatilavv Can you (or anyone else) still reproduce this bug in Godot 3.5.2 or any later release?

Calinou commented 1 year ago

Closing due to lack of response. Please comment if you can still reproduce this bug on the latest Godot version.