jetperch / pyjoulescope_ui

Joulescope graphical user interface
https://www.joulescope.com
Apache License 2.0
89 stars 26 forks source link

#Time marker falls back after 5450.91sec log time #291

Closed dschiedsch closed 20 hours ago

dschiedsch commented 5 days ago

Joulescope model

JS220

UI version

1.2.1

What OS are you seeing the problem on?

Windows 10

What happened?

While logging with 50Hz i encountered that the timestamp in my logs falls back after 5450.91sec or after line 272527

here is the log starting from line 272523 5450.83,0.00843953,4.84234,0.0408671,498.404,2409.76 5450.85,0.00844139,4.84235,0.0408762,498.404,2409.76 5450.87,0.00844158,4.84234,0.040877,498.404,2409.76 5450.89,0.00844035,4.84239,0.0408715,498.404,2409.76 5450.91,0.00843744,4.84236,0.0408571,498.404,2409.77 3303.44,0.0084382,4.84234,0.0408606,498.404,2409.77 3303.46,0.00843958,4.84235,0.0408674,498.405,2409.77 3303.48,0.00843948,4.84233,0.0408667,498.405,2409.77 3303.50,0.00844342,4.84235,0.040886,498.405,2409.77 3303.52,0.00844488,4.84241,0.0408936,498.405,2409.77

What was expected?

log time should increase continously

How to reproduce?

50Hz logging rate after 5450sec of logging

Extra information

No response

mliberty1 commented 5 days ago

I will set up a test to see if I can duplicate this. Does this happen in a repeatable manner on your computer?

mliberty1 commented 5 days ago

I just collected a capture for 18096 seconds, and it looks good:

plot

Here is the analyzer.py source code:

#!/usr/bin/env python3
# Copyright 2024 Jetperch LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import argparse
import matplotlib.pyplot as plt
import numpy as np
import os

def get_parser():
    p = argparse.ArgumentParser(
        description='Analyze Joulescope statistics capture for "jumps".')
    p.add_argument('--plot',
                   action='store_true',
                   help='display a plot of the time.')
    p.add_argument('path',
                   help='The path to the CSV statistics file.')
    return p

def run(args=None):
    parser = get_parser()
    args = parser.parse_args(args=args)
    d = np.loadtxt(args.path, delimiter=',')
    t = d[:, 0]
    dt = np.diff(t)
    p = np.median(dt[:100])
    k = np.where(np.logical_or(dt <= 0, dt > (p * 2.1)))[0]

    if len(k):
        for i in k:
            print(f'{i} {dt[i]}')
    else:
        print('All good!')

    if args.plot:
        f = plt.figure(figsize=[6.0, 8.0], layout='tight')
        ax1 = f.add_subplot(2, 1, 1)
        ax1.set_title('Time')
        ax1.set_xlabel('Sample')
        ax1.set_ylabel('Time (s)')
        ax1.grid(True)
        ax1.plot(t)

        ax2 = f.add_subplot(2, 1, 2, sharex=ax1)
        ax2.set_title('Time Delta')
        ax2.set_xlabel('Sample')
        ax2.set_ylabel('Δt (s)')
        ax2.grid(True)
        ax2.plot(dt)

        plt.show()

if __name__ == '__main__':
    run()

which I ran with:

python .\analyze.py 50hz_01.csv --plot

I will try again with an overnight statistics recording.

Are you able to duplicate this issue on your computer? If so, can you use HelpReport issue in the Joulescope UI to report this issue? The logs may have some hints.

dschiedsch commented 5 days ago

Hi I will try to rerun the capture and see if I can recreate it

mliberty1 commented 4 days ago

The overnight run was good too: 50hz_02

I do see that the time sync algorithm made a pretty major time correction about 9600 seconds in. It then undid this correction at 11000 seconds in. A more normal correction occurred at 43000 seconds in. I need to investigate what happened here. We do expect up ±20 ppm (±1.7 seconds / day) of drift that needs to be corrected. However, correcting one way and then back is not desirable, but this could also have been the PC's NTP time correction...

I will set up another capture to run over the weekend.

mliberty1 commented 3 days ago

I ran multiple 50 Hz statistics captures last night, and I caught one issue:

image

I also edited the script in the previous post to correctly process this failed case.

The logs show an uneventful night from when I started CSV statistics logging to when I ended it:

INFO:2024-11-22 16:56:19,638:statistics_record.py:43:joulescope_ui.widgets.statistics_record.statistics_record.obj:JLS record registry/JS220-000415/events/statistics/!data to C:\Users\mliberty\Documents\joulescope\20241122_215619-JS220-000415.csv
INFO:2024-11-22 16:56:19,638:paths.py:122:joulescope_ui.paths:mru_save C:\Users\mliberty\Documents\joulescope\20241122_215619-JS220-000415.csv
INFO:2024-11-22 16:56:19,638:pubsub.py:1092:joulescope_ui.pubsub:register(unique_id=StatisticsRecord:0000000b, obj=<joulescope_ui.widgets.statistics_record.statistics_record.StatisticsRecord object at 0x000001C3159D3A50>) start
INFO:2024-11-22 16:56:19,638:pubsub.py:905:joulescope_ui.pubsub:Missing metadata for registry/app/children
INFO:2024-11-22 16:56:19,638:pubsub.py:1143:joulescope_ui.pubsub:register(unique_id=StatisticsRecord:0000000b) done 
INFO:2024-11-22 21:58:18,141:main.py:458:joulescope_ui.main:on_action_key_press: (16777216, <KeyboardModifier.NoModifier: 0>)
INFO:2024-11-23 07:45:37,604:statistics_record.py:131:joulescope_ui.widgets.statistics_record.statistics_record.cls:stop
INFO:2024-11-23 07:45:37,604:statistics_record.py:107:joulescope_ui.widgets.statistics_record.statistics_record.obj:stop

The biggest lead here is the time alignment algorithm. It's possible that the new time alignment algorithm in UI 1.2.1 with FW 1.3.0 and FPGA 1.3.0 is to blame. However, I have an older, still unsolved report from September with strange timestamps in a JLS file. It could be that this is simply a relatively rare event.

We will investigate. Given the difficulty in duplicating this issue, any fix will likely take some time.

mliberty1 commented 1 day ago

I believe that we have identified the problem. It is a somewhat rare event in the time synchronization implementation. We just released Joulescope UI 1.2.2, which includes the FPGA 1.3.1 gateware that fixes this issue. Can you upgrade and see if the issue persists? Also, be sure to update your JS220 when prompted with the latest code.

We found that the time synchronization code was not performing a time sync request roughly every 10 seconds as designed. Due to some bad conditional logic in the code, it was updating roughly at 10, 20, or 30 seconds. The calculations are performed using fixed-point math. When the duration over a combined 42 updates was too long, this was overflowing the calculation. FPGA 1.3.1 fixes the conditional logic to always update every 10 seconds as intended and keep the calculation within the intended range.

dschiedsch commented 1 day ago

Thx for looking into this. I will only be able to test next week.

mliberty1 commented 20 hours ago

We performed overnight testing on several computers, and analyze.py script above indicated that they are all good. I am going to close this as fixed in UI 1.2.2 with JS220 FPGA 1.3.1.