opencomputeproject / Time-Appliance-Project

Develop an end-to-end hypothetical reference model, network architectures, precision time tools, performance objectives and the methods to distribute, operate, monitor time synchronization within data center and much more...
MIT License

wrong UTC Offset to TAI during startup with NEO-M9N #97

Closed JohnHay closed 10 months ago

JohnHay commented 1 year ago

I'm using the NEO-M9N board on a v8 card. I have tried both the closed-source (0x1A) and open-source (OS, 0x06) FPGA firmware. The M9N is configured for 115k baud and UBX output (NAV_TIME_UTC, NAV_TIME_LS, NAV_STATUS, MON_HW and NAV_SAT).

During the first few minutes after a cold boot the Tod SlaveUtcStatus register will often go to:

0x10133 (51s UTC to TAI)
0x10112 (18s UTC to TAI)

before settling on:

0x10125 (37s UTC to TAI)

It looks like the M9N will always/mostly report the currLs as 0x12 (18 decimal) even when the srcOfCurrLs is:

1 (Glonass)
2 (GPS)
4 (Beidou)
5 (Galileo)
7 (Configured)
255 (Unknown)

It looks like the code in TodSlave.vhd expects currLs to be different for different values of srcOfCurrLs, but that is not the case with the NEO-M9N firmware (protocol version 32.01).

It is only a "problem" during startup, but it can last for quite a few minutes. During that time the UTC_INFO_VALID bit (8) is set, so the clock is still set from the wrong offset.
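As a rough illustration of how the register values quoted above break down, here is a minimal Python sketch. It assumes the low byte of SlaveUtcStatus holds the offset in seconds and that bit 8 is UTC_INFO_VALID, which is consistent with the three values listed; the function name is made up for this example and the remaining bits are deliberately not interpreted.

```python
def decode_slave_utc_status(reg: int) -> tuple[int, bool]:
    """Split a SlaveUtcStatus value into (offset_seconds, utc_info_valid)."""
    offset_seconds = reg & 0xFF             # assumed: low byte is the UTC to TAI offset
    utc_info_valid = bool(reg & (1 << 8))   # bit 8: UTC_INFO_VALID
    return offset_seconds, utc_info_valid


for value in (0x10133, 0x10112, 0x10125):
    offset, valid = decode_slave_utc_status(value)
    print(f"0x{value:05X}: offset={offset}s valid={valid}")
```

All three values decode with the valid bit set, which is why the clock follows the transient 51 s and 18 s offsets.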

Attached is the capture of the NAV_TIME_LS messages during 2 cold starts. ubx-out-ls.txt

PS. Is there a reason the OS firmware does not also default to UBX and 115k baud, like the closed firmware? It defaults to NMEA and 19k baud.

Regards

John

JohnHay commented 1 year ago

Here is the output reading the TimeCard registers once a second during a coldstart. The lines have a timestamp of the card, the OS, and some of the status registers.

In this case the UTC-TAI went through 37,18,37,51,37.

tc4-tai.txt

JohnHay commented 12 months ago

Maybe I should ask it in another way.

Can anyone with a Ublox GNSS receiver other than an M9N (say a ZED-F9T) check if it ever has a value other than 18 in currLs of the UBX-NAV-TIMELS message while validCurrLs is set? The interesting period is during a cold boot, or whenever it has not locked onto the GPS constellation yet, so that srcOfCurrLs is set to something other than GPS(2).

I had a look at the documentation for both the M9N and F9T. For both in the UBX-NAV-TIMELS section (M9-SPG 3.15.19, and ZED-F9T-10B 3.15.21), the description of the CurrLs field is:

Current number of leap seconds since start of GPS time (Jan 6, 1980). It reflects how much GPS time is ahead of UTC time. Galileo number of leap seconds is the same as GPS. BeiDou number of leap seconds is 14 less than GPS. GLONASS follows UTC time, so no leap seconds.

The way I read it is that CurrLs = GPS - UTC. They say GPS and not GNSS or the one that is currently srcOfCurrLs. They then go on to describe the relationship of the other time bases, but that is extra information and they do not say those will ever be put in CurrLs.

Regards

John

lasselj commented 12 months ago

Hi John,

I would like to help you, and I am writing UBX code right now for the LEA-M8F and the MAX-M10S in addition to the NEO-M9N so I am ideally placed to do so, but can I ask first please:

In any event, if you want, I can test with the two modules also listed above. In that instance, can you describe in significant detail (the UBX instructions you issue for the NAV stuff) the exact steps to reproduce this problem please?

All the best,

Lasse

JohnHay commented 12 months ago

Hi Lasse,

For the TimeCard, the firmware on the FPGA implements a counter that is synchronized to TAI by the TOD (Time of Day) module, also on the FPGA. The TOD module listens to the GNSS receiver and parses the messages. The message used to find the TAI offset is the UBX-NAV-TIMELS message. To determine TAI, 3 fields in the message are used: validCurrLs, which is 1 when the other fields are valid; currLs, the number of leap seconds; and srcOfCurrLs, which indicates the source of the information.

The firmware code to implement this is in this block.

What I observed was that during the first 10-12 minutes after power on, the clock would jump around, as I explained in the 1st and 2nd messages. So I used ubxtool to look at the UBX-NAV-TIMELS messages and saw that the currLs field stayed at 18 the whole time, while srcOfCurrLs changed as the receiver searched the sky and found different GNSS constellations, until it finally synced to GPS. Time jumps happened even after the validCurrLs bit was set. The current offset between UTC and GPS time is 18 seconds. GPS time is fixed at 19 seconds behind TAI, so TAI to UTC is 37 seconds.
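The arithmetic in the paragraph above can be written down as a small Python sketch. It assumes currLs always means GPS minus UTC, whatever srcOfCurrLs says; the constant and function names are invented for illustration and are not taken from the firmware.

```python
GPS_TO_TAI_SECONDS = 19  # TAI - GPS, fixed since the GPS epoch (Jan 6, 1980)


def tai_minus_utc(curr_ls: int) -> int:
    """Return TAI - UTC in seconds, treating currLs as GPS - UTC."""
    return curr_ls + GPS_TO_TAI_SECONDS


print(tai_minus_utc(18))  # 18 s (GPS - UTC) + 19 s (TAI - GPS)
```

With currLs stuck at 18 this always gives 37, so any other value in the offset register has to come from how the source field is interpreted, not from the receiver's data.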

Looking at the code in the above link, I noticed that depending on the value of srcOfCurrLs, different values are added to currLs to calculate the TAI-UTC offset. But for the M9Ns I have, that is wrong because they always report 18 no matter what their srcOfCurrLs value is at that stage.

So what I'm trying to find out, is if the firmware was written with the wrong assumption or if different Ublox receivers/firmware differ in how they fill currLs in.

To test, you can disable the NMEA messages and enable UBX-NAV-TIMELS and then do a cold boot and capture the messages. With ubxtool I enabled it with -z CFG-MSGOUT-UBX_NAV_TIMELS_UART1,1.

Regards

John

JohnHay commented 12 months ago

@julianstj1 I see you committed test scripts recently that check for the TIMELS message. Any chance that you could capture the TIMELS messages for the first 10-12 minutes after the cold boot of one of the GNSS receivers?

While I focused on the startup where I saw the TAI jumps, one could also see it later if ever the GPS constellation is not available, while some of the other GNSS constellations are. While unlikely, it could occur because of jamming or whatever.

julianstj1 commented 12 months ago

Hi John, those test scripts were just simple manufacturing test scripts, to make sure modules are installed and all the connections are working. It's possible to edit the script to capture or watch for anything you particularly want.

JohnHay commented 11 months ago

Hi Julian, currently it looks like the TIMELS messages that M9N receivers send fill the currLs field differently from what the Xilinx firmware expects. What I'm trying to find out is how other ublox receivers do it, so that we can determine if the M9N is an exception or if the Xilinx firmware needs to change for all of them.

Unfortunately I only have M9Ns. For me the easiest was to force a cold boot of the receiver and log all the TIMELS messages and then look through the ones where validCurrLs is set. On the M9N receivers (with the SPG 4.04 firmware they came with) currLs is always 18 and it does not matter what srcOfCurrLs is set to.
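The filtering step described above could look roughly like the following Python sketch, which scans a raw UBX byte capture for NAV-TIMELS frames and keeps only those with validCurrLs set. It assumes the capture is a plain byte stream of UBX frames; the function names are made up for this example, while the frame layout (class 0x01, id 0x26, 24-byte payload, 8-bit Fletcher checksum) follows the UBX protocol.

```python
import struct
from typing import Iterator

TIMELS_HEADER = b"\xB5\x62\x01\x26\x18\x00"  # sync chars, class, id, length = 24
FRAME_LEN = 32                               # 6 header + 24 payload + 2 checksum


def ubx_checksum(data: bytes) -> bytes:
    """8-bit Fletcher checksum over class, id, length and payload."""
    ck_a = ck_b = 0
    for b in data:
        ck_a = (ck_a + b) & 0xFF
        ck_b = (ck_b + ck_a) & 0xFF
    return bytes((ck_a, ck_b))


def iter_valid_timels(stream: bytes) -> Iterator[tuple[int, int]]:
    """Yield (srcOfCurrLs, currLs) for each TIMELS frame with validCurrLs set."""
    pos = stream.find(TIMELS_HEADER)
    while pos != -1:
        frame = stream[pos:pos + FRAME_LEN]
        if len(frame) == FRAME_LEN and ubx_checksum(frame[2:30]) == frame[30:32]:
            src_of_curr_ls = frame[14]
            curr_ls = struct.unpack("b", frame[15:16])[0]  # currLs is a signed byte
            if frame[29] & 0x01:                            # bit 0: validCurrLs
                yield src_of_curr_ls, curr_ls
        pos = stream.find(TIMELS_HEADER, pos + 1)
```

Collecting the yielded pairs into a set makes it easy to see whether currLs ever differs between sources on a given receiver.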

To calculate TAI, the Xilinx firmware in this section adds different values depending on what srcOfCurrLs is. When using an M9N receiver, that makes the TAI offset register and the clock counter jump around.

Like I said in my 3rd message, my reading of the ublox documentation is that currLs is GPS time minus UTC time and not GNSS time minus UTC like the current Xilinx firmware implements.

thschaub commented 11 months ago

Hi John,

thanks for this good catch. The sentence in the documentation about CurrLs was a bit misleading. You are right, those other offsets never go into CurrLs. I quickly checked this with the modules (F9T and M9N) using only BeiDou.

09:08:34  0000  B5 62 01 26 18 00 A0 94 69 11 00 00 00 00 04 12  µb.&.. .i.......
          0010  04 00 6E 4A A1 FC 89 08 07 00 00 00 00 03 F7 F4  ..nJ¡ü........÷ô.

CurrLs is always 0x12 (18 s). We will provide a fix.
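For readers who want to check the dump above by hand, here is a minimal Python sketch that unpacks the 24-byte payload field by field. The field order and types follow the UBX-NAV-TIMELS layout in the u-blox interface description; the dictionary keys and function name are just chosen for this example.

```python
import struct

FRAME_HEX = (
    "B5 62 01 26 18 00 A0 94 69 11 00 00 00 00 04 12 "
    "04 00 6E 4A A1 FC 89 08 07 00 00 00 00 03 F7 F4"
)


def decode_timels(frame: bytes) -> dict:
    """Unpack the payload of a 32-byte UBX-NAV-TIMELS frame (little-endian)."""
    payload = frame[6:30]  # skip sync, class, id, length; drop the checksum
    (itow, version, _reserved0, src_of_curr_ls, curr_ls, src_of_ls_change,
     ls_change, time_to_ls_event, date_wn, date_dn, _reserved1,
     valid) = struct.unpack("<IB3sBbBbiHH3sB", payload)
    return {
        "iTOW_ms": itow,
        "version": version,
        "srcOfCurrLs": src_of_curr_ls,
        "currLs": curr_ls,
        "srcOfLsChange": src_of_ls_change,
        "lsChange": ls_change,
        "timeToLsEvent": time_to_ls_event,
        "dateOfLsGpsWn": date_wn,
        "dateOfLsGpsDn": date_dn,
        "validCurrLs": bool(valid & 0x01),
        "validTimeToLsEvent": bool(valid & 0x02),
    }


fields = decode_timels(bytes.fromhex(FRAME_HEX))
print(fields["srcOfCurrLs"], fields["currLs"], fields["validCurrLs"])
```

For this frame srcOfCurrLs decodes to 4 (BeiDou) while currLs is 18 with validCurrLs set, so the receiver reports the GPS leap second count even when only BeiDou is in use.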

Thanks again Thomas

JohnHay commented 11 months ago

Hi Thomas,

Thank you.

Regards

John

thschaub commented 10 months ago

You can find the fixed UTC correction for BeiDou in the following commit: https://github.com/opencomputeproject/Time-Appliance-Project/commit/a330ec4c33a1a760dd3cbf221079e561027fcce5

SOM version not fixed yet.

JohnHay commented 10 months ago

Hi @thschaub, is the else case really correct? It seems like currLs is always the number of leap seconds since the start of GPS time.

    if ((UtcOffsetInfo_DatReg.SrcOfCurLeapSecond = x"01") or (UtcOffsetInfo_DatReg.SrcOfCurLeapSecond = x"02") or (UtcOffsetInfo_DatReg.SrcOfCurLeapSecond = x"04") or (UtcOffsetInfo_DatReg.SrcOfCurLeapSecond = x"05")) then -- GPS, or derived from dif GPS to Glonass, or Beidou or Galileo
        UtcOffsetInfo_DatReg.CurrentUtcOffset <= std_logic_vector(unsigned(MsgData_DatReg) + 19); -- add the GPS-TAI offset to UTC-GPS offset
        UtcOffsetInfo_DatReg.CurrentTaiGnssOffset <= std_logic_vector(unsigned(MsgData_DatReg) + 19); -- add the GPS-TAI offset to UTC-GPS offset
    else -- else assume that the utc offset is provided directly
        UtcOffsetInfo_DatReg.CurrentUtcOffset <= MsgData_DatReg;
        UtcOffsetInfo_DatReg.CurrentTaiGnssOffset <= MsgData_DatReg;
    end if;

The M9Ns that I have report 18 (0x12) for currLs even when srcOfCurrLs is 7 (Configured) and 255 (Unknown) during a cold start. That is without me setting it, so it comes from the firmware settings. Their validCurrLs bit is not set when srcOfCurrLs is 7, but it was set when it was 255. So it does seem that the whole "if else end if" can be replaced with just adding 19 to the currLs value:

UtcOffsetInfo_DatReg.CurrentUtcOffset <= std_logic_vector(unsigned(MsgData_DatReg) + 19); -- add the GPS-TAI offset to UTC-GPS offset
UtcOffsetInfo_DatReg.CurrentTaiGnssOffset <= std_logic_vector(unsigned(MsgData_DatReg) + 19); -- add the GPS-TAI offset to UTC-GPS offset

You can look at the TIMELS bytes captured in the file ubx-out-ls.txt in my first post. On the lines, byte 14 is srcOfCurrLs, byte 15 is currLs, and byte 29 is where the valid flags are. The last 2 bytes are the checksum.

Or is there a case when that is not correct?

Regards

John

thschaub commented 10 months ago

Hi @JohnHay

Sorry, I didn't check the case srcOfCurrLs = 255, which seems to carry a valid leap second value according to what the M9N is reporting. 255 seems to be some kind of transient state, and I am not sure what the best approach is.

Byte 29 = 1 already indicates that currLs is valid, and it seems to be set in your case as well when srcOfCurrLs = 1.

thschaub commented 10 months ago

Reading the datasheet, I have the general feeling that all values except 0 are valid sources for CurrLs. So maybe you are right with your comment: we should not be picky and should take them all to update the offsets. The valid flag is handled independently anyway.

JohnHay commented 10 months ago

That is also the conclusion I came to after reading the datasheets and looking at the output of the M9Ns I have.

thschaub commented 10 months ago

Now 255 is also taken as a valid source. Additionally, if the source is none of the matching ones, the last valid value is retained. The valid flag is still handled as before, so it is taken directly from the receiver. https://github.com/opencomputeproject/Time-Appliance-Project/commit/fbf97469ea892694f82ad5cbba23b8f2b5b34aa2
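The behaviour described above can be modelled in a few lines of Python. This is only a sketch of the described logic, not a translation of the committed VHDL: the set of accepted sources is assumed from the snippet quoted earlier in the thread (1, 2, 4, 5) plus 255, and all names are hypothetical.

```python
from typing import Optional

ACCEPTED_SOURCES = {1, 2, 4, 5, 255}  # assumed: GPS/Glonass diff, GPS, BeiDou, Galileo, Unknown
GPS_TO_TAI_SECONDS = 19               # TAI - GPS


def next_tai_offset(last_offset: Optional[int], src_of_curr_ls: int,
                    curr_ls: int) -> Optional[int]:
    """Return the TAI-UTC offset after one TIMELS message.

    An accepted source updates the offset to currLs + 19; any other
    source keeps the last value (None if nothing was accepted yet).
    """
    if src_of_curr_ls in ACCEPTED_SOURCES:
        return curr_ls + GPS_TO_TAI_SECONDS
    return last_offset


offset = None
for src in (7, 1, 255, 5, 2):  # example cold-start sequence, currLs fixed at 18
    offset = next_tai_offset(offset, src, 18)
    print(src, offset)
```

In this model the offset goes from unset straight to 37 and stays there, because no source adds a per-constellation correction any more.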

JohnHay commented 10 months ago

Thanks, so far it looks good.

JohnHay commented 10 months ago

It took some time and a few tries before I got srcOfCurrLs to go through: Configured(7), Glonass(1), Unknown(255), Galileo(5) and then GPS(2). The validCurrLs flag was set from Glonass onwards and there were no jumps in time.

I'm happy, thank you!