Avnu / OpenAvnu

OpenAvnu - an Avnu sponsored repository for Time Sensitive Network (TSN and AVB) technology

gptp automotive profile resetting to default values for ml_phoffset and ml_freqoffset in the middle of operation #737

Open vasuhere opened 6 years ago

vasuhere commented 6 years ago

Hi,

I am using simple talker and simple listener with gptp in automotive profile. I am forcing gptp on talker to be the master (-V -T -GM) and forcing gptp on listener to be slave (-V -L).

Both with and without automotive test mode (-E) enabled, I observed that after some time the gptp slave resets ml_phoffset to "0" and ml_freqoffset to "1.000000".

Can someone please help me understand why such resetting is happening?

BR, Srinivas.

vasuhere commented 6 years ago

This reset of the master data on the slave happens on "SYNC_INTERVAL_TIMEOUT_EXPIRES", and a sync message is then sent from the slave. Isn't SYNC supposed to go from master to slave? Why is a SYNC message being sent from the slave?

pinealservo commented 6 years ago

That sounds like it could be a bug, but it would help narrow down what might be causing it if you could provide some more information:

  1. Which branch and commit did you build?
  2. What devices are you using as endpoints? Is there a switch involved, and if so what kind is it?
  3. Do you have some log output captured?

vasuhere commented 6 years ago

Hi,

  1. I have taken the code from the master branch.
  2. I am using Intel I210 NIC on PC1 as talker (gptp master, automotive profile) and Intel 82579LM NIC on PC2 as listener (gptp slave, automotive profile). PC1 and PC2 are directly connected using an ethernet cable.
  3. Please find attached the logs taken on the gptp slave: gptp_slave_log.txt

PawelModrzejewski commented 6 years ago

Could you also check (wireshark, tcpdump) if your slave sends out Sync/Follow_up messages?

Update: now I see you've already done it: "... and a sync message is being sent from slave."

vasuhere commented 6 years ago

From the attached log, the below signals are received at slave.

STATUS : GPTP [14:27:02:058] Signalling Link Delay Interval: -128
STATUS : GPTP [14:27:02:058] Signalling Sync Interval: 0
STATUS : GPTP [14:27:02:058] Signalling Announce Interval: -128

Once this message is received at the slave, the code in PTPMessageSignalling::ProcessMessage (ptp_message.cpp) seems to set syncInterval as well, without checking whether the port is in master or slave mode.

tlv.getTimeSyncInterval() assigns "0" to "timeSyncInterval", which leads to "startSyncIntervalTimer(waitTime)" being called with a "waitTime" of 1 second. The logs confirm this: 1 second later, the SYNC_INTERVAL_TIMEOUT_EXPIRES event is triggered.
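The interval fields in the signalling message are log2 values in seconds, which is why a sync interval of "0" turns into a 1 second wait. A minimal sketch of that conversion follows; the function and constant names are hypothetical and not taken from gptp, and the special values (-128 for "no change", 126 for "initial", 127 for "stop sending") reflect my understanding of IEEE 802.1AS rather than anything shown in the log.

```cpp
#include <cstdint>

// Special values of the signalling interval fields, as I understand
// IEEE 802.1AS (assumption; names are hypothetical, not gptp identifiers).
constexpr int8_t kIntervalNoChange = -128; // keep the current interval
constexpr int8_t kIntervalInitial  = 126;  // revert to the initial interval
constexpr int8_t kIntervalNoSend   = 127;  // stop sending

// True when the field carries an ordinary log2 interval, not a special value.
constexpr bool isPlainInterval(int8_t logInterval) {
    return logInterval != kIntervalNoChange &&
           logInterval != kIntervalInitial &&
           logInterval != kIntervalNoSend;
}

// Convert a log2 interval in seconds to nanoseconds: 0 -> 1 s, -3 -> 125 ms.
// Only meaningful for plain intervals small enough to fit in 64 bits.
constexpr uint64_t logIntervalToNs(int8_t logInterval) {
    return logInterval >= 0
        ? (1000000000ULL << logInterval)
        : (1000000000ULL >> -logInterval);
}
```

Read against the log above, only the sync interval (0) is a plain value; the link delay and announce intervals (-128) would be treated as "leave unchanged" under this reading.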

pinealservo commented 6 years ago

I think you're right; because the signalling message is received, it has set the SYNC interval to 1 second and it doesn't check if it's the GM before setting the SYNC timer. And when the SYNC timer expires, it doesn't check whether it's the GM before sending a SYNC.

Apparently, both when we send and when we receive a SYNC, we call clock->calcLocalSystemClockRateDifference and clock->setMasterOffset; calcLocalSystemClockRateDifference is what prints the STATUS messages with the clock offset, rate ratio, etc., which I assume are also being written to the shared memory area.

So I would guess, based on the debug output, that you'll see the shared memory stuff oscillate between what it should be holding as a gPTP slave (if it just received a SYNC) and what it would hold as a gPTP grandmaster (if it just sent a SYNC).

I think that a check in the handler for signalling messages before setting the SYNC interval timer would fix the problem. If you would like to try fixing it, that would be great; otherwise we'll get something together soon.

vasuhere commented 6 years ago

Thanks a lot @pinealservo for the analysis and suggesting a fix.

I have made the fix below and it seems to work fine. I wrapped the whole logic for setting the sync interval in PTPMessageSignalling::ProcessMessage inside

if(PTP_MASTER == port->getPortState()) {
}
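As a rough, self-contained illustration of that guard (not the actual gptp code): PortSketch, PortStateSketch and applySignalledSyncInterval are hypothetical stand-ins, and only PTP_MASTER and the idea of checking the port state come from this thread.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the gptp port state and port object.
enum PortStateSketch { PTP_MASTER, PTP_SLAVE };

// Assumed "do not change" marker for a signalled interval.
constexpr int8_t kSyncIntervalNoChange = -128;

struct PortSketch {
    PortStateSketch state;
    int8_t syncInterval;    // log2 seconds
    bool syncTimerRunning;  // models whether the sync interval timer was started
};

// Returns the port as it looks after handling a signalled sync interval.
// The interval is applied, and the timer started, only on a master port;
// a slave port is returned unchanged.
constexpr PortSketch applySignalledSyncInterval(PortSketch port, int8_t requested) {
    return (port.state == PTP_MASTER && requested != kSyncIntervalNoChange)
        ? PortSketch{port.state, requested, true}
        : port;
}
```

With this shape, a slave that receives "Signalling Sync Interval: 0" keeps its previous interval and never arms the sync timer, so SYNC_INTERVAL_TIMEOUT_EXPIRES would not fire on it.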

One more issue I observed: the signals in the log above are supposed to go from a slave to the master, but here the master is sending them to the slave. In ether_port.cpp, the logic for sending these signals seems fine, since they are only sent if the port is not GM (if(!isGM)).

But the issue here is that port->isGM is not being initialized to true. Class EtherPort inherits from class CommonPort, and there is an isGM variable in both the EtherPort and CommonPort classes. The isGM variable inside CommonPort is set during construction, but the one in EtherPort is not.

This looks like a problem. I think EtherPort is meant to use the isGM variable from the CommonPort class. As a temporary fix I removed the isGM variable from the EtherPort class. This seems to work fine.
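The effect described here is ordinary C++ member hiding: a data member redeclared in a derived class is a separate variable that hides the base one. A minimal sketch, using hypothetical stand-in classes rather than the real CommonPort and EtherPort, and modelling the never-set derived member as false so the example has defined behaviour:

```cpp
// Hypothetical stand-in for the base class that sets isGM in its constructor.
struct CommonPortSketch {
    bool isGM;
    constexpr explicit CommonPortSketch(bool gm) : isGM(gm) {}
};

// Redeclares isGM, which hides CommonPortSketch::isGM inside this class.
struct ShadowingPortSketch : CommonPortSketch {
    bool isGM;  // separate variable; never receives the constructor argument
    constexpr explicit ShadowingPortSketch(bool gm)
        : CommonPortSketch(gm), isGM(false) {}
    // Mirrors the if(!isGM) check: reads the derived member, not the base one.
    constexpr bool sendsSignalling() const { return !isGM; }
};

// Same class with the duplicate member removed, as in the temporary fix.
struct FixedPortSketch : CommonPortSketch {
    constexpr explicit FixedPortSketch(bool gm) : CommonPortSketch(gm) {}
    // Now reads the inherited member that the base constructor set.
    constexpr bool sendsSignalling() const { return !isGM; }
};
```

In this model, ShadowingPortSketch(true).sendsSignalling() is true even though the port was constructed as GM, while FixedPortSketch(true).sendsSignalling() is false, which matches the behaviour change reported after removing the duplicate variable.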