wowczarek / ptpd

Wojciech's ptpd workspace

Changing active slave in bond causes high cpu usage #10

Closed: mszicsu closed this issue 6 years ago

mszicsu commented 6 years ago

Hi Wojciech,

Please take a look when you get a chance. I tested this fork with and without HW timestamping support. HW works fine, but without HW, when I change the active slave in bond0, ptpd gets stuck using high CPU on the selected core. OS: Debian 9.2, kernel: 4.9.0-4-amd64

```
cat /proc/net/bonding/bond0

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: pri0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: pri0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 24:b6:fd:87:2d:1f
Slave queue ID: 0

Slave Interface: sec0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 24:b6:fd:87:2d:20
Slave queue ID: 0
```

```
2017-12-04 05:00:19.124103 ptpd[2970].bond0 (debug1) (slv) handle: something
2017-12-04 05:00:19.124115 ptpd[2970].bond0 (debug1) (slv) netRecvEvent: Flushed errqueue
2017-12-04 05:00:19.126336 ptpd[2970].bond0 (debug1) (slv) Error: message shorter than header length
2017-12-04 05:00:19.126349 ptpd[2970].bond0 (debug1) (slv) ==> DelayResp message received, sequence 309
2017-12-04 05:00:19.126359 ptpd[2970].bond0 (debug1) (slv) ==> Handle DelayResp (309)
2017-12-04 05:00:19.126370 ptpd[2970].bond0 (debug1) (slv) Ignored DelayResp sequence 309 - wasn't waiting for one
2017-12-04 05:00:19.142836 ptpd[2970].bond0 (debug1) (slv) handle: something
2017-12-04 05:00:19.142850 ptpd[2970].bond0 (debug1) (slv) ==> Sync message received, sequence 26307
2017-12-04 05:00:19.142862 ptpd[2970].bond0 (debug1) (slv) UpdateOffset: max delay hit: 0
2017-12-04 05:00:19.157694 ptpd[2970].bond0 (debug1) (slv) handle: something
2017-12-04 05:00:19.157710 ptpd[2970].bond0 (debug1) (slv) Ignored message Sync received from 0 domain
2017-12-04 05:00:19.174081 ptpd[2970].bond0 (debug1) (slv) handle: something
2017-12-04 05:00:19.174097 ptpd[2970].bond0 (debug1) (slv) ==> Sync message received, sequence 26308
2017-12-04 05:00:19.174109 ptpd[2970].bond0 (debug1) (slv) UpdateOffset: max delay hit: 0
2017-12-04 05:00:19.174120 ptpd[2970].bond0 (debug1) (slv) tuning outlier filter: delayMS
2017-12-04 05:00:19.174129 ptpd[2970].bond0 (debug1) (slv) delayMS filter autotune: counter 10, samples 10, outliers, 0 percentage 0, new threshold 1.50
2017-12-04 05:00:19.189049 ptpd[2970].bond0 (debug1) (slv) handle: something
2017-12-04 05:00:19.189065 ptpd[2970].bond0 (debug1) (slv) Ignored message Sync received from 0 domain
2017-12-04 05:00:19.205329 ptpd[2970].bond0 (debug1) (slv) handle: something
2017-12-04 05:00:19.205343 ptpd[2970].bond0 (debug1) (slv) ==> Sync message received, sequence 26309
2017-12-04 05:00:19.205355 ptpd[2970].bond0 (debug1) (slv) UpdateOffset: max delay hit: 0
2017-12-04 05:00:19.209316 ptpd[2970].bond0 (debug1) (slv) handle: something
2017-12-04 05:00:19.209332 ptpd[2970].bond0 (debug1) (slv) ==> DelayReq message received, sequence 49305
2017-12-04 05:00:19.209342 ptpd[2970].bond0 (debug1) (slv) delayReq message received :
2017-12-04 05:00:19.209352 ptpd[2970].bond0 (debug1) (slv) ==> DelayResp message received, sequence 49305
2017-12-04 05:00:19.209362 ptpd[2970].bond0 (debug1) (slv) HandledelayResp : delayResp doesn't match with the delayReq.
2017-12-04 05:00:19.220408 ptpd[2970].bond0 (debug1) (slv) handle: something
2017-12-04 05:00:19.220423 ptpd[2970].bond0 (debug1) (slv) Ignored message Sync received from 0 domain
```

```
echo 'sec0' > /sys/class/net/bond0/bonding/active_slave
```

```
2017-12-04 05:00:19.227626 ptpd[2970].bond0 (debug1) (slv) handle: something
2017-12-04 05:00:19.227639 ptpd[2970].bond0 (debug1) (slv) Error: message shorter than header length
2017-12-04 05:00:19.227651 ptpd[2970].bond0 (debug1) (slv) handle: something
2017-12-04 05:00:19.227661 ptpd[2970].bond0 (debug1) (slv) Error: message shorter than header length
2017-12-04 05:00:19.227674 ptpd[2970].bond0 (debug1) (slv) handle: something
2017-12-04 05:00:19.227690 ptpd[2970].bond0 (debug1) (slv) Error: message shorter than header length
2017-12-04 05:00:19.227702 ptpd[2970].bond0 (debug1) (slv) handle: something
2017-12-04 05:00:19.227712 ptpd[2970].bond0 (debug1) (slv) Error: message shorter than header length
2017-12-04 05:00:19.227724 ptpd[2970].bond0 (debug1) (slv) handle: something
2017-12-04 05:00:19.227734 ptpd[2970].bond0 (debug1) (slv) Error: message shorter than header length
2017-12-04 05:00:19.227745 ptpd[2970].bond0 (debug1) (slv) handle: something
2017-12-04 05:00:19.227755 ptpd[2970].bond0 (debug1) (slv) Error: message shorter than header length
2017-12-04 05:00:19.227766 ptpd[2970].bond0 (debug1) (slv) handle: something
2017-12-04 05:00:19.227776 ptpd[2970].bond0 (debug1) (slv) Error: message shorter than header length
2017-12-04 05:00:19.227787 ptpd[2970].bond0 (debug1) (slv) handle: something
2017-12-04 05:00:19.227797 ptpd[2970].bond0 (debug1) (slv) Error: message shorter than header length
2017-12-04 05:00:19.227808 ptpd[2970].bond0 (debug1) (slv) handle: something
2017-12-04 05:00:19.227818 ptpd[2970].bond0 (debug1) (slv) Error: message shorter than header length
2017-12-04 05:00:19.227829 ptpd[2970].bond0 (debug1) (slv) handle: something
2017-12-04 05:00:19.227839 ptpd[2970].bond0 (debug1) (slv) Error: message shorter than header length
2017-12-04 05:00:19.227850 ptpd[2970].bond0 (debug1) (slv) handle: something
2017-12-04 05:00:19.227860 ptpd[2970].bond0 (debug1) (slv) Error: message shorter than header length
2017-12-04 05:00:19.227871 ptpd[2970].bond0 (debug1) (slv) handle: something
2017-12-04 05:00:19.227881 ptpd[2970].bond0 (debug1) (slv) Error: message shorter than header length
```

wowczarek commented 6 years ago

Hi,

This could be related to Linux software TX timestamping, specifically a TX timestamp being stuck in the socket error queue. The last commit from today is a potential fix.
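For illustration, this is roughly what flushing the error queue involves (a minimal sketch for a Linux UDP socket, not ptpd's actual code): software TX timestamps requested via SO_TIMESTAMPING are delivered on the socket's error queue, and if an entry is never read off that queue, a poll()/select() loop can keep waking up on the same descriptor, which would be consistent with the busy-looping seen above.

```c
/* Minimal sketch: drain a socket's error queue, where Linux delivers
 * SO_TIMESTAMPING TX timestamps. Hypothetical helper, not ptpd code. */
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

static void drainErrQueue(int fd)
{
    char data[512];
    char control[512];

    for (;;) {
        struct iovec iov = { .iov_base = data, .iov_len = sizeof(data) };
        struct msghdr msg;

        memset(&msg, 0, sizeof(msg));
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = control;
        msg.msg_controllen = sizeof(control);

        /* MSG_ERRQUEUE reads only from the error queue;
         * MSG_DONTWAIT makes recvmsg() fail with EAGAIN once it is empty. */
        if (recvmsg(fd, &msg, MSG_ERRQUEUE | MSG_DONTWAIT) < 0)
            break;
    }
}
```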

You could alternatively force it to use the standard socket transport, since you are using software timestamping:

```
ptpengine:transport_implementation=socket_udpv4
```

To see all available transports, use:

```
ptpd -e ptpengine:transport_implementation
```

You could also try the libpcap transport (pcap_udpv4) if you have the libpcap development package installed (libpcap-dev, pcap-devel, etc.).

mszicsu commented 6 years ago

Hi,

Thanks for the answer. I tried the fix, but it reported "No full SO_TIMESTAMPING software timestamp support" and fell back to the UDP socket transport, which of course works.

Does this mean that the following combination is not supported?

```
ptpengine:interface = bond0
ptpengine:transport_implementation = linuxts_udpv4
ptpengine:hardware_timestamping = n
```

```
2018-03-12 10:43:57.740095 ptpd[32609].(info) transport: detectTTransport(bond0): probing for usable IPv4 transports
2018-03-12 10:43:57.740171 ptpd[32609].(error) testLinuxTimestamping(bond0): No full SO_TIMESTAMPING software timestamp support
2018-03-12 10:43:57.740188 ptpd[32609].(info) transport: detectTTransport(bond0): selected 'socket_udpv4'
2018-03-12 10:43:57.740901 ptpd[32609].(notice) ttransport.socket_udpv4: Transport 'Event' (bond0) started
2018-03-12 10:43:57.741011 ptpd[32609].(notice) ttransport.socket_udpv4: Transport 'General' (bond0) started
```

Thanks, Sz.

wowczarek commented 6 years ago

Hi,

Looks like this configuration is not supported in your case. Likely the NIC driver does not fully implement SO_TIMESTAMPING in software. The only difference between this and the regular sockets is transmission timestamps: the linuxts transports happen to handle both hardware and software timestamps via SO_TIMESTAMPING, and the hardware_timestamping=n option selects between them. But since software timestamps are what you are after, you can just use the socket_udpv4 transport (it uses the "best" socket timestamp option available, i.e. SO_TIMESTAMP or SO_TIMESTAMPNS). If you are using multicast, the TX timestamps are acquired by looping multicast back into the socket (it's a socket option). For unicast, ptpd manually "loops" the packets by sending them to itself to estimate the TX timestamp. Not great, but it works. Otherwise, the libpcap-based transports are a good way to go about it.
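For reference, a check like the one behind the "No full SO_TIMESTAMPING software timestamp support" message can be implemented with the ETHTOOL_GET_TS_INFO ioctl, the same data that ethtool -T prints. Below is a minimal sketch of such a probe (an illustration, not ptpd's actual testLinuxTimestamping code). One plausible factor: a bond master may report fewer timestamping capabilities than its slave NICs, which would explain the fallback on bond0.

```c
/* Sketch: ask the kernel which timestamping capabilities an interface
 * reports, and require the three software SO_TIMESTAMPING bits.
 * Illustration only - not ptpd's actual probe. */
#include <string.h>
#include <sys/ioctl.h>
#include <net/if.h>
#include <linux/ethtool.h>
#include <linux/net_tstamp.h>
#include <linux/sockios.h>

static int hasFullSwTimestamping(int fd, const char *ifName)
{
    struct ethtool_ts_info tsInfo;
    struct ifreq ifr;
    const unsigned int want = SOF_TIMESTAMPING_TX_SOFTWARE |
                              SOF_TIMESTAMPING_RX_SOFTWARE |
                              SOF_TIMESTAMPING_SOFTWARE;

    memset(&tsInfo, 0, sizeof(tsInfo));
    memset(&ifr, 0, sizeof(ifr));
    tsInfo.cmd = ETHTOOL_GET_TS_INFO;
    strncpy(ifr.ifr_name, ifName, IFNAMSIZ - 1);
    ifr.ifr_data = (char *)&tsInfo;

    /* fd can be any socket, e.g. socket(AF_INET, SOCK_DGRAM, 0) */
    if (ioctl(fd, SIOCETHTOOL, &ifr) < 0)
        return 0;

    return (tsInfo.so_timestamping & want) == want;
}
```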

Thanks, Wojciech

mszicsu commented 6 years ago

Hi Wojciech,

First of all, thanks for spending so much time on my case. The card I used for testing and reproducing this issue is quite a widespread one (Intel X520). According to ethtool, it supports SW timestamping.

```
04:00.0 Ethernet controller: Intel Corporation 82599 10 Gigabit Dual Port Backplane Connection (rev 01)
```

```
ethtool -T pri0
Time stamping parameters for pri0:
Capabilities:
        hardware-transmit     (SOF_TIMESTAMPING_TX_HARDWARE)
        software-transmit     (SOF_TIMESTAMPING_TX_SOFTWARE)
        hardware-receive      (SOF_TIMESTAMPING_RX_HARDWARE)
        software-receive      (SOF_TIMESTAMPING_RX_SOFTWARE)
        software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
        hardware-raw-clock    (SOF_TIMESTAMPING_RAW_HARDWARE)
PTP Hardware Clock: 0
Hardware Transmit Timestamp Modes:
        off                   (HWTSTAMP_TX_OFF)
        on                    (HWTSTAMP_TX_ON)
Hardware Receive Filter Modes:
        none                  (HWTSTAMP_FILTER_NONE)
        ptpv1-l4-sync         (HWTSTAMP_FILTER_PTP_V1_L4_SYNC)
        ptpv1-l4-delay-req    (HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ)
        ptpv2-l4-event        (HWTSTAMP_FILTER_PTP_V2_L4_EVENT)
        ptpv2-l4-sync         (HWTSTAMP_FILTER_PTP_V2_L4_SYNC)
        ptpv2-l4-delay-req    (HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ)
        ptpv2-l2-event        (HWTSTAMP_FILTER_PTP_V2_L2_EVENT)
        ptpv2-l2-sync         (HWTSTAMP_FILTER_PTP_V2_L2_SYNC)
        ptpv2-l2-delay-req    (HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ)
        ptpv2-event           (HWTSTAMP_FILTER_PTP_V2_EVENT)
        ptpv2-sync            (HWTSTAMP_FILTER_PTP_V2_SYNC)
        ptpv2-delay-req       (HWTSTAMP_FILTER_PTP_V2_DELAY_REQ)
```

IXGBE driver version is 5.2.4.

I ran a quick experiment now without bonding, just to see:

```
ptpengine:interface = pri0
ptpengine:transport_implementation = linuxts_udpv4
ptpengine:hardware_timestamping = n
```

With the combination above, "detectTTransport" does detect SOF_TIMESTAMPING support, but after starting up, ptpd continuously receives empty PTP messages.

```
2018-03-13 11:30:55.049693 ptpd[2652].(info) transport: detectTTransport(pri0): selected 'linuxts_udpv4'
2018-03-13 11:30:55.049894 ptpd[2652].(info) initTimestamping(pri0): Linux software timestamping enabled
2018-03-13 11:30:55.050186 ptpd[2652].(notice) ttransport.linuxts_udpv4: Transport 'Event' (pri0) started
2018-03-13 11:30:55.050307 ptpd[2652].(notice) ttransport.linuxts_udpv4: Transport 'General' (pri0) started
2018-03-13 11:30:55.050325 ptpd[2652].(debug1) clock: Clock driver syst configured
2018-03-13 11:30:55.050345 ptpd[2652].(debug1) clock: Clock driver syst configured
2018-03-13 11:30:55.050355 ptpd[2652].(debug1) clock: Clock driver syst configured
2018-03-13 11:30:55.050365 ptpd[2652].(debug1) state PTP_INITIALIZING
2018-03-13 11:30:55.050372 ptpd[2652].(debug1) manufacturerOUI: ff:ff:ff
2018-03-13 11:30:55.050378 ptpd[2652].(debug1) initData
2018-03-13 11:30:55.050387 ptpd[2652].(debug1) setPtpClockIdentity() called
2018-03-13 11:30:55.050394 ptpd[2652].(debug1) initClock
2018-03-13 11:30:55.050550 ptpd[2652].(debug1) State change request from PTP_POWERUP to PTP_INITIALIZING
2018-03-13 11:30:55.050569 ptpd[2652].(notice) Now in state: PTP_INITIALIZING
2018-03-13 11:30:55.261294 ptpd[2652].(debug1) state PTP_LISTENING
2018-03-13 11:30:55.261327 ptpd[2652].(debug1) Timer ANNOUNCE_RECEIPT started at 12.000
2018-03-13 11:30:55.261337 ptpd[2652].(debug1) State change request from PTP_POWERUP to PTP_LISTENING
2018-03-13 11:30:55.261358 ptpd[2652].(notice) Now in state: PTP_LISTENING
2018-03-13 11:30:56.076815 ptpd[2652].(debug1) Received an empty PTP message (this should not happen) - ignoring
2018-03-13 11:30:56.080547 ptpd[2652].(debug1) Timer STATUSFILE_UPDATE expired
2018-03-13 11:30:56.080588 ptpd[2652].(debug1) Timer STATUSFILE_UPDATE started at 1.000
2018-03-13 11:30:56.107662 ptpd[2652].(debug1) Timer ALARM_UPDATE expired
2018-03-13 11:30:56.107674 ptpd[2652].(debug1) updateAlarms()
2018-03-13 11:30:56.458920 ptpd[2652].(debug1) Received an empty PTP message (this should not happen) - ignoring
2018-03-13 11:30:56.458940 ptpd[2652].(debug1) Ignored message DelayResp received from 1 domain
2018-03-13 11:30:56.468275 ptpd[2652].(debug1) Received an empty PTP message (this should not happen) - ignoring
2018-03-13 11:30:56.471253 ptpd[2652].(debug1) Received an empty PTP message (this should not happen) - ignoring
2018-03-13 11:30:56.471268 ptpd[2652].(debug1) Ignored message DelayResp received from 1 domain
2018-03-13 11:30:56.474914 ptpd[2652].(debug1) Received an empty PTP message (this should not happen) - ignoring
2018-03-13 11:30:56.487185 ptpd[2652].(debug1) Received an empty PTP message (this should not happen) - ignoring
```

Anyway, thanks. Szilard

wowczarek commented 6 years ago

OK, I see what is happening:

So basically it is an expected error. The socket_udpv4 transport still works fine, so I don't think there really is an issue: you just end up using a different transport. As far as performance is concerned, this only matters when you are using unicast, because of the following:

The only "special" thing the SOF_TIMESTAMPING_xx_SOFTWARE capabilities add over standard SO_TIMESTAMPNS is delivering packet transmission timestamps. When using the socket_xxx transports and multicast, the TX timestamps are instead acquired using SO_MULTICAST_LOOP - socket option that loops sent mcast packets back into the socket. This works very similarly to software TX timestamps. Only for unicast the TX timestamps are manually copied into the socket, which is not great but still delivers. Otherwise you can build ptpd with libpcap and use the pcap_xxx transports for software timestamping.

Do you have a specific need to use software timestamps vs. hardware timestamps? PTPd will still sync the system clock to the NIC for you. Using h/w timestamping is almost always a better option if available.

Thanks, Wojciech

mszicsu commented 6 years ago

Hi Wojciech,

The background is that the majority of our NICs supported HW timestamping, and I was really satisfied with the hwtest branch (I am still using it). But some of the old NICs had problems with HW support, so I switched them to SW timestamping on the same hwtest branch, which is where I noticed the above-mentioned high CPU usage after changing the active slave in the bond.

Then I saw you had a possible fix, and I would have liked to try it and confirm it indeed resolves the problem, when I noticed that active development had moved to the libcck branch.

Unfortunately, in the meantime I swapped those old cards out, so I was not able to try libcck on them. That was the only reason I tried SW timestamping on a card with HW capability: to reproduce with libcck the issue I had originally noticed with hwtest.

But I failed to reproduce the "same" setup with libcck that I had with hwtest, most likely due to my IXGBE driver. I really do not want to waste your time: I am more than satisfied with the decent accuracy and stability of your HW feature. And if I ever need SW timestamping somewhere again, socket timestamping will also be fine, as I use multicast.

Thanks again for your help. Szilard