raspberrypi / linux

Kernel source tree for Raspberry Pi-provided kernel builds. Issues unrelated to the linux kernel should be posted on the community forum at https://forums.raspberrypi.com/

Raspberry Pi 3 built-in WiFi SSH NTP Problem #1519

Open Cocacola321 opened 8 years ago

Cocacola321 commented 8 years ago

I have a problem with NTP and built-in WiFi on Raspberry Pi3.

Under Raspbian the following command solves the problem with the built-in WiFi: /sbin/iptables -t mangle -I POSTROUTING 1 -o wlan0 -p udp --dport 123 -j TOS --set-tos 0x00

If placed in rc.local it is automatically executed at boot.
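A minimal sketch of how that rule could be wrapped for rc.local so that it is only inserted once, assuming an iptables version that supports the `-C` (check) option. The function names are hypothetical; the rule itself is the one quoted above.

```shell
#!/bin/sh
# Rule specification from the workaround: clear the TOS byte on outgoing NTP.
ntp_tos_rule() {
    echo "-o wlan0 -p udp --dport 123 -j TOS --set-tos 0x00"
}

# Insert the rule at the top of mangle/POSTROUTING unless it is already there.
# Must be run as root.
apply_ntp_tos_rule() {
    # Intentional word splitting: no argument in the rule contains spaces.
    set -- $(ntp_tos_rule)
    if ! /sbin/iptables -t mangle -C POSTROUTING "$@" 2>/dev/null; then
        /sbin/iptables -t mangle -I POSTROUTING 1 "$@"
    fi
}
```

Calling `apply_ntp_tos_rule` from rc.local would then be safe to repeat, since a second call finds the rule already present and does nothing.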

But under LibreELEC this command doesn't work: the file rc.local doesn't exist, and I don't know how to make the equivalent adjustment there.

Could anybody help me?

pelwell commented 8 years ago

This would have been a good report if you'd linked to the Forum thread that properly describes the problem (https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=141454) and left out the bit about LibreElec.

Cocacola321 commented 8 years ago

Thanks for your feedback, but how can I fix the problem under LibreELEC, given that the commands mentioned in the thread don't work there?

MilhouseVH commented 8 years ago

@Cocacola321: The suggested fix (iptables) is a workaround and not really what anyone would consider a suitable fix for all users. The fact it's not possible to apply the workaround with LE underlines this point.

Understanding the underlying problem and fixing it (maybe it's a driver issue?) is the best way forward and will benefit all distributions. Now that an issue has been opened, hopefully the RPi developers can get a handle on what might be happening. However, this issue doesn't affect me when using LE and RPi3 WiFi, so it might be environment/router/access point specific (my access point is a Netgear DGND4000, using a Raspbian-based dnsmasq DHCP/DNS server configured with the Raspbian host 192.168.0.4 as timeserver).

If/when a fix becomes available you'll probably see it first in an LE nightly build...

Ferroin commented 8 years ago

Under DSCP standards (which is what NTP is actually setting, not TOS), 0xC3 translates to assured forwarding class 3 with high drop probability, which is pretty typical for non-critical signaling protocols like NTP.

The fact that you're not getting a time lock on any of the servers indicates you're losing packets somewhere. The fact that this only happens with the built-in WiFi and the 'official' dongles indicates that it's something either in their hardware or in the drivers, and given that this did work before, I'd say it's in the drivers.

Whether or not a layer 1 driver should be interpreting a layer 2 protocol and making decisions based on that interpretation is debatable, but that appears to be what's going on here, and in this case, it's a bad thing and needs to be fixed, as this behavior is not reasonable. Given the specific behavior and the fact that most other Linux networking software doesn't set a DSCP of a class higher than 2, I'd say what's happening is that the driver is not prioritizing which packets to drop based on class (most drivers drop lower class packets first, even if they have a lower drop probability tag), but is just checking the drop probability on everything and ignoring the class. I'm not sure if this goes counter to the standards, but it does go counter to how just about every other NIC driver in existence behaves.

pelwell commented 8 years ago

Thanks for your input, @Ferroin. I've opened a support issue with Broadcom to see what they say.

vrabac commented 8 years ago

I am also facing this issue on archlinuxarm, using timedatectl from systemd on my RPi3. With the same config over Ethernet, NTP sync is just fine. When I unplug the cable and set up WiFi, timedatectl cannot set the time. I also tested the iptables workaround, but it did not work. I was thinking that the Fritz!Box (which I am connected to over WiFi) was blocking something, but as I don't have access to it I was not able to check, so I left it at that. But now that this issue is open, it could really be something in the driver. Thanks to everyone involved!!

vrabac commented 8 years ago

So after resetting the Fritz!Box and setting up the WiFi again, I found that there were two WiFi SSIDs (a normal one and a guest one). It seems the normal one was disabled before and the guest one was enabled, where everything was blocked except HTTP. After changing to the "normal" WiFi, my RPi3 was able to sync time with the NTP server over wireless just fine. I am using archlinuxarm with up-to-date firmware/kernel!!

pelwell commented 8 years ago

@Ferroin Broadcom have come back to me asking for packet logs etc. I haven't been able to reproduce the issue myself, but this is what I've found so far:

NTP packets are sent out using a Differentiated Services Field value of 0xc0 (DSCP 0x30, Class Selector 6, ECN 0x00). A DSCP of 30 decimal (0x1e) would correspond to "assured forwarding class 3 with high drop probability", but this packet's DSCP is 48 decimal (0x30), so I don't understand your analysis.
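The arithmetic behind those numbers can be checked with a short sketch. The function names are made up for illustration; the bit layout (DSCP in the upper six bits, ECN in the lower two) and the Assured Forwarding formula 8x + 2y for class x and drop precedence y are the standard ones.

```shell
#!/bin/sh
# Split a Differentiated Services Field byte into DSCP (upper 6 bits)
# and ECN (lower 2 bits).
ds_decode() {
    ds=$(( $1 ))
    echo "dscp=$(( ds >> 2 )) ecn=$(( ds & 3 ))"
}

# DSCP of Assured Forwarding class $1 with drop precedence $2: 8x + 2y.
af_dscp() {
    echo $(( 8 * $1 + 2 * $2 ))
}

ds_decode 0xc0   # value seen on the NTP packets: dscp=48 ecn=0 (Class Selector 6)
af_dscp 3 3      # AF33, "class 3 with high drop probability": 30
```

So 0xc0 carries DSCP 48, which is Class Selector 6 (6 x 8), not the DSCP 30 that AF33 would need.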

@Cocacola321 Can you install the wireshark package (sudo apt-get update; sudo apt-get install wireshark) and run it both with and without the workaround installed? You can start it like this:

sudo wireshark -i wlan0 -p -k -f "udp port 123"

You may have to click OK a few times.

Once you've seen a few NTP messages - the capture with the workaround will include server responses, the one without presumably won't - stop the capture and save it. If you can then upload the captures somewhere - Dropbox, Google Drive, etc. - or email them to me directly (phil at raspberrypi dot org) that would be really helpful.

ghollingworth commented 8 years ago

Either that or upload the saved file to cloudshark...

:)

Gordon

ksze commented 8 years ago

I applied the iptables workaround for NTP, and that works.

How about SSH? I can't find that thread about SSH. Is there a workaround iptables rule for that?

kukabu commented 8 years ago

@ksze you can set QoS for ssh via ssh_config/sshd_config. Also, ntpd 4.2.8 can set any DSCP via its config.
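As a rough sketch of the SSH side, assuming an OpenSSH release that supports the `IPQoS` keyword: the helper below only prints a client config stanza. The function name and the host alias `pi3` are placeholders, and the value 0x00 simply mirrors the TOS value used by the NTP workaround; whether it cures the SSH hang on a given network is untested here.

```shell
#!/bin/sh
# Print an ssh_config stanza that sends SSH traffic to the given host
# with an all-zero TOS/DSCP byte.
ssh_qos_stanza() {
    printf 'Host %s\n    IPQoS 0x00\n' "$1"
}

ssh_qos_stanza pi3
# To use it, append the output to the client config, for example:
#   ssh_qos_stanza pi3 >> ~/.ssh/config
```

Since the hang is seen on connections into the Pi, the same `IPQoS 0x00` line in the Pi's sshd_config is probably the more relevant place, as that controls the packets the Pi itself sends over wlan0.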

mhdismail commented 8 years ago

Same issue here on a fresh install. After a lot of struggle, the iptables workaround fixed it.

david-a-wheeler commented 7 years ago

This is a frustrating problem, because you can't access many websites with an incorrect date/time, which makes it harder to get help. I hope this can be fixed very soon. I suspect a large number of people are having the problem but cannot figure out how to report it, since they cannot get the internet working. Good luck.

jcloudm commented 7 years ago

I have this issue manifesting itself as ssh hanging after password authentication. I have solved it by turning on QoS in my D-Link router, but am willing to help with more details. If it would help Broadcom, I am more than willing to do a Wireshark capture, given some instructions.

SupraJames commented 7 years ago

Same issue. Any updates? Also happy to help with packet captures. Running an up-to-date Pi 3 (Jessie) and a TP-LINK Archer C7 V2 doing WiFi. My router is running pfSense.

nicolas-legroux commented 7 years ago

I was having the same problems with my new Raspberry Pi 3 (on the WiFi, I could not get NTP to update the time, and when trying to SSH into the Pi from a remote host, I would enter the password, but then it would hang, although the SSH logs showed that the login was successful). Is there an update from Broadcom on this rather frustrating issue?

I ended up putting the Pi on an ethernet connection, which solved the problem for me.

I could provide packet logs with Wireshark if needed.

pelwell commented 7 years ago

Packet logs capturing NTP failure over WiFi and success over Ethernet would be very useful.

nicolas-legroux commented 7 years ago

OK, I don't have physical access to the Pi right now, but I'll capture some packets during the weekend.

nicolas-legroux commented 7 years ago

OK, I captured some packets both on ethernet and wifi with Wireshark. Here's the Google Drive link to download the packets: https://drive.google.com/open?id=0B0My5XHtfw2YYVBMaDRPclRNclU

Here's what I did. I ran the following:

sudo wireshark -i eth0 -p -k -f "udp port 123"

I captured a few packets, then I ran:

sudo service ntp stop
sudo ntpd -g -q
sudo service ntp start

The 'ntpd' command ran well and finished with a 'time slew' message.

Then I disconnected the Ethernet, rebooted the Pi, and ran:

sudo wireshark -i wlan0 -p -k -f "udp port 123"

I then followed the same procedure as above. Here the 'ntpd' command hangs, so I killed it. Apparently, in WiFi mode I am only capturing packets coming from my NTP client.

My router is a Netgear DG834G.

Hope this can help ! Let me know if you need something else.

pelwell commented 7 years ago

Thank you for those captures. Apart from the "Leap" flag, which the WiFi trace has set to unknown, the two traces look basically the same except that the Ethernet trace includes responses from the servers. I ran a similar test here, with a Netgear R7000, and got traces which look the same except for the addresses and timestamps, but my WiFi trace also includes responses.

Assuming that you normally see inbound packets in Wireshark (which you should), the obvious explanation for their absence in your WiFi trace is that there aren't any, and that either the outgoing packets or incoming packets have been filtered out somewhere. In the good old days you could put an Ethernet hub in the upstream link from the router and hang a packet sniffer off it, but it's not so easy now that everything is switch-based.

Do you have access to a WiFi dongle you could try? Comments above suggest that might work, but I'd like to confirm it if possible.

nicolas-legroux commented 7 years ago

Hi Phil, Yes I have a WiFi dongle that I could try out. But I won't have access to the Pi for another 10 days. I'll send you the packets at that time.

rolly-ng commented 7 years ago

Hi, I'd just like to confirm this issue with the latest 4.8 firmware. Please have a look at the attached screenshot (ntp_fail). Thanks, rolly

nicolas-legroux commented 7 years ago

Hi,

Here's an update from my side. I did the packet capture with a WiFi dongle (a TP-LINK TL-WN722N). Everything works fine, and I get responses back from the servers. I ran the same commands as described earlier while capturing; the 'ntpd' command in particular succeeds with the dongle (with the built-in WiFi I had to kill it because it hangs).

Here's a link to the packets:

https://drive.google.com/open?id=0B0My5XHtfw2YZXM5VGw5TkRpVk0

Let me know if I can do more.

bytos commented 7 years ago

I've tried several distros and OSMC is the only one that syncs automatically to NTP servers on my Raspberry Pi 2 (at boot time).

[Ok] Started Set Time using HTTP query (first line during boot)

Also, user lrusak at LibreELEC forums said:

"I have no problems getting the correct time. ntp is done through connman not through ntpd itself."

HawkingJan commented 7 years ago

I have the same problem on my Raspberry Pi Zero W (running Jessie Lite). I set my timezone using raspi-config, but with the default settings the time/date is wrong by several days (even after a reboot; my router was running and had an internet connection during the whole process). If I run

sudo service ntp stop
sudo ntpd -gq

it hangs (it did not finish within 5 minutes). Even after I cancelled it (Ctrl+C), the time was still wrong and remained wrong after a reboot.

After I applied the workaround described in the first post and rebooted, the time was set correctly without me having to type any commands. Should I open a new ticket, or should this one be changed so that it covers both the Pi 3 and the Zero W (both using the built-in WiFi, which I assume to be a similar chip)?

leif81 commented 7 years ago

Until there's a proper fix for this, here's a workaround I do to force my clock to be set from a network time source:

$ sudo sntp -s 0.debian.pool.ntp.org

JamesH65 commented 7 years ago

Can anyone seeing this issue please update to the latest kernel (apt-get update, apt-get upgrade) and post results here? There have been recent changes to the wireless driver that may have some relevance.

HawkingJan commented 7 years ago

I removed the workaround that was provided in the first post, successfully installed all the updates (including some for the kernel) via sudo apt-get update && sudo apt-get upgrade, rebooted, and ran

sudo service ntp stop
sudo ntpd -gq

The second command didn't finish within a minute. Then I added the workaround again and rebooted (to activate the workaround). After that the above commands worked fine. So from my point of view the problem still exists unchanged (however, I'm running a Pi Zero W, so there might be patches that only work for the Pi 3?)
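The "didn't finish within a minute" check can be made repeatable with a small wrapper. This is only a sketch, assuming the `timeout` utility from GNU coreutils is installed; the function name is hypothetical.

```shell
#!/bin/sh
# Run a command with a time limit in seconds. GNU timeout exits with
# status 124 when the limit is reached, which is how a hang shows up here.
run_with_limit() {
    limit=$1
    shift
    timeout "$limit" "$@"
}

# On the Pi (as root), after "service ntp stop":
#   run_with_limit 60 ntpd -gq; echo "exit status: $?"
```

An exit status of 124 then means ntpd was still waiting for a server response when the limit expired, while 0 means the clock was set.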

cobaltdr commented 7 years ago

I can confirm I have the same problem with ntpd not being able to poll the ntp servers on a Pi Zero W using wlan0.

The sntp workaround works, but thereafter you're unable to use ntpq -p. I've not tried messing around with iptables yet.

JamesH65 commented 7 years ago

I've posted a possible (a bit hacky) fix for this in issue #1342. You will need to recompile the brcmfmac module to try it.

cobaltdr commented 7 years ago

Happy to give it a go... on the other hand, if this breaks :P I'll have to reconnect to the Pi with the other buggy dongle on wlan1 (rt8192eu) - separate issues raised. I am also dropping a line into https://github.com/raspberrypi/linux/issues/1342 for something potentially related to both.

pelwell commented 7 years ago

I think it's extremely unlikely to stop you connecting in. More likely is some subtle performance degradation, or perhaps a total loss of some obscure class of traffic, but that's what testing is for, right?

cobaltdr commented 7 years ago

As long as there is an easy way to revert.

pelwell commented 7 years ago

sudo rm /boot/.firmware_revision
sudo rpi-update

cobaltdr commented 7 years ago

Okay, I'll have to do some reading on this. There's some interesting information here, and I won't have time tonight. I'll report back over the weekend if I get a chance, thanks.

cobaltdr commented 7 years ago

Fair play. How do I apply the fix? Apologies for not being very familiar with all this, and what would be the best way to check it has worked/not worked?

pelwell commented 7 years ago

See https://github.com/raspberrypi/linux/issues/1342#issuecomment-317044421

anohren commented 7 years ago

Updating LibreELEC to 8.1.1 beta solved this for me.

dexterdexter321 commented 6 years ago

I have exactly the same problem on my OSMC (the newest version). My router is Mikrotik 433AH.

I also tried the newest Raspbian, the same issue. Switching to LAN solves the problem. Please refer here for details.

Is there any chance of getting this solved?

6by9 commented 6 years ago

https://github.com/raspberrypi/linux/commit/983cf7a23cc3f286f4c22d360387a5bac298d37c#diff-49782035431bd8ff8136668a75b456c9 is missing from the 4.13 and 4.14 branches. It appears that Cypress haven't released the firmware which actually fixes the problem, so the workaround is still required.

dexterdexter321 commented 6 years ago

Thanks, that is a workaround. Do you know when it is going to be implemented?

What about the ultimate solution?

6by9 commented 6 years ago

PRs #2272 and #2273 created. Ultimate solution is up to Cypress - we nudge them regularly but they're also investigating the mailbox issues, and the Krack issue has obviously required a fair amount of their time recently too.

dexterdexter321 commented 6 years ago

Clear.

Any chance to put that workaround into the current kernels e.g. 4.9.29?

pelwell commented 6 years ago

That commit has been in rpi-4.9.y since just after 4.9.39 in July, so it ought to be in the most recent OSMC release (but I haven't managed to confirm it).

dexterdexter321 commented 6 years ago

I will ask OSMC guys for comments regarding it.

JamesH65 commented 6 years ago

We have newer firmware from Cypress that we believe may fix this issue. It should be available via an apt update/upgrade. Can anyone try it out?

JamesH65 commented 6 years ago

This issue will be closed within 30 days unless further interactions are posted. If you wish this issue to remain open, please add a comment. A closed issue may be reopened if requested.

SupraJames commented 6 years ago

Still an issue here! I'd be happy to try out any new firmware.