Make Wake-On-LAN more reliable

porst17 commented 6 years ago

Sometimes, PCs do not wake up after a WOL magic packet has been sent.

If we ignore problems on the station side, I can see the following reasons why WOL might fail:

1. Sending the WOL packet to a specific IP address relies on the ARP cache. A sleeping host cannot answer ARP requests, so if the mapping between IP and MAC is not in the ARP cache, the magic packet will not be delivered at all.
2. The magic packet is usually a UDP packet (to port 9, the discard port). UDP packets can get lost or corrupted without notice. It is unlikely, but still possible. I've read numbers ranging from 0.01% to 5% packet loss being considered normal, depending on the environment. A slightly bad cable or heavy load on the network can also cause packet loss.

The first problem can be addressed by sending the magic packet to the broadcast address 255.255.255.255 (or whatever is appropriate for the network hilbert is running in). Sending it to both the broadcast address and the previously known IP will not hurt either. The second problem can be addressed by sending the magic packet a couple of times.
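A minimal sketch of that idea, not hilbert-cli's actual code: it builds the standard magic packet (six 0xFF bytes followed by the MAC address repeated 16 times) and sends it several times to every target. The function names and the `repeat=3` default are assumptions made for illustration.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte magic packet: 6 x 0xFF, then the MAC repeated 16 times."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return b"\xff" * 6 + bytes.fromhex(hex_digits) * 16


def send_magic_packet(mac, targets=("255.255.255.255",), repeat=3, port=9):
    """Send the packet `repeat` times to every address in `targets`."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Required before a UDP socket may send to a broadcast address.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        for _ in range(repeat):
            for target in targets:
                sock.sendto(packet, (target, port))
```

To cover both cases, pass the broadcast address and the last known IP together, e.g. `send_magic_packet(mac, targets=("255.255.255.255", known_ip))`.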

I could also imagine hilbert-cli start_station blocking (with a timeout) until the requested host answers a ping, e.g. something like this (pseudo code):

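var PING_EXIT_CODE;
do
    // send to the broadcast address and to the last known IP
    wakeonlan -i broadcast_address mac
    wakeonlan -i ip_from_arp_cache mac
    // exits as soon as the host replies, or after 5 seconds
    PING_EXIT_CODE = (ping -o -t 5s hostname)
while( PING_EXIT_CODE != 0 && !GLOBAL_TIMEOUT )
// a successful ping wins even if the global timeout expired meanwhile
if( PING_EXIT_CODE != 0 )
    return FAILURE
else
    return SUCCESS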
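The loop above could look roughly like this in Python. This is only a sketch under stated assumptions: `send_wol` and `ping_once` are hypothetical callables supplied by the caller, and the 10 second `interval` default is an assumption, not something hilbert-cli defines.

```python
import time


def wake_and_wait(send_wol, ping_once, timeout=120.0, interval=10.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Resend WOL and ping until the host answers or `timeout` seconds pass.

    send_wol():  sends the magic packet(s); its return value is ignored.
    ping_once(): returns True if the host answered a ping.
    Returns True on success, False if the timeout expired first.
    """
    deadline = clock() + timeout
    while True:
        send_wol()
        if ping_once():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the global deadline.
        sleep(min(interval, remaining))
```

Passing `clock` and `sleep` as parameters is a design choice that keeps the loop testable without real waiting.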

porst17 commented 6 years ago

I just checked the sources. Looks like wakeonlan currently sends to the broadcast address in "normal" mode, and to a specific IP in "pedantic" mode.

malex984 commented 6 years ago

@porst17 it should send WOL to the IP even when not in pedantic mode, AFAIR...

porst17 commented 6 years ago

Sorry, my bad. There was just a comment about pedantic mode nearby. hilbert-cli always sends WOL packets to the broadcast address, and if the IP is present, it also sends to the IP.

So I would guess that the issue is more related to WOL packets being UDP packets.

malex984 commented 6 years ago

Yeah, I know of no way to check for WOL delivery over UDP :-( Note that the ping reply may be VERY delayed due to the remote PC's startup procedure (e.g. looking for a network boot server can take quite a loooong time...).

malex984 commented 6 years ago

$ ./hilbert poweron --help
usage: hilbert poweron [-h]
                       [--configfile CONFIGFILE | --configdump CONFIGDUMP]
                       [--timeout GLOBAL_TIMEOUT]
                       [--pingtimeout PING_TIMEOUT]
                       StationID

positional arguments:
  StationID             station to power-on via network

optional arguments:
  -h, --help            show this help message and exit
  --configfile CONFIGFILE
                        specify input .YAML file (default: 'Hilbert.yml')
  --configdump CONFIGDUMP
                        specify input dump file
  --timeout GLOBAL_TIMEOUT
                        specify maximum time to wait for a host to respond to ping
  --pingtimeout PING_TIMEOUT
                        specify timeout for ping to wait for a host to respond to ping

malex984 commented 6 years ago

Order of precedence:

1. CLI arguments
2. Hilbert.yml (in the WOL record for the specific station)
3. default timeouts:
   PING_TIMEOUT: 10 seconds
   GLOBAL_TIMEOUT: 120 seconds
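The precedence described above can be sketched as a small lookup. The shape of the station's WOL record (a dict keyed by the timeout names) is an assumption made for illustration; the real Hilbert.yml keys may differ.

```python
# Default timeouts in seconds, as listed in the comment above.
DEFAULTS = {"PING_TIMEOUT": 10, "GLOBAL_TIMEOUT": 120}


def resolve_timeout(name, cli_value=None, station_wol=None):
    """Pick a timeout: CLI argument, then the station's WOL record, then the default.

    station_wol is a hypothetical dict parsed from the station's WOL record.
    """
    if cli_value is not None:
        return cli_value
    if station_wol and station_wol.get(name) is not None:
        return station_wol[name]
    return DEFAULTS[name]
```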

porst17 commented 6 years ago

Suggestions for new CLI options:

  --timeout TIMEOUT
                        specify time to wait for a host to respond to ping
  --interval INTERVAL
                        specify interval between consecutive attempts to wake up the host
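For illustration, the two suggested options could be declared with `argparse` as follows. This is a sketch, not the parser hilbert-cli ships: the 120 second default reuses the GLOBAL_TIMEOUT default mentioned earlier, and the 10 second interval default is purely an assumption.

```python
import argparse


def build_parser():
    """Hypothetical parser for the suggested `poweron` options."""
    parser = argparse.ArgumentParser(prog="hilbert poweron")
    parser.add_argument("StationID", help="station to power-on via network")
    parser.add_argument("--timeout", type=float, default=120.0, metavar="TIMEOUT",
                        help="specify time to wait for a host to respond to ping")
    # The default interval is an assumption, not a value from hilbert-cli.
    parser.add_argument("--interval", type=float, default=10.0, metavar="INTERVAL",
                        help="specify interval between consecutive attempts "
                             "to wake up the host")
    return parser
```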

porst17 commented 6 years ago

Note that the options I gave to ping above seem to be specific to macOS. On Linux, the options seem to be different. I couldn't find a replacement for -o in the manpage. But maybe I am blind...

malex984 commented 6 years ago

@porst17 I also found nothing that does exactly what you described :-( Here are the relevant options of the ping CLI tool from the hilbert/mnt image:

-i interval
              Wait interval seconds between sending each packet. The default is to wait for one second between each packet normally, or not to wait in flood mode. Only super-user may set interval to values less 0.2 seconds.

-w deadline
              Specify a timeout, in seconds, before ping exits regardless of how many packets have been sent or received. In this case ping does not stop after count packet are sent, it waits either for deadline expire or until count probes are answered or for some error notification from network.

-W timeout
              Time to wait for a response, in seconds. The option affects only timeout in absence of any responses, otherwise ping waits for two RTTs.

-A     Adaptive ping. Interpacket interval adapts to round-trip time, so that effectively not more than one (or more, if preload is set) unanswered probe is present in the network. Minimal interval is 200msec for not super-user. On networks with low rtt this mode is essentially equivalent to flood mode.

-c count
              Stop after sending count ECHO_REQUEST packets. With deadline option, ping waits for count ECHO_REPLY packets, until the timeout expires.

elondaits commented 6 years ago

ping -o exits as soon as it receives one reply packet.

It's not standard on Linux.
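Going by the man page excerpt quoted earlier, combining `-c 1` with `-w deadline` on Linux should behave much like `-o -t` on macOS: ping keeps probing until one reply arrives or the deadline expires. A hedged sketch of a per-platform wrapper follows; the function names are hypothetical, and any system other than macOS is assumed to ship an iputils-style ping.

```python
import platform
import subprocess


def ping_command(host, deadline=5, system=None):
    """Build a ping command that exits after the first reply or after `deadline` seconds."""
    system = system or platform.system()
    if system == "Darwin":
        # macOS: -o exits after one reply, -t is the overall timeout in seconds.
        return ["ping", "-o", "-t", str(deadline), host]
    # Linux (iputils): with -w, -c 1 waits for one reply until the deadline expires.
    return ["ping", "-c", "1", "-w", str(deadline), host]


def host_answers(host, deadline=5):
    """Return True if `host` answered a ping within `deadline` seconds."""
    result = subprocess.run(ping_command(host, deadline),
                            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0
```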

hilbert / hilbert-cli

Make Wake-On-LAN more reliable #88