Open sebbacon opened 1 year ago
Shorter version
Tue Sep 27 23:57:12 BST 2022: Awake; pausing for 30s
Tue Sep 27 23:57:42 BST 2022: Pause finished
Tue Sep 27 23:57:42 BST 2022: Refreshing calendar <- final bit in start.sh
Tue Sep 27 23:57:42 BST 2022: before run.py <- in common.sh
<error happens>
2022-09-27 23:57:47.477680: Getting data <- first line of try block
2022-09-27 23:57:47.532928: Caught exception <- exception handler, it then outputs the the traceback (even thought it's appearing earlier)
2022-09-27 23:57:47.571067: Continuing... <- last line of try block
Tue Sep 27 23:57:48 BST 2022: after run.py <- first line after python is executed
<press wake and it appears to unset the screen, but stay frozen>
Wed Sep 28 18:50:32 BST 2022: now deep sleeping for one hour <- this is after 5s sleep that is immediately after common.sh
<press wake again, it actually wakes>
Wed Sep 28 18:53:05 BST 2022: 1664391031; standby mem <- first thing it does on wake
I added more logging.
Starting from the first log line on 25 Oct:
2022-10-25 00:45:45.184972: Getting data
2022-10-25 00:45:45.240291: Caught exception
2022-10-25 00:45:45.278740: Continuing...
Tue Oct 25 00:45:45 BST 2022: after run.py
Tue Oct 25 00:45:46 BST 2022: wrote calendar to screen
Tue Oct 25 00:45:46 BST 2022: Finished running common.sh
Tue Oct 25 00:45:51 BST 2022: now deep sleeping for one hour
Tue Oct 25 01:45:52 BST 2022: ; standby mem
Tue Oct 25 01:45:52 BST 2022: Awake; pausing for 30s
Tue Oct 25 01:46:22 BST 2022: Pause finished
Tue Oct 25 01:46:22 BST 2022: Refreshing calendar
Tue Oct 25 01:46:22 BST 2022: before run.py
--
socket.gaierror: [Errno -3] Temporary failure in name resolution
etc
--
2022-10-25 01:46:26.928099: Getting data
2022-10-25 01:46:26.983242: Caught exception
2022-10-25 01:46:27.021445: Continuing...
Tue Oct 25 01:46:27 BST 2022: after run.py
Tue Oct 25 01:46:28 BST 2022: wrote calendar to screen
Tue Oct 25 01:46:28 BST 2022: Finished running common.sh
Tue Oct 25 01:46:33 BST 2022: now deep sleeping for one hour
Tue Oct 25 02:46:33 BST 2022: ; standby mem
Tue Oct 25 02:46:33 BST 2022: Awake; pausing for 30s
Tue Oct 25 02:47:03 BST 2022: Pause finished
Tue Oct 25 02:47:03 BST 2022: Refreshing calendar
Tue Oct 25 02:47:03 BST 2022: before run.py
socket.gaierror: [Errno -3] Temporary failure in name resolution
etc
2022-10-25 02:47:07.647423: Getting data
2022-10-25 02:47:07.706288: Caught exception
2022-10-25 02:47:07.744864: Continuing...
Tue Oct 25 02:47:08 BST 2022: after run.py
Tue Oct 25 02:47:09 BST 2022: wrote calendar to screen
Tue Oct 25 02:47:09 BST 2022: Finished running common.sh
Tue Oct 25 02:47:14 BST 2022: now deep sleeping for one hour
Tue Oct 25 03:47:14 BST 2022: ; standby mem
Tue Oct 25 03:47:14 BST 2022: Awake; pausing for 30s
and that's the last line on 25th - followed by nothing for 2 whole days, until:
Thu Oct 27 00:45:23 BST 2022: Pause finished
Thu Oct 27 00:45:23 BST 2022: Refreshing calendar
Thu Oct 27 00:45:23 BST 2022: before run.py
socket.gaierror: [Errno -3] Temporary failure in name resolution
etc
So it seems like the kindle is getting into a state where it's unable to resolve DNS; and this state is reset by me pressing the restart button.
Some online have found it's a firewall issue Others suggest it's to do with HTTP connections not being closed (discussions)
Here's a good description of the underlying error, which is as you'd think; fundamentally DNS seems to have disappeared.
Interestingly, resolv.conf
appears to be rewritten on a soft start:
[root@kindle root]# cat /etc/resolv.conf
domain home
nameserver 208.67.222.222
nameserver 208.67.220.220
[root@kindle root]# ls -l /etc/resolv.conf
lrwxrwxrwx 2 root root 20 Mar 2 2021 /etc/resolv.conf -> /var/run/resolv.conf
[root@kindle root]# ls -l /var/run/resolv.conf
-rw-rw-r-- 1 root javauser 64 Oct 31 08:42 /var/run/resolv.conf
So there's an upstart script chroot.conf
which is marked as start on framework_ready
; it mounts the root filesystem as rw, makes a bunch of folders in /chroot/
, and then mounts them at the usual locations:
chroot_mount()
{
# .... stuff ...
mount -o bind /etc/resolv.d /chroot/etc/resolv.d || true
}
However, note that this makes a folder and doesn't touch resolv.conf
itself. At the end of the chroot.conf
upstart config there is a line ln /etc/resolv.conf /chroot/etc/resolv.conf || true
but the currently-activated config is not
Other candidates to examine:
ppp/ip-up
ppp/ip-down
upstart/chroot.conf
upstart/firsttime
It's quite complicated, but I think it's probably all happening in /etc/ppp/ip-up
; full source here for future ref:
#!/bin/sh
_FUNCTIONS=/etc/rc.d/functions
[ -f ${_FUNCTIONS} ] && . ${_FUNCTIONS}
. /var/local/wan/info
PPP_IP_START=/var/run/ppp/ip_start
PPP_RESOLV_CONF=$(readlink /etc/resolv.conf)
PPP_RESOLV_SAVE=/var/run/resolv.conf.save
PPP_RESOLV_NEW="/etc/resolv.d/resolv.conf.${WAN_PEER}"
RESOLV_OPTIONS="options timeout:5 attempts:2 rotate\n"
LIPC_HASH_PROP=/usr/bin/lipc-hash-prop
LIPC_GET_PROP=/usr/bin/lipc-get-prop
INTERFACE="$1"
TTY_DEVICE="$2"
SPEED="$3"
LOCAL_IP="$4"
GATEWAY_IP="$5"
DNS1_NAMESERVER=""
DNS2_NAMESERVER=""
if [ -z "${USEPEERDNS}" -a ! -r ${PPP_RESOLV_NEW} ]; then
msg "unknown WAN peer (${WAN_PEER})" E
fi
if [ 0 -ne "$(${LIPC_GET_PROP} -i com.lab126.wan shouldRoute)" ]; then
msg "PPP adding route to default gw and resolv.conf file" I
cp ${PPP_RESOLV_CONF} ${PPP_RESOLV_SAVE}
cp ${PPP_RESOLV_NEW} ${PPP_RESOLV_CONF}
route add default gw "${GATEWAY_IP}" "${INTERFACE}"
fi
#
# ppp will call ip-up with USEPEERDNS defined if the usepeerdns option is set.
# In whispernet2.0 sims where we use VPN, our nameservers are hardwired and
# based on /etc/resolv.d files.
# In whispernet3.0 sims we use the dns server as provided by our peer via ipcp
# so the usepeerdns option is set. See /platform/etc/ppp/peers/peer-wn3-3 as
# an example.
#
if [ -n "${USEPEERDNS}" ]; then
msg "Using Peer DNS (xxx.xxx.$(echo ${DNS1} | cut -d '.' -f 3,4)) and (xxx.xxx.$(echo ${DNS2} | cut -d '.' -f 3,4))" I
else
msg "PPP Using DNS from (${PPP_RESOLV_NEW})" I
DNSSERVER=1
for DNS in $(cat ${PPP_RESOLV_NEW} \
| sed -e 's/^nameserver[ \t]\+\([0-9]\+\.[0-9]\+\.[0-9]\+\.[0-9]\+\).*$/\1/; t; d'); do
if [ "${DNSSERVER}" == "1" ]; then
DNS1=${DNS}
elif [ "${DNSSERVER}" == "2" ]; then
DNS2=${DNS}
fi
let "DNSSERVER += 1"
done
msg "Using ${PPP_RESOLV_NEW} nameservers xxx.xxx.$(echo ${DNS1} | cut -d '.' -f 3,4) and xxx.xxx.$(echo ${DNS2} | cut -d '.' -f 3,4)" I
fi
if [ -n "{DNS1}" -a "${DNS1}" != "0.0.0.0" ]; then
route add -host ${DNS1} gw ${GATEWAY_IP} ${INTERFACE}
DNS1_NAMESERVER="nameserver ${DNS1}\n"
fi
if [ -n "{DNS2}" -a "${DNS2}" != "0.0.0.0" ]; then
route add -host ${DNS2} gw ${GATEWAY_IP} ${INTERFACE}
DNS2_NAMESERVER="nameserver ${DNS2}\n"
fi
# /usr/bin/lipc-send-event com.lab126.hal wanPPPUp
echo -e "{ cmIpAddr = \"${LOCAL_IP}\",
cmDefaultGW = \"${GATEWAY_IP}\",
cmNetMask = \"255.255.255.255\",
cmResolvConf = \"${DNS1_NAMESERVER}${DNS2_NAMESERVER}${RESOLV_OPTIONS}\" }" \
| ${LIPC_HASH_PROP} com.lab126.wan cmIntfInfo
echo PPP interface up ${INTERFACE} ${TTY_DEVICE} ${SPEED} ${LOCAL_IP} ${GATEWAY_IP} > ${PPP_IP_START}
msg "PPP interface up ${INTERFACE} ${TTY_DEVICE} ${SPEED} xxx.xxx.$(echo ${LOCAL_IP} | cut -d '.' -f 3,4) xxx.xxx.$(echo ${GATEWAY_IP} | cut -d
'.' -f 3,4) nameserver xxx.xxx.$(echo ${DNS1} | cut -d '.' -f 3,4) nameserver xxx.xxx.$(echo ${DNS2} | cut -d '.' -f 3,4)" I
For now, I've just added a line that logs /etc/resolv.conf
in each loop of start.sh
, to see if that changes when things break; if not, I could also add some DNS debug commands in there too.
I'm trying out a reboot after a wakeup, to see if that fixes the network.
There then seems to be a problem with mounted files being available (which is weird as the file doing the including is
Tue Mar 14 10:23:46 GMT 2023: nslookup check is Server: 208.67.222.222
Address 1: 208.67.222.222
Tue Mar 14 10:23:46 GMT 2023: Refreshing calendar
/mnt/base-us/extensions/dashboard/bin/start.sh: source: line 26: can't open '/mnt/base-us/extensions/dashboard/bin/common.sh': No such file or directory
Trying to work out the upstart events that are triggered when available - breadcrumbs here look useful
https://www.timelesssky.com/blog/create-new-partition-on-kindle-touch
Sometimes, there's a network error.
The sleep screen freezes at a certain time.
If I press the wake button, I see the last active screen (KUAL buttons), but it doesn't respond to gestures.
I press the wake button again and the screensaver goes on.
I press it again, and I'm in normal mode.
Example logs: