bemasher / rtlamr

An rtl-sdr receiver for Itron ERT compatible smart meters operating in the 900MHz ISM band.
GNU Affero General Public License v3.0

Core dump using RPi 0 and alpine container #140

Closed petewall closed 4 years ago

petewall commented 4 years ago

Hello, I am using your great rtlamr tool to query my SDR and read my power meter. A few days ago, it stopped working, and I'm trying to figure out why.

I am using a Raspberry Pi Zero running an Alpine Docker container with your code:

FROM arm32v6/alpine:3.11.3

LABEL description="An image for running an rtlamr-based radio sensor"
LABEL maintainer="Pete Wall <pete@petewall.net>"

ARG RTLAMR_VERSION=0.9.1
ARG TIMEZONE=America/Chicago
RUN echo http://dl-cdn.alpinelinux.org/alpine/edge/testing >> /etc/apk/repositories && \
    apk update && \
    apk upgrade && \
    apk add jq mosquitto-clients rtl-sdr tzdata && \
    ln -sf /usr/share/zoneinfo/${TIMEZONE} /etc/localtime
RUN wget https://github.com/bemasher/rtlamr/releases/download/v${RTLAMR_VERSION}/rtlamr_linux_arm.tar.gz && \
    tar xzvf rtlamr_linux_arm.tar.gz -C /usr/local/bin/ rtlamr && \
    rm rtlamr_linux_arm.tar.gz && \
    chmod a+x /usr/local/bin/rtlamr

COPY ["run.sh", "scan.sh", "./"]

ENV INTERVAL=600
CMD ./run.sh

The run script scans for my power meter and returns the reading, which I then pass to my MQTT broker:

#!/bin/sh

while true; do
  echo "Starting rtl_tcp..."
  rtl_tcp &

  sleep 5

  # Sample reading: {"Time":"2019-09-21T05:07:53.209044867Z","Offset":0,"Length":0,"Type":"SCM","Message":{"ID":48017217,"Type":5,"TamperPhy":2,"TamperEnc":0,"Consumption":3840220,"ChecksumVal":52591}}
  echo "Watching for a ${DEVICE_PROTOCOL} message from device id ${DEVICE_ID}..."
  reading=$(rtlamr \
              -duration=1m \
              -single=true \
              -format=json \
              -filterid="${DEVICE_ID}" \
              -msgtype="${DEVICE_PROTOCOL}")
  if [ $? -eq 0 ]; then
    mosquitto_pub \
      -h "${MQTT_HOSTNAME}" \
      -u "${MQTT_USERNAME}" \
      -P "${MQTT_PASSWORD}" \
      -t "${MQTT_TOPIC}" \
      -m "$(echo "$reading" | jq --arg date "$(date)" '.Message.Consumption | {"date": $date, "power": .}')"
  else
    echo "Failed to find message"
  fi

  echo "Stopping rtl_tcp..."
  pkill rtl_tcp

  sleep "${INTERVAL}"
done
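
The success check in the script relies on `$?` still holding rtlamr's exit status after the command substitution. A minimal sketch of the same pattern, testing the assignment directly; `read_meter` is a hypothetical stand-in for the `rtlamr` invocation and `handle_reading` is an illustrative name, neither is part of the original script:

```sh
#!/bin/sh
# Hypothetical stand-in for the rtlamr call: prints a reading and
# exits with the status given as its first argument.
read_meter() {
  echo '{"Message":{"Consumption":3840220}}'
  return "$1"
}

# An assignment from a command substitution takes the exit status of the
# substituted command, so it can be tested directly with `if`.
handle_reading() {
  if reading=$(read_meter "$1"); then
    echo "Got reading: ${reading}"
  else
    echo "Failed to find message"
  fi
}

handle_reading 0
handle_reading 1
```

Testing the assignment directly avoids the risk of another command running between the capture and the `$?` check.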

This setup used to work, but now it core dumps:

Starting rtl_tcp...
Found 1 device(s):
  0:  Realtek, RTL2838UHIDIR, SN: 00000001
power-sensor_1              |
Using device 0: Generic RTL2832U OEM
Found Rafael Micro R820T tuner
[R82XX] PLL not locked!
Tuned to 100000000 Hz.
listening...
Watching for a scm message from device id 4___REDACTED___7...
Illegal instruction (core dumped)
Failed to find message
Stopping rtl_tcp...
Signal caught, exiting!
Use the device argument 'rtl_tcp=127.0.0.1:1234' in OsmoSDR (gr-osmosdr) source
to receive samples in GRC and control rtl_tcp parameters (frequency, gain, ...).
bye!

Can you help me determine where the issue is? I've attached the core file core.zip

bemasher commented 4 years ago

Go depends on glibc, but Alpine Linux uses musl by default. You'll need to install glibc for any of the pre-compiled binaries to work.

bemasher commented 4 years ago

You will also want to reset the password used by your MQTT client because it was exposed in the core dump you uploaded.

petewall commented 4 years ago

I changed to the golang images and now fetch rtlamr with go get:

FROM arm32v6/golang:1.13.6-alpine3.10
...
RUN echo http://dl-cdn.alpinelinux.org/alpine/edge/testing >> /etc/apk/repositories && \
    apk update && \
    apk upgrade && \
    apk add git jq libc6-compat mosquitto-clients rtl-sdr tzdata && \
    ln -sf /usr/share/zoneinfo/${TIMEZONE} /etc/localtime
RUN go get github.com/bemasher/rtlamr
...

This seems to resolve the runtime library issue:

/go # ldd /go/bin/rtlamr
    /lib/ld-musl-armhf.so.1 (0xb6ec3000)
    libc.musl-armhf.so.1 => /lib/ld-musl-armhf.so.1 (0xb6ec3000)
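
As a rough cross-check of `ldd` output like the above, the loader name shows which libc a dynamically linked binary expects. A minimal sketch that classifies `ldd` output by loader name; `libc_flavor` is a hypothetical helper, and the glibc pattern (`ld-linux`) is an assumption based on glibc's usual loader naming:

```sh
# Classify the libc a binary was linked against from its `ldd` output.
# $1: text output of `ldd <binary>`
libc_flavor() {
  case "$1" in
    *ld-musl*)  echo musl ;;
    *ld-linux*) echo glibc ;;
    *)          echo unknown ;;   # e.g. statically linked binaries
  esac
}

# With the first line of the output shown above:
libc_flavor "/lib/ld-musl-armhf.so.1 (0xb6ec3000)"
```

In practice it could be called as `libc_flavor "$(ldd /go/bin/rtlamr)"`.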

Now I get a different error when running:

Starting rtl_tcp...
Found 1 device(s):
  0:  Realtek, RTL2838UHIDIR, SN: 00000001
power-sensor_1              |
Using device 0: Generic RTL2832U OEM
Found Rafael Micro R820T tuner
[R82XX] PLL not locked!
Tuned to 100000000 Hz.
listening...
Watching for a scm message from device id 4___REDACTED___7...
14:36:19.226601 decode.go:45: CenterFreq: 912600155
14:36:19.238097 decode.go:46: SampleRate: 2359296
14:36:19.240233 decode.go:47: DataRate: 32768
14:36:19.241914 decode.go:48: ChipLength: 72
14:36:19.248947 decode.go:49: PreambleSymbols: 21
14:36:19.252359 decode.go:50: PreambleLength: 3024
14:36:19.256260 decode.go:51: PacketSymbols: 96
14:36:19.258987 decode.go:52: PacketLength: 13824
14:36:19.262292 decode.go:59: Protocols: scm
14:36:19.262606 decode.go:60: Preambles: 111110010101001100000
14:36:19.262948 main.go:119: GainCount: 29
Signal caught, exiting!
14:36:24.269044 main.go:332: read tcp 127.0.0.1:44040->127.0.0.1:1234: i/o timeout
io.ReadFull
main.(*Receiver).Run.func1
    /go/src/github.com/bemasher/rtlamr/main.go:174
runtime.goexit
    /usr/local/go/src/runtime/asm_arm.s:868
Failed to find message
Stopping rtl_tcp...

strikeir13 commented 4 years ago

I'm having the same io.ReadFull issue on a Raspberry Pi 3B+ (Buster). rtlamr will sometimes find one meter before this error occurs:

11:47:45.883263 main.go:332: read tcp 127.0.0.1:39284->127.0.0.1:1234: i/o timeout
io.ReadFull
main.(*Receiver).Run.func1
        /home/pi/go/src/github.com/bemasher/rtlamr/main.go:174
runtime.goexit
        /usr/lib/go-1.11/src/runtime/asm_arm.s:867

bemasher commented 4 years ago

@petewall I can't really speak to the RPi Zero issue. It was never a development target for rtlamr, and I would guess that it isn't quite powerful enough to run rtlamr at full bandwidth.

@strikeir13 Are you running any other processes on your pi other than the usual background stuff?

strikeir13 commented 4 years ago

@bemasher The only other process, besides those found in a typical Buster installation, is https://github.com/Will1604/infinitive, which I can't imagine is all that resource-intensive.

val123456 commented 4 years ago

I've been doing a lot of testing in Docker on RPi 3B+/4s over the past few months and have never seen that error. I'm doing it differently, though. See https://github.com/val123456/Dockerized-rtlamr-to-SQLite.

Just for grins I set up a spare 3B+ and will let it run for a while.

val123456 commented 4 years ago

> Just for grins I set up a spare 3B+ and will let it run for a while.

Running 9 days with 2.13 TB of data sent between the rtl_tcp container and the rtlamr container with no issues so far on an RPi3...

strikeir13 commented 4 years ago

> Just for grins I set up a spare 3B+ and will let it run for a while.
>
> Running 9 days with 2.13 TB of data sent between the rtl_tcp container and the rtlamr container with no issues so far on an RPi3...

@val123456 Is this in docker? I was attempting to run it via a systemd service on my RPi3B+. I will look into your docker solution and hope that I have better results.

val123456 commented 4 years ago

> @val123456 Is this in docker? I was attempting to run it via a systemd service on my RPi3B+. I will look into your docker solution and hope that I have better results.

@strikeir13 yes, in docker. Documentation is a work-in-progress, but code is here: https://github.com/val123456/Dockerized-rtlamr-to-SQLite. Uses docker-compose.

val123456 commented 4 years ago

As @bemasher states in the docs and elsewhere, running on a low-power system can be an issue.

I looked at data for my systems (one Xeon server, one RPi3B+, one RPi4), with rtlamr filtering for SCM and R900 messages, matching by ID two SCM electric meters, one SCM gas meter, and one R900 water meter, and with unique set to true. Across 5 or so random days, I get about 1400-1600 readings per day on the Xeon and RPi4 (almost exactly the same; they differ by 2-5 readings per day). The RPi3B+ was about 20% lower. All are collocated, only a few feet from the meters.

bemasher commented 4 years ago

@val123456 R900 is the only protocol whose decoder isn't quite efficient enough to run at full bandwidth on an RPi. I expect that if you limited your message types to scm or idm, you would receive all the same messages as your Xeon system.
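
Limiting the message types amounts to narrowing `-msgtype`. A minimal sketch that assembles such an invocation using only flags that appear earlier in this thread (`-msgtype`, `-filterid`, `-format`); the meter IDs are placeholders, `build_rtlamr_cmd` is a hypothetical helper, and it assumes both `-msgtype` and `-filterid` accept comma-separated lists, as rtlamr's help output describes:

```sh
# Print an rtlamr command line limited to the SCM and IDM decoders.
# Arguments: one or more meter IDs (placeholders in the call below).
build_rtlamr_cmd() {
  ids=$(printf '%s,' "$@")   # join the IDs with commas, leaving one trailing comma
  ids=${ids%,}               # drop the trailing comma
  echo "rtlamr -msgtype=scm,idm -filterid=${ids} -format=json"
}

build_rtlamr_cmd 11111111 22222222
```

The function only prints the command so it can be inspected before running it against a live rtl_tcp instance.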

bemasher commented 4 years ago

No activity, closing.