benhylau / mesh-workshop

Nodes used to facilitate workshops and demos for mesh networking
GNU General Public License v3.0

Occasional memory failures #3

Open benhylau opened 6 years ago

benhylau commented 6 years ago

Observed occasional memory failures when doing memory- or storage-intensive tasks, in this case building multiple docker images. Tried building the nodejs alpine image, which failed, then the go-ipfs image (which usually succeeds) failed as well:

fatal: 'go version' failed with output: panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x2c pc=0x13ae50]

goroutine 1 [running]:
text/template.init()
        /usr/local/go/src/text/template/exec.go:610 +0x94
go/doc.init()
        <autogenerated>:1 +0x50
go/build.init()
        <autogenerated>:1 +0x50
cmd/go/internal/cfg.init()
        <autogenerated>:1 +0x44
cmd/go/internal/base.init()
        <autogenerated>:1 +0x64
main.init()
        <autogenerated>:1 +0x5c
make: *** [mk/golang.mk:52: check_go_version] Error 1

This is only observed when building the images. Running images worked fine.

benhylau commented 6 years ago

@hamishcoleman here's another one^

hamishcoleman commented 6 years ago

Is it running out of memory? Are there any kernel error messages? Is this running on the actual mesh-orange image? Which version? Which hardware?

As I have previously mentioned, it is not really a "normal" workload to run docker on a ramdisk, so that might be affecting things too.

benhylau commented 6 years ago

This repository uses Raspberry Pi 3 only, and I used this custom version of mesh-orange, which has these extra deb packages but is otherwise pretty standard.

I had the 6 GB of swap memory on, and when this happened free still showed most of that free, and I recall df showed the rootfs was about a third full. There weren't kernel errors or other logs indicative of oom.

This never happened when running docker images, only when I tried to build a bunch of docker images. The nodejs one failed a couple of times (because I exited my sessions, etc.) and then I tried building an ipfs image (which always succeeds if I haven't tried building the nodejs one first). So this is some host memory issue and I am not sure if I can reproduce it. I'm just logging it here to see if it happens again.
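
In case it helps when this happens again, here is a minimal sketch of how the checks discussed above (free memory, swap, kernel OOM messages) could be captured in one go. It assumes a Linux host that exposes /proc/meminfo, and the function names (parse_meminfo, find_oom_lines) are hypothetical helpers, not part of mesh-orange or docker. The OOM markers are the strings the Linux kernel commonly logs, so an empty result does not prove memory was fine.

```python
import os
import re

# Substrings the Linux kernel commonly logs when the OOM killer runs.
# Assumption: this list is illustrative, not exhaustive.
OOM_MARKERS = ("Out of memory", "oom-killer")


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number}.

    Most fields are reported in kB. Fields whose names contain
    parentheses, such as Active(anon), are skipped by this simple pattern.
    """
    values = {}
    for line in text.splitlines():
        match = re.match(r"^(\w+):\s+(\d+)", line)
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def find_oom_lines(kernel_log):
    """Return the kernel log lines that contain a known OOM marker."""
    return [
        line
        for line in kernel_log.splitlines()
        if any(marker in line for marker in OOM_MARKERS)
    ]


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as handle:
        info = parse_meminfo(handle.read())
    # Print the fields most relevant to the question "is it out of memory?"
    for field in ("MemTotal", "MemAvailable", "SwapTotal", "SwapFree"):
        print(field, info.get(field), "kB")
```

The output of dmesg, saved to a file right after a failed build, could be passed to find_oom_lines to confirm whether the kernel killed anything.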