chainpoint / chainpoint-gateway

Chainpoint Gateway
https://chainpoint.org
Apache License 2.0
27 stars 12 forks source link

Add alternative Dockerfile based on Alpine #23

Closed filoozom closed 5 years ago

filoozom commented 6 years ago

This is a follow-up PR for #21 but only for Alpine. As it is a development file, I didn't add tini, but it should be as easy as running apk add --no-cache tini (it is available in the community repo: https://pkgs.alpinelinux.org/packages?name=tini&branch=v3.8).

I also changed the architecture of the Dockerfile a bit, and will open a separate PR to implement these changes in the current Dockerfile once this one is validated. Key changes:

Regarding the rocksdb build artifacts, this might not be the cleanest way to remove them. My thinking is that I can't really build it in a build container and then move the file, because its install script would still try to build it: https://github.com/Level/rocksdb/blob/master/package.json#L44.

The resulting image is now 119MB instead of 1.04GB! 😄

jacohend commented 6 years ago

@filoozom

Thanks for doing this! Always a fan of alpine or debian slim, definitely a size improvement.

I'm having a slight issue with the gosu node entrypoint. When I build with this line, I log

chainpoint-node_1 | ERROR : Unable to open database : IO error: /home/node/app/rocksdb/LOCK: Permission denied

I think maybe node doesn't have sufficient permissions relative to rocksdb to open the connection. When I drop the gosu line I have no issues. I've also checked my local rocksdb data directory volume and the permissions look ok. Does this issue look familiar? Can you replicate?

jacohend commented 6 years ago

@filoozom

I've worked past issues with gosu, looked like an issue with docker volume permissions on the local directory.

I've now run into error Command failed with signal "SIGILL". errors shortly after the node process starts. I'm wondering if this is related to the way alpine is building rocksdb. Do you see this issue on your own dev environment?

filoozom commented 6 years ago

Yes, the permission denied error is probably why you have this: https://github.com/chainpoint/chainpoint-node-src/blob/2f8a9dc67d0eee44953ed8d34cb70cb52a26293e/Makefile#L63-L66

About that SIGILL... Don't know how I didn't catch that. Initially I thought it was because I was running Alpine for arm32v6 on a arm32v7 OS on a aarch64 CPU (which should actually work), but looks like it's actually because of the rocksdb addon. I'll try to debug it.

Funnily enough, the whole thing actually works with a debug build of rocksdb, simply by compiling with node-gyp rebuild --debug.

Confirmed broken on node:8.11.3-alpine too.

filoozom commented 6 years ago

Okay, so first step to fix this: https://github.com/Level/rocksdb/pull/78.

I still get the following error after some time: https://github.com/chainpoint/chainpoint-node-src/blob/1750a50a5dd1dc25b4bb39f9790a47469d93f6e6/lib/calendar.js#L389 And then on restart: ERROR : App : Startup : Unable to read top calendar block : The "offset" argument must be of type number. Received type object. In my case, topCalendarBlockHeightValue is <Buffer 00 21 6c a3> in https://github.com/chainpoint/chainpoint-node-src/blob/e980c7903a55a0614719873c6641235dfa9431ef/lib/models/RocksDB.js#L283-L285

Doesn't the following code look odd? Works perfectly well with topBlockHeightValue.readUInt32BE(0). https://github.com/chainpoint/chainpoint-node-src/blob/e980c7903a55a0614719873c6641235dfa9431ef/lib/models/RocksDB.js#L228-L230

$ docker run --rm -it node:alpine
> let buf = Buffer.from([0x00, 0x21, 0x6c, 0xa3])
> buf.readUInt32BE(buf)
TypeError [ERR_INVALID_ARG_TYPE]: The "offset" argument must be of type number. Received type object
    at checkNumberType (internal/buffer.js:42:11)
    at Buffer.readUInt32BE (internal/buffer.js:194:3)

$ docker run --rm -it node:8-alpine
> let buf = Buffer.from([0x00, 0x21, 0x6c, 0xa3]) 
undefined
> buf.readUInt32BE(buf)
2190499

Guess that's another issue solved? Sorry if my comments aren't really structured, there are a lot of edits happening.

Maybe moving to Alpine is not the best idea, but I'll dive into this a bit more, maybe it's an easy fix. Might be a good idea to consider moving to a more recent version of rocksdb though, one that isn't two years old. 😄

filoozom commented 6 years ago

In summary:

I have a node running with https://github.com/Level/rocksdb/pull/79 and #24. As far as I can tell, it works fine, at least for the sync and to validate recent blocks. It also starts up without any issues.

Next step: test on ARM.

If I add tini back and use node:8.11.3-alpine as a base image, this might actually be a good candidate to replace the current Dockerfile altogether.

grempe commented 6 years ago

I had started some review comments yesterday but just got around to submitting. Glad to see progress being made here.

jacohend commented 6 years ago

@filoozom

We're still getting the SIGILL error after incorporating the latest from this branch and TopBlockHeightValue fix, then rebuilding with the --no-cache flag on x86_64. You say you can run it successfully, though?

filoozom commented 6 years ago

Yes, we're still waiting for my previously mentioned PR in https://github.com/level/rocksdb to get merged. In the meantime, the only way to get it to work is to install the rocksdb dependency from there: https://github.com/filoozom/rocksdb/tree/upgrade-snappy, and then to rebuild of course.

filoozom commented 6 years ago

@jacohend https://github.com/Level/rocksdb/pull/79 just got merged, rocksdb@3.0.2 was published and all of this should now work. 😉

jacohend commented 6 years ago

@filoozom Thanks! With the rocksdb merge and a clean rebuild the node now looks to be validating all the blocks without any arch-related crashes.

@grempe I think this is ready to go

filoozom commented 6 years ago

Might as well use the opportunity to move to NodeJS 10, the new LTS: https://nodejs.org/en/blog/release/v10.13.0/.

filoozom commented 5 years ago

By the way, I've been running this in production successfully on a few hundred nodes for the past month. 🙂

grempe commented 5 years ago

LGTM. I've merge it in, and we can see if anyone else wants to run this way. Thanks for all the hard work and experimentation.