wpnpeiris / nats-gateway

S3 API compatible NATS Object Store

Using this to replicate blobs to Cloudflare s3 or AmazonS3 in many regions. #6

Open gedw99 opened 2 weeks ago

gedw99 commented 2 weeks ago

At the moment the system replicates the blobs to many NATS Object Stores, and imitates an S3 store. So the API is an S3 API.

I am wondering if we can use this as a way to replicate blobs to an S3 provider?

Say you have 10 servers around the world, each running NATS S3 Gateway, and any of them can take a blob mutation, which NATS then holds and replicates to different S3 storage systems.

When you want to read the blob, the S3 API is still used. All that happens is that it checks if the blob is in the NATS Object Store. If it is, it returns that; if it's not, it returns the URL to the blob in the nearest S3 store. The client then redirects to that URL.

So in a way the NATS Object Store is acting as a registry and write-mutation cache of where the blob really is in S3. If the blob has made it to an S3 store, then it's not acting as a cache, but returns the real location of the S3 blob.
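The read path sketched above could look like this tiny decision function (a runnable Python sketch of the proposed logic only; all store, bucket, and URL names are made up for illustration and are not the gateway's real API):

```python
# Sketch of the proposed read path: the NATS Object Store acts as a
# registry plus write cache. If the blob is still held locally, serve
# it; otherwise answer with the real S3 location for the client to
# redirect to. Plain dicts stand in for the stores here.

def resolve_read(key, nats_store, s3_locations):
    """Return ("blob", data) if cached in NATS, else ("redirect", url)."""
    if key in nats_store:
        return ("blob", nats_store[key])
    # Registry lookup: where does the blob really live in S3?
    return ("redirect", s3_locations[key])

# Example: one blob still held locally, one already replicated out.
nats_store = {"a.txt": b"hello"}
s3_locations = {"b.txt": "https://s3.example.com/bucket1/b.txt"}

print(resolve_read("a.txt", nats_store, s3_locations))  # served from NATS
print(resolve_read("b.txt", nats_store, s3_locations))  # client follows this URL
```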

wpnpeiris commented 2 weeks ago

@gedw99 I see this can be achieved with a tool like rclone. Basically, we just need to configure nats-gateway as a destination in rclone. Then we can use an rclone command like sync to replicate buckets with any S3 provider. It should also work the other way round: any S3 object storage bucket can also be replicated into the NATS Object Store.

For example, here I have tried to replicate a NATS bucket with SeaweedFS. Here is the rclone configuration I used.

[seaweedfs]
type = s3
provider = SeaweedFS
access_key_id = xxxxx
secret_access_key = yyyyy
endpoint = http://127.0.0.1:8333
acl = private
region = us-east-1
location_constraint = 1

[nats]
type = s3
provider = Other
access_key_id = na
secret_access_key = na
endpoint = http://127.0.0.1:5222
acl = private
region = us-east-1
location_constraint = 1

And the rclone sync command

rclone sync nats:bucket1 seaweedfs:bucket1

gedw99 commented 2 weeks ago

That makes sense regarding rclone.

I guess if this was also something you needed, then we would want to make this all real time, rather than a manual thing using the rclone CLI?
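A naive sketch of what the real-time variant might do on each change (pure-Python stand-in stores; a real version would subscribe to NATS Object Store watch events and shell out to rclone, or use the S3 API directly):

```python
# Hypothetical incremental replication step: on each change notification,
# copy only the keys that are missing or stale in the destination.
# Dicts stand in for the NATS bucket and the remote S3 bucket.

def sync_once(source, dest):
    """Copy keys present in source but missing or changed in dest."""
    copied = []
    for key, data in source.items():
        if dest.get(key) != data:
            dest[key] = data
            copied.append(key)
    return copied

src = {"file1.txt": b"v1", "file2.txt": b"v2"}
dst = {"file1.txt": b"v1"}
print(sync_once(src, dst))  # only the new key needs copying
```

Wiring this to real change events rather than polling is the part that would make it "real time" instead of a cron-style rclone sync.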

gedw99 commented 1 week ago

I got this working.

Thanks for the example config.

I have a Makefile and Dockerfile that automate all this. If you want it as a PR, just let me know.

wpnpeiris commented 1 week ago

> I got this working.
>
> Thanks for the example config.
>
> I have a Makefile and Dockerfile that automate all this. If you want it as a PR, just let me know.

Yes @gedw99, please open a PR. I would like to have those changes.

gedw99 commented 1 week ago

This is 100% encapsulated, with all dependencies also isolated. It works on all desktops and servers. You just need make and golang.

Runner is basically Docker Compose without Docker. I use it for dev and prod on Fly.

It's about 80% finished.

.env file


NATS_SERVER_URL=127.0.0.1:4222

CF_DOMAIN=domain.com
CF_SUBDOMAIN=nats-gateway.domain.com
CF_API_EMAIL=xxx@gmail.com
CF_ACCOUNT_ID=XXX
CF_ZONE_ID=xxx

Makefile


# https://github.com/wpnpeiris/nats-gateway

NAME=nats-gateway

BASE_OS_NAME := $(shell go env GOOS)
BASE_OS_ARCH := $(shell go env GOARCH)

#bin folder
BIN_ROOT=$(PWD)/.bin

# data folder
DATA_ROOT :=$(PWD)/.data

# environment
export PATH :=$(PATH):$(BIN_ROOT)
-include .env

print:
    @echo ""
    @echo "-- base --"
    @echo "NAME:                   $(NAME)"
    @echo ""
    @echo "BASE_OS_NAME:           $(BASE_OS_NAME)"
    @echo "BASE_OS_ARCH:           $(BASE_OS_ARCH)"

    @echo ""
    @echo "-- deps --"
    @echo "DEP_GO:                 $(DEP_GO_WHICH)"
    @echo "DEP_S3MANAGER:          $(DEP_S3MANAGER_WHICH)"
    @echo "aws:                    $(shell which aws)"
    @echo "DEP_NATS_SERVER:        $(DEP_NATS_SERVER_WHICH)"
    @echo "DEP_NATS_CLI:           $(DEP_NATS_CLI_WHICH)"
    @echo "DEP_RCLONE_CTL:         $(DEP_RCLONE_CTL_WHICH)"

    @echo "DEP_RUNNER:             $(DEP_RUNNER_WHICH)"
    @echo "DEP_CLOUDFLARE_CTL:     $(DEP_CLOUDFLARE_CTL_WHICH)"
    @echo "DEP_FLY_CTL:            $(DEP_FLY_CTL_WHICH)"

    @echo ""
    @echo "-- env --"
    @echo "SRC_VERSION:           $(SRC_VERSION)"
    @echo "FLY_NAME:              $(FLY_NAME)"

    @echo "CF_DOMAIN:             $(CF_DOMAIN)"
    @echo "CF_API_EMAIL:          $(CF_API_EMAIL)"

    @echo "CF_API_TOKEN:          $(CF_API_TOKEN)"

    @echo "CF_ZONE_ID:            $(CF_ZONE_ID)"
    @echo "CF_ACCOUNT_ID:         $(CF_ACCOUNT_ID)"

all: dep dep-fly src bin
    # Now use "make runner-run" to run all the processes in the right order

DEP_GO=go
DEP_GO_WHICH=$(shell command -v $(DEP_GO))
ifeq ($(BASE_OS_NAME),windows)
    DEP_GO=go.exe
endif

DEP_S3MANAGER=s3manager
DEP_S3MANAGER_WHICH=$(shell command -v $(DEP_S3MANAGER))
ifeq ($(BASE_OS_NAME),windows)
    DEP_S3MANAGER=s3manager.exe
endif

DEP_NATS_SERVER=nats-server
DEP_NATS_SERVER_WHICH=$(shell command -v $(DEP_NATS_SERVER))
ifeq ($(BASE_OS_NAME),windows)
    DEP_NATS_SERVER=nats-server.exe
endif

DEP_NATS_CLI=nats
DEP_NATS_CLI_WHICH=$(shell command -v $(DEP_NATS_CLI))
ifeq ($(BASE_OS_NAME),windows)
    DEP_NATS_CLI=nats.exe
endif

DEP_RCLONE_CTL=rclone
DEP_RCLONE_CTL_WHICH=$(shell command -v $(DEP_RCLONE_CTL))
ifeq ($(BASE_OS_NAME),windows)
    DEP_RCLONE_CTL=rclone.exe
endif

DEP_RUNNER=runner
DEP_RUNNER_WHICH=$(shell command -v $(DEP_RUNNER))
ifeq ($(BASE_OS_NAME),windows)
    DEP_RUNNER=runner.exe
endif

DEP_CLOUDFLARE_CTL=flarectl
DEP_CLOUDFLARE_CTL_WHICH=$(shell command -v $(DEP_CLOUDFLARE_CTL))
ifeq ($(BASE_OS_NAME),windows)
    DEP_CLOUDFLARE_CTL=flarectl.exe
endif

DEP_FLY_CTL=flyctl
DEP_FLY_CTL_WHICH=$(shell command -v $(DEP_FLY_CTL))
ifeq ($(BASE_OS_NAME),windows)
    DEP_FLY_CTL=flyctl.exe
endif

dep: bin-init

    # go-mod-upgrade to easily upgrade go modules
    # https://github.com/oligot/go-mod-upgrade/releases/tag/v0.10.0
    go install github.com/oligot/go-mod-upgrade@v0.10.0

    # TODO: github cli, so we can easily test upgrade cycles.

    # aws cli. TODO: switch based on OS
    # TODO: find pure golang that meets our needs as its easier.
    brew install awscli

    # s3 manager
    # https://github.com/cloudlena/s3manager/tags
    $(DEP_GO) install github.com/cloudlena/s3manager@latest
    mv $(GOPATH)/bin/$(DEP_S3MANAGER) $(BIN_ROOT)/$(DEP_S3MANAGER)
    rm -f $(GOPATH)/bin/$(DEP_S3MANAGER)

    # nats-server
    # https://github.com/nats-io/nats-server/releases/tag/v2.10.16
    $(DEP_GO) install github.com/nats-io/nats-server/v2@v2.10.16
    mv $(GOPATH)/bin/$(DEP_NATS_SERVER) $(BIN_ROOT)/$(DEP_NATS_SERVER)
    rm -f $(GOPATH)/bin/$(DEP_NATS_SERVER)

    # nats cli
    # https://github.com/nats-io/natscli/releases/tag/v0.1.4
    $(DEP_GO) install github.com/nats-io/natscli/nats@v0.1.4
    mv $(GOPATH)/bin/$(DEP_NATS_CLI) $(BIN_ROOT)/$(DEP_NATS_CLI)
    rm -f $(GOPATH)/bin/$(DEP_NATS_CLI)

    # runner
    # https://github.com/cirello-io/runner/releases/tag/v1.2.2
    $(DEP_GO) install cirello.io/runner@v1.2.2
    mv $(GOPATH)/bin/$(DEP_RUNNER) $(BIN_ROOT)/$(DEP_RUNNER)
    rm -f $(GOPATH)/bin/$(DEP_RUNNER)

    # cloudflare
    # https://github.com/cloudflare/cloudflare-go/releases/tag/v0.97.0
    $(DEP_GO) install github.com/cloudflare/cloudflare-go/cmd/flarectl@v0.97.0
    mv $(GOPATH)/bin/$(DEP_CLOUDFLARE_CTL) $(BIN_ROOT)/$(DEP_CLOUDFLARE_CTL)
    rm -f $(GOPATH)/bin/$(DEP_CLOUDFLARE_CTL)

    # rclone
    # https://github.com/rclone/rclone/releases/tag/v1.67.0
    $(DEP_GO) install github.com/rclone/rclone@v1.67.0
    mv $(GOPATH)/bin/$(DEP_RCLONE_CTL) $(BIN_ROOT)/$(DEP_RCLONE_CTL)
    rm -f $(GOPATH)/bin/$(DEP_RCLONE_CTL)

dep-fly: bin-init

    # fly ( has to be git cloned due to a go.mod replace )
    # https://github.com/superfly/flyctl/releases/tag/v0.2.65
    git clone https://github.com/superfly/flyctl -b v0.2.65
    @echo $(DEP_FLY_CTL) >> .gitignore
    touch go.work
    $(DEP_GO) work use $(DEP_FLY_CTL)
    cd flyctl && go build -o $(BIN_ROOT)/$(DEP_FLY_CTL)
    rm -rf flyctl

    rm -f go.work

    $(MAKE) src-init-post

SRC_VERSION=main
SRC_CMD=cd $(PWD)/$(NAME) &&

src-init-post:
    @echo $(NAME) >> .gitignore
    touch go.work
    $(DEP_GO) work use $(NAME)
src-del:
    rm -rf $(NAME)
src: src-del
    git clone https://github.com/wpnpeiris/nats-gateway -b $(SRC_VERSION)
    $(MAKE) src-init-post
src-fork: src-del
    git clone https://github.com/gedw99/nats-gateway -b $(SRC_VERSION)
    $(MAKE) src-init-post
src-status:
    $(SRC_CMD) git status
    $(SRC_CMD) git branch -a
src-update:
    $(SRC_CMD) git  pull
SRC_BRANCH=gedw99-fix
src-branch:
    $(SRC_CMD) git branch $(SRC_BRANCH)
    $(SRC_CMD) git checkout $(SRC_BRANCH)

BIN_CMD=cd $(PWD)/$(NAME) &&

bin: bin-init bin-native bin-darwin bin-linux bin-windows
bin-init:
    mkdir -p $(BIN_ROOT)
bin-del:
    rm -rf $(BIN_ROOT)
bin-native: bin-init
    $(BIN_CMD) go build -o $(BIN_ROOT)/$(NAME)
bin-darwin:
    $(BIN_CMD) GOOS=darwin GOARCH=amd64 go build -o $(BIN_ROOT)/$(NAME)_darwin_amd64
    $(BIN_CMD) GOOS=darwin GOARCH=arm64 go build -o $(BIN_ROOT)/$(NAME)_darwin_arm64
bin-linux:
    $(BIN_CMD) GOOS=linux GOARCH=amd64 go build -o $(BIN_ROOT)/$(NAME)_linux_amd64
    $(BIN_CMD) GOOS=linux GOARCH=arm64 go build -o $(BIN_ROOT)/$(NAME)_linux_arm64
bin-windows:
    $(BIN_CMD) GOOS=windows GOARCH=amd64 go build -o $(BIN_ROOT)/$(NAME)_windows_amd64
    $(BIN_CMD) GOOS=windows GOARCH=arm64 go build -o $(BIN_ROOT)/$(NAME)_windows_arm64

SERVER_LISTEN=127.0.0.1:8080
SERVER_URL=http://$(SERVER_LISTEN)

main-run:
    # works :)
    # These values will go into the .env, that is used by the docker on fly.
    $(NAME)_$(BASE_OS_NAME)_$(BASE_OS_ARCH) -listen $(SERVER_LISTEN) --natsServers $(NATS_URL) --natsUser $(NATS_USER) --natsPassword $(NATS_PASSWORD) 
    # http://localhost:8080

runner-run:
    $(DEP_RUNNER)
    # http://127.0.0.1:64000

export ACCESS_KEY_ID=111
export SECRET_ACCESS_KEY=111

s3manager-run:
    $(DEP_S3MANAGER)
    # http://127.0.0.1:8080

NATS_SERVER_CMD=$(DEP_NATS_SERVER) --config nats.conf
NATS_CLI_CMD=$(DEP_NATS_CLI) --server $(NATS_URL) --user $(NATS_USER) --password $(NATS_PASSWORD)

# nats env
NATS_LISTEN_HOST=127.0.0.1
NATS_LISTEN_PORT=4222
NATS_URL=nats://$(NATS_LISTEN_HOST):$(NATS_LISTEN_PORT)
NATS_USER=user
NATS_PASSWORD=passwordpasswordpasswordpassword
NATS_PASSWORD_BCRYPT=$$2a$$11$$lfIElYSRpcRnutp02RkGBui/0CcB8TepPj5A50dx4Ibsp2DuU4R7G

nats-server-run:
    # see: https://github.com/nats-io/nats-server/tree/main/server/configs
    # nats-server --jetstream --store_dir $(DATA_ROOT) --net $(NATS_LISTEN_HOST) --port $(NATS_LISTEN_PORT) --user $(NATS_USER) --pass $(NATS_PASSWORD)
    $(NATS_SERVER_CMD)
nats-server-reload:
    $(NATS_SERVER_CMD) --signal reload

nats-cli-password-create:
    nats server passwd --pass $(NATS_PASSWORD)
    # copy the result into the nats.conf
nats-cli-stream-create:
    nats stream add -h
    $(NATS_CLI_CMD) stream ls
    $(NATS_CLI_CMD) stream add $(TEST_BUCKET)
nats-stream-delete:
    $(NATS_CLI_CMD) stream ls
    $(NATS_CLI_CMD) stream rm $(TEST_BUCKET)

RCLONE_URL=?
RCLONE_CMD=$(DEP_RCLONE_CTL) --config $(PWD)/rclone.conf

rclone-run:
    $(RCLONE_CMD) -h
    $(RCLONE_CMD) serve http

    # $(DEP_RCLONE_CTL) sync nats:bucket1 seaweedfs:bucket1
rclone-config:
    $(RCLONE_CMD) config file
    $(RCLONE_CMD) config show

# When using Linux or macOS
export AWS_REGION=us-east-1

TEST_BUCKET=bucket1

test:
    # List NATS Buckets
    aws s3 ls --endpoint-url=$(SERVER_URL)

    # Create a bucket
    aws s3 mb s3://$(TEST_BUCKET) --endpoint-url=$(SERVER_URL)

    # List content of NATS Bucket, bucket1
    aws s3 ls s3://$(TEST_BUCKET) --endpoint-url=$(SERVER_URL)

    # Upload an object to a NATS bucket
    #aws s3 cp file1.txt s3://$(TEST_BUCKET) --endpoint-url=$(SERVER_URL)

    # Download an object from a NATS bucket
    #aws s3 cp s3://$(TEST_BUCKET)/file1.txt file1_copy.txt --endpoint-url=$(SERVER_URL)

FLY_NAME=gedw99-$(NAME)

fly-h:
    fly -h
fly-ext:
    fly ext 
fly-init:
    # assumes you have a custom fly.toml, where you changed the region.
    cp fly.toml $(NAME)/fly.toml
fly-launch: fly-init
    # works :)
    cd $(NAME) && fly launch
    # yes, then no.

    # https://gedw99-datastar.fly.dev
    # https://gedw99-datastar.fly.dev/examples/click_to_edit

    # https://fly.io/apps/gedw99-datastar

fly-deploy: fly-init
    # not used !!!
    cd $(NAME) && fly deploy --local-only

fly-delete:
    cd $(NAME) && fly apps destroy $(FLY_NAME)

fly-scale-h:

    cd $(NAME) && fly scale -h
fly-scale-list:
    cd $(NAME) && fly scale show
fly-scale-lax-0:
    cd $(NAME) && fly scale count 0 --region lax
fly-scale-lax-1:
    cd $(NAME) && fly scale count 1 --region lax
fly-scale-lax-3:
    cd $(NAME) && fly scale count 3 --region lax
fly-scale-sydney-0:
    cd $(NAME) && fly scale count 0 --region syd
fly-scale-sydney-1:
    cd $(NAME) && fly scale count 1 --region syd

fly-domain:
    #cd $(NAME) && fly -h
    cd $(NAME) && fly ip list --json

# https://fly.io/docs/flyctl/consul/
# ENV: FLY_CONSUL_URL
fly-consul-attach:
    cd $(NAME) && fly consul attach
fly-consul-attach-del:
    cd $(NAME) && fly consul detach

fly-s3:
    # where to put image blobs ? Fly lsvd mapped to a CF R2 ?
    # run Rclone to replicate between zones, or NATS S3 gateway
    # This needs to work independent of Volumes and ideally on R2
    # R2 does not do multi region, so we need a replication system like Rclone or NATS S3 - https://github.com/wpnpeiris/nats-gateway

    cd $(NAME) && fly volumes lsvd setup -a $(FLY_NAME)

CF_CMD=$(DEP_CLOUDFLARE_CTL) --account-id $(CF_ACCOUNT_ID)
# CF_ACCOUNT_ID

cf:
    # https://www.mkartchner.com/blog/hosting-flyio-cloudflare-dns
    $(CF_CMD) -h
cf-user:
    $(CF_CMD) user -h
    $(CF_CMD) user info
    #$(CF_CMD) user update
cf-dns:
    $(CF_CMD) dns -h
    $(CF_CMD) dns list $(CF_DOMAIN)
cf-zone:
    $(CF_CMD) zone -h
    $(CF_CMD) zone list
cf-tunnel:
    $(CF_CMD) -h

nats.conf


listen: 127.0.0.1:4222

jetstream {
    store_dir: "./.data/jetstream"
    max_mem: 1G
    max_file: 100G
}

authorization: {
    users: [
        {user: user, password: "$2a$11$lfIElYSRpcRnutp02RkGBui/0CcB8TepPj5A50dx4Ibsp2DuU4R7G"},
    ]
}

Procfile


workdir: .
observe: *.go *.js
ignore: /vendor
build-server: make bin-native
nats: group=nats restart=always make nats-server-run 
server: group=server restart=always waitfor=127.0.0.1:4222 make main-run

I need to make a Docker image that can still include all the binaries and the Procfile. The Docker image then starts Runner, which in turn starts the binaries listed in the Procfile. This will start all the binaries, waiting in the right order, just like Docker Compose does.
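For what it's worth, a minimal sketch of what that Dockerfile might look like (all paths here are assumptions based on the Makefile above; the image just ships the prebuilt binaries plus the Procfile and lets Runner do the ordering):

```dockerfile
# Hypothetical sketch only: assumes the binaries were built on the host
# with `make all` and landed in .bin/ next to the Procfile and nats.conf.
FROM debian:bookworm-slim
WORKDIR /app
COPY .bin/ /app/.bin/
COPY Procfile nats.conf /app/
ENV PATH="/app/.bin:${PATH}"
# runner reads the Procfile and starts nats-server, then the gateway,
# honoring waitfor= ordering, much like Docker Compose's depends_on.
CMD ["runner"]
```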

Regarding the .env file: I probably need to use Consul or NATS Kine instead for env variables and service discovery, so that I can maintain services that rely on other services across a global cluster. This is a good thing, because then secrets and service discovery work the same locally as for a global cluster. It's just a matter of running Kine in 3 regions that form a cluster thanks to NATS JetStream. I don't think more would be needed, but I'm not sure yet. https://github.com/k3s-io/kine/blob/master/examples/nats.md

wpnpeiris commented 1 week ago

@gedw99 looks like you have made good progress. Looking forward to the PR. :-)

gedw99 commented 1 week ago

Getting there :)

Will let you know.