rosskukulinski / kubernetes-rethinkdb-cluster

RethinkDB cluster on top of Kubernetes made easy.
MIT License

QUESTION: additional instructions on using code #21

Closed gtamir closed 8 years ago

gtamir commented 8 years ago

Hi, I have very little experience with kubernetes and I'm trying to use your code to deploy a rethinkdb instance to GKE, so sorry if I'm asking the wrong questions...

Can you please provide more details about what is required for using this setup?

For example:

  1. Do I need to use your custom image or can I use rethinkdb's generic one?
  2. If I need to use yours, do I need to push it to my private container registry in order to use it?
  3. Regarding variables that need to be updated in the yaml files. If I understand correctly I need to update the following:
    • _PODNAMESPACE - I assume this should be the namespace my other pods are using ("default" in my case)
    • _PODNAME - is this just a random name I should define?
    • _PODIP - I don't know what that is or what it should be

Anything else I need to change?

Thank you!

rosskukulinski commented 8 years ago

Hi @gtamir.

You need to either use my custom image (image tag is rosskukulinski/rethinkdb-kubernetes:2.3.4) or build the /images directory yourself and push it to your private registry. The vanilla rethinkdb image will not work.

For updating variables:

gtamir commented 8 years ago

Thank you @rosskukulinski. If I use your image, do I need to push it to the registry too, or does Kubernetes know to pull images from Docker Hub?

gtamir commented 8 years ago

In addition to my previous comment, when I try to use the quickstart.yaml I see errors in the pods logs:

Attempting to get canonical-address
Detected canonical-address: ....
Final canonical-address: ....
additional CLI flags
Checking for other nodes
My host: ....
Namespace: default
Endpont url: https://..../api/v1/namespaces/default/endpoints/rethinkdb
Looking for IPs...
jq: error: Cannot iterate over null
Cannot start in proxy mode, no ENDPOINT available

Any idea what's wrong?

rosskukulinski commented 8 years ago

Kubernetes will know to pull from docker hub.

rosskukulinski commented 8 years ago

@gtamir once the replica node comes up, do the errors in the admin/proxy pods go away?
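For context on the question above: `jq: error: Cannot iterate over null` is what jq reports when a filter iterates over a field that is missing or null. A Kubernetes Endpoints object has no `subsets` entry until at least one pod behind the service is ready, which would explain why the error is expected to clear once a replica is up. The startup script's actual jq filter is not shown in this thread, so the following is only a minimal Python sketch of the same lookup with the null case handled; the function name `endpoint_ips` is hypothetical.

```python
import json
from typing import List


def endpoint_ips(endpoints_json: str) -> List[str]:
    """Return the pod IPs listed in a Kubernetes Endpoints object.

    An Endpoints object with no ready pods has no "subsets" key (or a null
    one), so each level falls back to an empty list instead of being
    iterated blindly.
    """
    doc = json.loads(endpoints_json)
    ips = []
    for subset in doc.get("subsets") or []:
        for address in subset.get("addresses") or []:
            ips.append(address["ip"])
    return ips


# No ready replica yet: the object comes back without "subsets".
print(endpoint_ips('{"kind": "Endpoints", "metadata": {"name": "rethinkdb"}}'))

# One ready replica (IP taken from the logs in this thread).
print(endpoint_ips('{"subsets": [{"addresses": [{"ip": "10.112.0.3"}]}]}'))
```

An empty result here corresponds to the "no ENDPOINT available" case in the log: a proxy has nothing to join until a replica is listed.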

gtamir commented 8 years ago

@rosskukulinski if you mean the replica pod, then it was in status RUNNING, but now all three pods are in status CrashLoopBackOff and keep restarting

gtamir commented 8 years ago

@rosskukulinski when I run describe pod on the replica it says: "container "rethinkdb" is unhealthy, it will be killed and re-created."

chriswessels commented 8 years ago

I'm experiencing a similar issue. Attempting to start on a GKE cluster running 1.3.5. After creating the services, I created the rethinkdb-replica-1 deployment. The pod keeps restarting with the following log output:

Attempting to get canonical-address
Detected canonical-address: 10.112.0.3
Final canonical-address: 10.112.0.3
additional CLI flags --cache-size 100
Checking for other nodes
My host: 10.112.0.3
Namespace: default
Endpont url: https://10.115.240.1:443/api/v1/namespaces/default/endpoints/rethinkdb
Looking for IPs...
Start single instance
rethinkdb --canonical-address 10.112.0.3 --bind all --cache-size 100
Running rethinkdb 2.3.4~0jessie (GCC 4.9.2)...
Running on Linux 3.16.0-4-amd64 x86_64
Loading data from directory /data/rethinkdb_data
warn: Cache size is very low and may impact performance.
Listening for intracluster connections on port 29015
Listening for client driver connections on port 28015
Listening for administrative HTTP connections on port 8080
Listening on cluster addresses: 127.0.0.1, 10.112.0.3, ::1, fe80::454:3eff:fe7e:6b46%3
Listening on driver addresses: 127.0.0.1, 10.112.0.3, ::1, fe80::454:3eff:fe7e:6b46%3
Listening on http addresses: 127.0.0.1, 10.112.0.3, ::1, fe80::454:3eff:fe7e:6b46%3
Server ready, "rethinkdb_replica_1_4173252858_jdkhk_ids" 1cba88a3-7fe1-493e-b1a1-1de30db06f1b
A newer version of the RethinkDB server is available: 2.3.5. You can read the changelog at <https://github.com/rethinkdb/rethinkdb/releases>.
Server got SIGTERM from pid 0, uid 0; shutting down...
Shutting down client connections...
All client connections closed.
Shutting down storage engine... (This may take a while if you had a lot of unflushed data in the writeback cache.)
Storage engine shut down.

rosskukulinski commented 8 years ago

ahhh -- Thanks for reporting @chriswessels @gtamir! I had incorrectly pushed the 2.3.4 tag with a previous image that didn't have the readiness check script. Please try again - 2.3.4 tag now working correctly. I also added a 2.3.5 version which is being built now.
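The readiness check script itself is not shown in this thread. As a rough illustration of what such a check might do, here is a minimal sketch that tests whether the client driver port (28015, per the logs above) accepts TCP connections. The function names are hypothetical and this is not the script shipped in the image. Kubernetes exec probes treat exit code 0 as success and any other exit code as failure, so a probe command pointing at a script that does not exist in the image always fails.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


def probe_exit_code(host: str = "127.0.0.1", port: int = 28015) -> int:
    """Map the check onto the exit code convention used by exec probes."""
    return 0 if port_is_open(host, port) else 1
```

To use this as a probe command, a wrapper would end with `sys.exit(probe_exit_code())`.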

geoah commented 8 years ago

@rosskukulinski what is the current state of 2.3.4?

I was using the 2.3.4 tag just fine but things seem a bit broken now with the same configs that were working fine 30 mins ago.

Currently getting errors such as

error: Received inconsistent routing information (wrong address) from 10.3.0.45:29015 (expected_address = peer_address [10.3.0.45:29015], other_peer_addr = peer_address [10.2.58.21:29015]), closing connection.  Consider using the '--canonical-address' launch option.

Does this relate to changes in the newly pushed 2.3.4?

chriswessels commented 8 years ago

@rosskukulinski Fantastic, thanks. I've got 3 replicas running with the latest manifests and the 2.3.5 image.

However, when I attempt to schedule the rethinkdb-admin deployment, I get:

Using additional CLI flags: --cache-size 100
Pod IP: 10.112.0.4
Pod namespace: default
Using service name: rethinkdb
Using server name: rethinkdb_admin_1284164625_t2hh8
Checking for other nodes...
Using endpoints to lookup other nodes...
Endpoint url: https://10.115.240.1:443/api/v1/namespaces/default/endpoints/rethinkdb
Looking for IPs...
Found other nodes: 10.112.0.3
Starting in proxy mode
+ exec rethinkdb proxy --canonical-address 10.112.0.4 --bind all --join 10.112.0.3:29015 --cache-size 100
Error in the command line: Unrecognized option '--cache-size'.
Run 'rethinkdb help proxy' for help on the command

I guess --cache-size is irrelevant for a proxy node?

Edit: Indeed, if I remove the args section from the rethinkdb-admin.yml manifest, it starts successfully.
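The log above shows the extra flags being appended unchanged to the `rethinkdb proxy` command, which rejects `--cache-size`. One way a startup wrapper could avoid this is to drop that flag whenever it starts in proxy mode. The sketch below assumes this approach; `strip_flag` and `build_command` are hypothetical names, and this is not the image's actual entrypoint, only a reconstruction of the command lines printed in the logs.

```python
from typing import List, Optional


def strip_flag(flags: List[str], name: str) -> List[str]:
    """Remove `name` and its value, in both `--x 1` and `--x=1` forms."""
    out = []
    skip_next = False
    for token in flags:
        if skip_next:
            skip_next = False  # this token was the removed flag's value
            continue
        if token == name:
            skip_next = True
            continue
        if token.startswith(name + "="):
            continue
        out.append(token)
    return out


def build_command(pod_ip: str, extra_flags: List[str],
                  join_ip: Optional[str] = None,
                  proxy: bool = False) -> List[str]:
    """Assemble a rethinkdb command line like the ones shown in the logs."""
    cmd = ["rethinkdb"]
    if proxy:
        cmd.append("proxy")
        # Per the error in this thread, proxy mode rejects --cache-size.
        extra_flags = strip_flag(extra_flags, "--cache-size")
    cmd += ["--canonical-address", pod_ip, "--bind", "all"]
    if join_ip is not None:
        cmd += ["--join", f"{join_ip}:29015"]
    return cmd + extra_flags


print(" ".join(build_command("10.112.0.4", ["--cache-size", "100"],
                             join_ip="10.112.0.3", proxy=True)))
# prints: rethinkdb proxy --canonical-address 10.112.0.4 --bind all --join 10.112.0.3:29015
```

Removing the args section from the manifest, as described above, has the same effect without touching the startup script.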

gtamir commented 8 years ago

@rosskukulinski The quickstart.yml with 2.3.4 is working now. Going to try the persistent storage ones next. Thank you

rosskukulinski commented 8 years ago

@geoah - Could you post your deployment yaml? I thought I've had the --canonical-address flag specified for some time now.

rosskukulinski commented 8 years ago

@gtamir Good catch, sorry about that. I've updated the 2.3.4, 2.3.5, and master branches to resolve that.

gtamir commented 8 years ago

@rosskukulinski glad I could help.

BTW, should I follow @chriswessels' previous comment on 2.3.5 and remove the args section from the rethinkdb-admin.yml manifest?

Seeing same issue on both admin and proxy pods.

rosskukulinski commented 8 years ago

@gtamir - yep, remove the args section for both admin and proxy pods. Alternatively, you can do a git pull as I've updated the repo accordingly.

gtamir commented 8 years ago

@rosskukulinski Perfect. Thank you!

rosskukulinski commented 8 years ago

closing this as I believe all issues have been resolved.

chriswessels commented 8 years ago

Thanks @rosskukulinski