Closed DrissiReda closed 3 years ago
cc @jsanda @emerkle826
Hi @DrissiReda,
It is by design that nodetool is not in the image used to run the gRPC service. When Medusa runs inside Kubernetes, it won't have access to nodetool. Instead, it makes REST API requests to the management-api (used by cass-operator) or to Jolokia to create snapshots.
The image is really only intended for use inside of Kubernetes. I would not expect it to work outside of a Kubernetes deployment.
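For readers unfamiliar with Jolokia, here is a rough sketch of what a snapshot request over its JSON/HTTP bridge could look like. This is not Medusa's actual code: the MBean name and operation signature are assumptions based on Cassandra's StorageService JMX interface, and `build_snapshot_request` and `post_to_jolokia` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: the MBean name and operation signature below follow Cassandra's
# StorageService JMX interface; they are illustrative, not copied from Medusa.
STORAGE_SERVICE_MBEAN = "org.apache.cassandra.db:type=StorageService"
TAKE_SNAPSHOT_OP = "takeSnapshot(java.lang.String,java.util.Map,[Ljava.lang.String;)"


def build_snapshot_request(tag):
    """Return a Jolokia 'exec' body with an empty options map and no keyspace list."""
    return {
        "type": "exec",
        "mbean": STORAGE_SERVICE_MBEAN,
        "operation": TAKE_SNAPSHOT_OP,
        "arguments": [tag, {}, []],
    }


def post_to_jolokia(url, payload):
    """POST the JSON payload to a Jolokia endpoint and decode the JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    body = build_snapshot_request("medusa-example-backup")
    print(json.dumps(body, indent=2))
    # To send it for real (requires a reachable Jolokia agent):
    # post_to_jolokia("http://localhost:7373/jolokia", body)
```

The point is only that the snapshot is triggered by an HTTP POST to the agent rather than by a local nodetool binary, which is why the image can ship without nodetool.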
Thanks @jsanda
I’m trying to deploy it in a kubernetes cluster but I can’t, which is why I’m trying to debug it with docker.
I've seen Jolokia in the deployment, and I even have it in the config file, but it seems to be ignored. For example, I can't see it in the CassandraDataCenter CRD log:
I can't see any call to localhost:7373/jolokia in any of the logs.
If you need more logs:
A simple curl localhost:7373/jolokia/ from inside medusa container returns:
{"request":{"type":"version"},"value":{"agent":"1.6.2","protocol":"7.2","config":{"listenForHttpService":"true","maxCollectionSize":"0","authIgnoreCerts":"false","agentId":"10.244.0.86-194-3b9a45b3-jvm","debug":"false","agentType":"jvm","policyLocation":"classpath:\/jolokia-access.xml","agentContext":"\/jolokia","serializeException":"false","mimeType":"text\/plain","maxDepth":"15","authMode":"basic","authMatch":"any","discoveryEnabled":"true","streaming":"true","canonicalNaming":"true","historyMaxEntries":"10","allowErrorDetails":"true","allowDnsReverseLookup":"true","realm":"jolokia","includeStackTrace":"true","maxObjects":"0","useRestrictorService":"false","debugMaxEntries":"100"},"info":{}},"timestamp":1609855902,"status":200}
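If it helps anyone scripting this check, a minimal sketch of validating that kind of version reply in Python follows. The helper name `jolokia_is_healthy` is hypothetical, and the sample body is a trimmed copy of the response above rather than fresh output.

```python
import json


def jolokia_is_healthy(body):
    """Return True when a Jolokia version reply reports status 200 and an agent version."""
    reply = json.loads(body)
    return reply.get("status") == 200 and "agent" in reply.get("value", {})


# Trimmed copy of the response shown above (config details omitted).
sample = (
    '{"request":{"type":"version"},'
    '"value":{"agent":"1.6.2","protocol":"7.2"},'
    '"timestamp":1609855902,"status":200}'
)

print(jolokia_is_healthy(sample))  # True
```

A healthy reply only shows that the agent is reachable from the Medusa container; it says nothing about whether Medusa is configured to use it.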
What does your Medusa config look like? You need to have the following in your config in order for Medusa to make the REST API calls instead of trying to invoke nodetool:
[kubernetes]
enabled = 1
> I’m trying to deploy it in a kubernetes cluster but I can’t
I am happy to help in any way I can :) If you would like to chat over Slack, feel free to ping me on Apache Slack in either the #cassandra-kubernetes or #cassandra-medusa channels.
Oh, thank you very much! The default config map from the Helm chart doesn't include the bit you just shared.
my medusa config is included in the first post:
[storage]
storage_provider = minio
bucket_name = cass
host = s3.tdf
region = us-east-1
secure = false
key_file = /etc/medusa/key.txt
max_backup_age = 0
max_backup_count = 0
transfer_max_bandwidth = 50MB/s
concurrent_transfers = 1
multi_part_upload_threshold = 104857600
[grpc]
enabled = 1
cassandra_url = http://localhost:7373/jolokia
[logging]
level = DEBUG
; now it includes
[kubernetes]
enabled = 1
Adding the [kubernetes] section got me past the previous point, but now I have another error:
requests.exceptions.MissingSchema: Invalid URL 'None': No schema supplied. Perhaps you meant http://None?
From the Medusa logs
File "/app/medusa/cassandra_utils.py", line 475, in __do_post
response = requests.post(self.kubernetes_config.cassandra_url, data=json_data)
I think the problem is that the cassandra_url configuration needs to be moved from the [grpc] section to the [kubernetes] section of your medusa.ini. So like this:
[grpc]
enabled = 1
[logging]
level = DEBUG
; now it includes
[kubernetes]
enabled = 1
cassandra_url = http://localhost:7373/jolokia
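To illustrate why the section matters, here is a small sketch using Python's standard configparser. It is not Medusa's real config loader, and `kubernetes_cassandra_url` is a hypothetical helper; it only shows that a key placed under [grpc] is invisible when looked up under [kubernetes], which is consistent with the 'None' URL in the error above.

```python
import configparser

# The config as originally posted: cassandra_url sits under [grpc].
BROKEN_INI = """
[grpc]
enabled = 1
cassandra_url = http://localhost:7373/jolokia

[kubernetes]
enabled = 1
"""

# The suggested layout: cassandra_url moved under [kubernetes].
FIXED_INI = """
[grpc]
enabled = 1

[kubernetes]
enabled = 1
cassandra_url = http://localhost:7373/jolokia
"""


def kubernetes_cassandra_url(ini_text):
    """Look up cassandra_url in the [kubernetes] section, or None if it is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.get("kubernetes", "cassandra_url", fallback=None)


print(kubernetes_cassandra_url(BROKEN_INI))  # None
print(kubernetes_cassandra_url(FIXED_INI))   # http://localhost:7373/jolokia
```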
Thanks, adding the following to medusa.ini solved the problem completely:
[kubernetes]
enabled = 1
cassandra_url = http://localhost:7373/jolokia
Running normally in docker yields the error:
Docker command:
medusa.ini
```config
[storage]
storage_provider = minio
bucket_name = cass
host = s3.tdf
region = us-east-1
secure = false
key_file = /etc/medusa/key.txt
max_backup_age = 0
max_backup_count = 0
transfer_max_bandwidth = 50MB/s
concurrent_transfers = 1
multi_part_upload_threshold = 104857600
[grpc]
enabled = 1
cassandra_url = http://localhost:7373/jolokia
[logging]
level = DEBUG
```
Full log
```
MEDUSA_MODE = GRPC
sleeping for 0 sec
Starting Medusa gRPC service
DEBUG:root:sleeping for 0 sec
INFO:root:Init service
[2021-01-05 11:56:34,483] INFO: Init service
DEBUG:root:Reading AWS credentials from /etc/medusa/key.txt
INFO:root:Starting server. Listening on port 50051.
[2021-01-05 11:56:35,062] INFO: Starting server. Listening on port 50051.
INFO:root:Performing backup test-dc1-2021-01-05 12:56:55.847591
[2021-01-05 11:56:55,853] INFO: Performing backup test-dc1-2021-01-05 12:56:55.847591
INFO:root:Monitoring provider is noop
[2021-01-05 11:56:55,853] INFO: Monitoring provider is noop
DEBUG:root:Reading AWS credentials from /etc/medusa/key.txt
DEBUG:root:This server has systemd: False
WARNING:root:is ccm : 0
[2021-01-05 11:56:56,186] WARNING: is ccm : 0
DEBUG:root:Blob dataiku.localhost/test-dc1-2021-01-05 12:56:55.847591/meta/schema.cql was not found in cache.
DEBUG:root:[Storage] Getting object dataiku.localhost/test-dc1-2021-01-05 12:56:55.847591/meta/schema.cql
DEBUG:root:Process psutil.Process(pid=9, name='python3', status='sleeping', started='11:56:32') was set to use only idle IO and CPU resources
INFO:root:Saving tokenmap and schema
[2021-01-05 11:56:56,300] INFO: Saving tokenmap and schema
WARNING:cassandra.cluster:Downgrading core protocol version from 66 to 65 for 127.0.0.1:9042. To avoid this, it is best practice to explicitly set Cluster(protocol_version) to the version supported by your cluster. http://datastax.github.io/python-driver/api/cassandra/cluster.html#cassandra.cluster.Cluster.protocol_version
[2021-01-05 11:56:56,306] WARNING: Downgrading core protocol version from 66 to 65 for 127.0.0.1:9042. To avoid this, it is best practice to explicitly set Cluster(protocol_version) to the version supported by your cluster. http://datastax.github.io/python-driver/api/cassandra/cluster.html#cassandra.cluster.Cluster.protocol_version
WARNING:cassandra.cluster:Downgrading core protocol version from 65 to 4 for 127.0.0.1:9042. To avoid this, it is best practice to explicitly set Cluster(protocol_version) to the version supported by your cluster. http://datastax.github.io/python-driver/api/cassandra/cluster.html#cassandra.cluster.Cluster.protocol_version
[2021-01-05 11:56:56,308] WARNING: Downgrading core protocol version from 65 to 4 for 127.0.0.1:9042. To avoid this, it is best practice to explicitly set Cluster(protocol_version) to the version supported by your cluster. http://datastax.github.io/python-driver/api/cassandra/cluster.html#cassandra.cluster.Cluster.protocol_version
DEBUG:root:Checking datacenter...
DEBUG:root:Resolved 127.0.0.1 to dataiku.localhost
DEBUG:root:Checking host 127.0.0.1 against 127.0.0.1/dataiku.localhost
DEBUG:root:Resolved 127.0.0.1 to dataiku.localhost
DEBUG:root:Blob dataiku.localhost/test-dc1-2021-01-05 12:56:55.847591/meta/tokenmap.json was not found in cache.
DEBUG:root:[Storage] Getting object dataiku.localhost/test-dc1-2021-01-05 12:56:55.847591/meta/tokenmap.json
DEBUG:root:[Storage] Reading blob dataiku.localhost/test-dc1-2021-01-05 12:56:55.847591/meta/tokenmap.json...
DEBUG:root:[Storage] Reading blob dataiku.localhost/test-dc1-2021-01-05 12:56:55.847591/meta/tokenmap.json...
DEBUG:root:[Storage] Getting object dataiku.localhost/test-dc1-2021-01-05 12:56:55.847591/meta/schema.cql
DEBUG:root:[Storage] Reading blob dataiku.localhost/test-dc1-2021-01-05 12:56:55.847591/meta/schema.cql...
DEBUG:root:[Storage] Reading blob dataiku.localhost/test-dc1-2021-01-05 12:56:55.847591/meta/schema.cql...
DEBUG:root:Blob dataiku.localhost/test-dc1-2021-01-05 12:56:55.847591/meta/schema.cql was not found in cache.
DEBUG:root:[Storage] Getting object dataiku.localhost/test-dc1-2021-01-05 12:56:55.847591/meta/schema.cql
DEBUG:root:Blob dataiku.localhost/test-dc1-2021-01-05 12:56:55.847591/meta/schema.cql last modification time is Tue, 05 Jan 2021 11:56:56 GMT
DEBUG:root:[Storage] Getting object index/latest_backup/dataiku.localhost/backup_name.txt
DEBUG:root:[Storage] Reading blob index/latest_backup/dataiku.localhost/backup_name.txt...
DEBUG:root:[Storage] Reading blob index/latest_backup/dataiku.localhost/backup_name.txt...
DEBUG:root:[Storage] Getting object dataiku.localhost/2020121514/meta/differential
DEBUG:root:[Storage] Getting object dataiku.localhost/2020121514/meta/incremental
DEBUG:root:Blob dataiku.localhost/2020121514/meta/schema.cql was not found in cache.
DEBUG:root:[Storage] Getting object dataiku.localhost/2020121514/meta/schema.cql
DEBUG:root:[Storage] Getting object dataiku.localhost/2020121514/meta/manifest.json
DEBUG:root:[Storage] Reading blob dataiku.localhost/2020121514/meta/manifest.json...
DEBUG:root:[Storage] Reading blob dataiku.localhost/2020121514/meta/manifest.json...
INFO:root:Starting backup
[2021-01-05 11:56:56,526] INFO: Starting backup
INFO:root:Creating snapshot
[2021-01-05 11:56:56,526] INFO: Creating snapshot
DEBUG:root:Executing: nodetool snapshot -t medusa-test-dc1-2021-01-05 12:56:55.847591
You must set the CASSANDRA_CONF and CLASSPATH vars
ERROR:root:This error happened during the backup: Command '['nodetool', 'snapshot', '-t', 'medusa-test-dc1-2021-01-05 12:56:55.847591']' returned non-zero exit status 1.
[2021-01-05 11:56:56,536] ERROR: This error happened during the backup: Command '['nodetool', 'snapshot', '-t', 'medusa-test-dc1-2021-01-05 12:56:55.847591']' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/app/medusa/backup_node.py", line 207, in main
    cassandra, node_backup, storage, differential_mode, config, backup_name)
  File "/app/medusa/backup_node.py", line 254, in do_backup
    with cassandra.create_snapshot(backup_name) as snapshot:
  File "/app/medusa/cassandra_utils.py", line 440, in create_snapshot
    subprocess.check_call(cmd, stdout=subprocess.DEVNULL, universal_newlines=True)
  File "/usr/lib/python3.6/subprocess.py", line 311, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['nodetool', 'snapshot', '-t', 'medusa-test-dc1-2021-01-05 12:56:55.847591']' returned non-zero exit status 1.
```
Don't pay much attention to the storage_provider, it doesn't seem relevant in this case, I've opened a PR #246 to include minio support.
I'm trying to make this work on k8ssandra but I don't know why I'm getting this error. I'm binding nodetool to the image because it couldn't find it.