I had an issue running Teraslice in minikube, possibly related to the version of k8s I'm using. Here are the teraslice-master logs; I've removed or abbreviated what didn't look relevant to me.
kubectl -n ts-dev1 logs teraslice-master-6f65f6bcc4-5mt99 | bunyan
[2023-10-04T22:26:45.037Z] INFO: teraslice/7 on teraslice-master-6f65f6bcc4-5mt99: Service starting (assignment=node_master)
...
(skipping setup, ES, asset deployment, etc)
...
[2023-10-04T22:34:14.911Z] DEBUG: example-data-generator-job/14 on teraslice-master-6f65f6bcc4-5mt99: enqueueing execution to be processed (queue size 0) (assignment=cluster_master, module=execution_service, worker_id=WvOOtRs_, active=true, analytics=true, performance_metrics=false, autorecover=false, lifecycle=once, max_retries=3, probation_window=300000, slicers=1, workers=2, stateful=false, labels=null, env_vars={}, targets=[], ephemeral_storage=true, pod_spec_override={}, volumes=[], job_id=5aee9a70-65ea-47ed-be53-ead76c096b4d, _context=ex, _created=2023-10-04T22:34:14.895Z, _updated=2023-10-04T22:34:14.895Z, ex_id=d5222f35-ad6a-4269-8b4e-e846636e44d4, metadata={}, _status=pending, _has_errors=false, _slicer_stats={}, _failureReason="")
assets: [
"65ee07b97850ce15e78068224febff5c5deb9ae7",
"2b4f08ae993293c44af418d2f9ec3746d98039ae",
"00183d8c533503f4acba7aa001931563a001791f"
]
--
operations: [
{
"_op": "data_generator",
"_encoding": "json",
"_dead_letter_action": "throw",
"json_schema": null,
"size": 5000000,
"start": null,
"end": null,
"format": null,
"stress_test": false,
"date_key": "created",
"set_id": null,
"id_start_key": null
},
{
"_op": "example",
"_encoding": "json",
"_dead_letter_action": "none",
"type": "string"
},
{
"_op": "delay",
"_encoding": "json",
"_dead_letter_action": "throw",
"ms": 30000
},
{
"_op": "elasticsearch_bulk",
"_encoding": "json",
"_dead_letter_action": "throw",
"size": 5000,
"connection": "default",
"index": "terak8s-example-data",
"type": "events",
"delete": false,
"update": false,
"update_retry_on_conflict": 0,
"update_fields": [],
"upsert": false,
"create": false,
"script_file": "",
"script": "",
"script_params": {},
"api_name": "elasticsearch_sender_api"
}
]
--
apis: [
{
"_name": "elasticsearch_sender_api",
"_encoding": "json",
"_dead_letter_action": "throw",
"size": 5000,
"connection": "default",
"index": "terak8s-example-data",
"type": "events",
"delete": false,
"update": false,
"update_retry_on_conflict": 0,
"update_fields": [],
"upsert": false,
"create": false,
"script_file": "",
"script": "",
"script_params": {},
"_op": "elasticsearch_bulk"
}
]
[2023-10-04T22:34:15.230Z] INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: Scheduling execution: d5222f35-ad6a-4269-8b4e-e846636e44d4 (assignment=cluster_master, module=execution_service, worker_id=WvOOtRs_)
[2023-10-04T22:34:15.273Z] DEBUG: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: execution allocating slicer (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_, apiVersion=batch/v1, kind=Job)
metadata: {
...
}
--
spec: {
...
}
[2023-10-04T22:34:15.284Z] DEBUG: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: k8s slicer job submitted (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_, kind=Job, apiVersion=batch/v1, status={})
metadata: {
...
}
--
spec: {
...
}
[2023-10-04T22:34:15.289Z] DEBUG: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: waiting for pod matching: controller-uid=undefined (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
(repeats 10 times)
...
[2023-10-04T22:34:25.673Z] INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: execution d5222f35-ad6a-4269-8b4e-e846636e44d4 is connected (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
[2023-10-04T22:34:26.375Z] DEBUG: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: waiting for pod matching: controller-uid=undefined (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
(repeats 48 more times)
...
[2023-10-04T22:35:16.021Z] WARN: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: Failed to provision execution d5222f35-ad6a-4269-8b4e-e846636e44d4 (assignment=cluster_master, module=execution_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.059Z] WARN: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: Calling stopExecution on execution: d5222f35-ad6a-4269-8b4e-e846636e44d4 to clean up k8s resources. (assignment=cluster_master, module=execution_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.063Z] INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: k8s._deleteObjByExId: d5222f35-ad6a-4269-8b4e-e846636e44d4 execution_controller jobs deleting: ts-exc-example-data-generator-job-5aee9a70-65ea (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.072Z] INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: k8s._deleteObjByExId: d5222f35-ad6a-4269-8b4e-e846636e44d4 worker deployments has already been deleted (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.205Z] DEBUG: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: execution d5222f35-ad6a-4269-8b4e-e846636e44d4 finished, shutting down execution (assignment=cluster_master, module=execution_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.209Z] INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: k8s._deleteObjByExId: d5222f35-ad6a-4269-8b4e-e846636e44d4 execution_controller jobs has already been deleted (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.212Z] INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: k8s._deleteObjByExId: d5222f35-ad6a-4269-8b4e-e846636e44d4 worker deployments has already been deleted (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.297Z] INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: client d5222f35-ad6a-4269-8b4e-e846636e44d4 disconnected { reason: 'client namespace disconnect' } (assignment=cluster_master, module=messaging:server, worker_id=WvOOtRs_)
Starting my minikube cluster with Kubernetes version 1.23.17 resolves the issue:
minikube start --memory 4096 --cpus 4 --kubernetes-version=v1.23.17
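For reference, the repeated "waiting for pod matching: controller-uid=undefined" lines appear consistent with the master reading the controller-uid key from the execution controller Job's selector labels. Newer Kubernetes releases (1.27+) changed the Job selector to use the prefixed batch.kubernetes.io/controller-uid label, which would leave the legacy key undefined and would explain why pinning minikube to 1.23 works. A defensive lookup, shown here only as a hedged sketch and not Teraslice's actual code (the helper name is made up; the V1Job type comes from the official @kubernetes/client-node package, which may not be what Teraslice uses), could read either key:

```typescript
import type { V1Job } from '@kubernetes/client-node';

// Hypothetical helper: resolve the pod selector for an execution controller Job
// across Kubernetes versions. On k8s >= 1.27 the Job selector uses the prefixed
// label "batch.kubernetes.io/controller-uid"; older clusters use "controller-uid".
function getControllerUidSelector(job: V1Job): string {
    const labels = job.spec?.selector?.matchLabels ?? {};
    const legacyUid = labels['controller-uid'];
    const prefixedUid = labels['batch.kubernetes.io/controller-uid'];

    if (legacyUid != null) return `controller-uid=${legacyUid}`;
    if (prefixedUid != null) return `batch.kubernetes.io/controller-uid=${prefixedUid}`;

    throw new Error('Job selector has no controller-uid label to match pods on');
}
```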
This is the log I was hoping for, thanks Peter:
[2023-10-04T22:34:15.289Z] DEBUG: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: waiting for pod matching: controller-uid=undefined (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
(repeats 10 times)
Maybe we should replicate what we do for e2e tests here: make a new type k8se2e, implement a runk8sE2eTest function similar to the runE2ETest function, then go from there.
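A rough sketch of what a runk8sE2eTest could do, purely hypothetical apart from the names already mentioned above (the kind CLI commands are real, but the manifest path and option shape are placeholders):

```typescript
import { execFileSync } from 'node:child_process';

// Hypothetical options; the real ts-scripts options will differ.
interface K8sE2eTestOptions {
    kindCluster: string;      // name of the throwaway kind cluster
    terasliceImage: string;   // locally built image to load into the cluster
    jestArgs: string[];       // extra args forwarded to jest
}

export async function runk8sE2eTest(opts: K8sE2eTestOptions): Promise<void> {
    const run = (cmd: string, args: string[]) =>
        execFileSync(cmd, args, { stdio: 'inherit' });

    // Bring up a kind cluster and load the locally built image into it.
    run('kind', ['create', 'cluster', '--name', opts.kindCluster]);
    run('kind', ['load', 'docker-image', opts.terasliceImage, '--name', opts.kindCluster]);

    try {
        // Deploy teraslice plus backing services, then run the existing jest suites.
        run('kubectl', ['apply', '-f', 'e2e/k8s/']); // placeholder manifest path
        run('yarn', ['jest', '--runInBand', ...opts.jestArgs]);
    } finally {
        // Always tear the cluster down so CI runners stay clean.
        run('kind', ['delete', 'cluster', '--name', opts.kindCluster]);
    }
}
```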
@jsnoble suggested parameterizing the existing e2e tests so that the existing jest tests can be reused. This is a great suggestion since there's no real need to implement separate Teraslice tests. There may be subsets of tests that are "platform" specific (native clustering vs k8s clustering), so we'd need a way to omit the inapplicable ones in each case.
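One way to do that omission without forking the suites would be a small guard in the shared tests, sketched here with a hypothetical TEST_PLATFORM value supplied by the harness:

```typescript
// Hypothetical pattern; the actual mechanism in e2e/ may differ.
const platform = process.env.TEST_PLATFORM ?? 'native'; // 'native' | 'kubernetes'

// Register a describe block only when it applies to the platform under test.
const describeOn = (p: string) => (platform === p ? describe : describe.skip);

describeOn('kubernetes')('pod spec overrides', () => {
    it('applies pod_spec_override to worker pods', async () => {
        // k8s-only assertions
    });
});

describeOn('native')('process clustering', () => {
    it('spawns workers as child processes', async () => {
        // native-clustering-only assertions
    });
});
```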
I suggested adding something like a platform option to the test options type. That could be used to start the services in the right place (k8s vs Docker), launch Teraslice the right way, and omit platform-specific tests.
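A hedged sketch of how that platform option could be threaded through the harness (the type and function names here are placeholders, not the actual ts-scripts code):

```typescript
type Platform = 'native' | 'kubernetes';

// Hypothetical shape; in practice this would extend the existing test options type.
interface TestOptions {
    platform: Platform;            // which clustering backend the run targets
    elasticsearchVersion: string;  // existing service-version style options
}

// Placeholders standing in for the existing service/teraslice launch code.
async function deployToKind(options: TestOptions): Promise<void> { /* ... */ }
async function startDockerServices(options: TestOptions): Promise<void> { /* ... */ }

// The harness branches once on platform; everything downstream stays shared.
async function launchServices(options: TestOptions): Promise<void> {
    if (options.platform === 'kubernetes') {
        await deployToKind(options);
    } else {
        await startDockerServices(options);
    }
}
```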
Initial pass/fail of all e2e tests using k8s:
Test Suites: 4 failed, 1 skipped, 6 passed, 10 of 11 total
PASS e2e test/cases/data/elasticsearch-bulk-spec.js (10.747 s)
PASS e2e test/cases/validation/job-spec.js (8.55 s)
PASS e2e test/cases/cluster/job-state-spec.js (25.801 s)
PASS e2e test/cases/cluster/api-spec.js (47.587 s)
PASS e2e test/cases/data/reindex-spec.js (67.138 s)
PASS e2e test/cases/assets/simple-spec.js (99.074 s)
FAIL e2e test/cases/cluster/worker-allocation-spec.js (17.359 s)
FAIL e2e test/cases/data/recovery-spec.js
FAIL e2e test/cases/cluster/state-spec.js (12.324 s)
FAIL e2e test/cases/kafka/kafka-spec.js (122.693 s)
We'll want to get the k8s e2e tests running in GitHub Actions for Node 14 and Node 16, as well as against the supported versions of Elasticsearch/OpenSearch.
ref: #3449
ref: #3454 Second try
This is done.
I have been performing acceptance testing manually for all of my changes to Teraslice in Kubernetes. We should implement at least one simple e2e-like test in kind (Kubernetes in Docker) as part of CI.
The tests will have to:
https://kind.sigs.k8s.io/
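As a starting point, one such check could hit the Teraslice master exposed from the kind cluster and submit a trivial job. This is only a hedged sketch: the URL, API paths, and job body below are assumptions (it also assumes Node 18+ for global fetch, or a substitute like node-fetch) and would need to match the actual kind deployment:

```typescript
// Hypothetical smoke test; host/port, endpoints, and job spec are assumptions.
const TERASLICE_URL = process.env.TERASLICE_URL ?? 'http://localhost:5678';

describe('teraslice on kind', () => {
    it('answers cluster state and accepts a job', async () => {
        // The master should report cluster state once it is up.
        const state = await fetch(`${TERASLICE_URL}/v1/cluster/state`);
        expect(state.status).toBe(200);

        // Submit a minimal "once" job and expect it to be accepted.
        const res = await fetch(`${TERASLICE_URL}/v1/jobs`, {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({
                name: 'kind-smoke-test',
                lifecycle: 'once',
                workers: 1,
                assets: ['standard'],
                operations: [
                    { _op: 'data_generator', size: 10 },
                    { _op: 'noop' }
                ]
            })
        });
        expect(res.ok).toBe(true);
    });
});
```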