terascope / teraslice

Scalable data processing pipelines in JavaScript
https://terascope.github.io/teraslice/
Apache License 2.0

Implement KIND based e2e test for Teraslice in Kubernetes #3427

Closed: godber closed this issue 1 year ago

godber commented 1 year ago

I have been performing acceptance testing manually for all of my changes to Teraslice in Kubernetes. We should implement at least one simple e2e-like test in KIND (Docker) in CI.

The tests will have to:

https://kind.sigs.k8s.io/
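
For reference, here is a rough sketch of the CI glue a KIND-based run would need, assuming execa for shelling out. The kind subcommands are the real CLI, but the cluster name and helper functions are illustrative, not anything that exists in the repo yet:

```typescript
// Rough sketch only: the `kind` subcommands are the real CLI, but the
// cluster name and these helper functions are hypothetical.
import { execa } from 'execa';

const CLUSTER_NAME = 'teraslice-e2e';

export async function setupKindCluster(terasliceImage: string): Promise<void> {
    // Create a single-node KIND cluster for the test run.
    await execa('kind', ['create', 'cluster', '--name', CLUSTER_NAME]);
    // Make the locally built Teraslice image visible inside the cluster,
    // so pods don't try to pull it from a registry.
    await execa('kind', ['load', 'docker-image', terasliceImage, '--name', CLUSTER_NAME]);
}

export async function teardownKindCluster(): Promise<void> {
    await execa('kind', ['delete', 'cluster', '--name', CLUSTER_NAME]);
}
```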

busma13 commented 1 year ago

I had an issue running Teraslice in minikube, possibly related to the version of k8s I'm using. Here are the teraslice-master logs; I've removed or abbreviated what didn't look relevant to me.


kubectl -n ts-dev1 logs teraslice-master-6f65f6bcc4-5mt99 | bunyan
[2023-10-04T22:26:45.037Z]  INFO: teraslice/7 on teraslice-master-6f65f6bcc4-5mt99: Service starting (assignment=node_master)
...
(skipping setup, ES, asset deployment, etc)
...
[2023-10-04T22:34:14.911Z] DEBUG: example-data-generator-job/14 on teraslice-master-6f65f6bcc4-5mt99: enqueueing execution to be processed (queue size 0) (assignment=cluster_master, module=execution_service, worker_id=WvOOtRs_, active=true, analytics=true, performance_metrics=false, autorecover=false, lifecycle=once, max_retries=3, probation_window=300000, slicers=1, workers=2, stateful=false, labels=null, env_vars={}, targets=[], ephemeral_storage=true, pod_spec_override={}, volumes=[], job_id=5aee9a70-65ea-47ed-be53-ead76c096b4d, _context=ex, _created=2023-10-04T22:34:14.895Z, _updated=2023-10-04T22:34:14.895Z, ex_id=d5222f35-ad6a-4269-8b4e-e846636e44d4, metadata={}, _status=pending, _has_errors=false, _slicer_stats={}, _failureReason="")
    assets: [
      "65ee07b97850ce15e78068224febff5c5deb9ae7",
      "2b4f08ae993293c44af418d2f9ec3746d98039ae",
      "00183d8c533503f4acba7aa001931563a001791f"
    ]
    --
    operations: [
      {
        "_op": "data_generator",
        "_encoding": "json",
        "_dead_letter_action": "throw",
        "json_schema": null,
        "size": 5000000,
        "start": null,
        "end": null,
        "format": null,
        "stress_test": false,
        "date_key": "created",
        "set_id": null,
        "id_start_key": null
      },
      {
        "_op": "example",
        "_encoding": "json",
        "_dead_letter_action": "none",
        "type": "string"
      },
      {
        "_op": "delay",
        "_encoding": "json",
        "_dead_letter_action": "throw",
        "ms": 30000
      },
      {
        "_op": "elasticsearch_bulk",
        "_encoding": "json",
        "_dead_letter_action": "throw",
        "size": 5000,
        "connection": "default",
        "index": "terak8s-example-data",
        "type": "events",
        "delete": false,
        "update": false,
        "update_retry_on_conflict": 0,
        "update_fields": [],
        "upsert": false,
        "create": false,
        "script_file": "",
        "script": "",
        "script_params": {},
        "api_name": "elasticsearch_sender_api"
      }
    ]
    --
    apis: [
      {
        "_name": "elasticsearch_sender_api",
        "_encoding": "json",
        "_dead_letter_action": "throw",
        "size": 5000,
        "connection": "default",
        "index": "terak8s-example-data",
        "type": "events",
        "delete": false,
        "update": false,
        "update_retry_on_conflict": 0,
        "update_fields": [],
        "upsert": false,
        "create": false,
        "script_file": "",
        "script": "",
        "script_params": {},
        "_op": "elasticsearch_bulk"
      }
    ]
[2023-10-04T22:34:15.230Z]  INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: Scheduling execution: d5222f35-ad6a-4269-8b4e-e846636e44d4 (assignment=cluster_master, module=execution_service, worker_id=WvOOtRs_)
[2023-10-04T22:34:15.273Z] DEBUG: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: execution allocating slicer (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_, apiVersion=batch/v1, kind=Job)
    metadata: {
        ...
    }
     --
    spec: {
        ...
    }
[2023-10-04T22:34:15.284Z] DEBUG: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: k8s slicer job submitted (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_, kind=Job, apiVersion=batch/v1, status={})
    metadata: {
        ...
    }
     --
    spec: {
        ...
    }
[2023-10-04T22:34:15.289Z] DEBUG: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: waiting for pod matching: controller-uid=undefined (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
(repeats 10 times)
...
[2023-10-04T22:34:25.673Z]  INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: execution d5222f35-ad6a-4269-8b4e-e846636e44d4 is connected (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
[2023-10-04T22:34:26.375Z] DEBUG: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: waiting for pod matching: controller-uid=undefined (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
(repeats 48 more times)
...
[2023-10-04T22:35:16.021Z]  WARN: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: Failed to provision execution d5222f35-ad6a-4269-8b4e-e846636e44d4 (assignment=cluster_master, module=execution_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.059Z]  WARN: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: Calling stopExecution on execution: d5222f35-ad6a-4269-8b4e-e846636e44d4 to clean up k8s resources. (assignment=cluster_master, module=execution_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.063Z]  INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: k8s._deleteObjByExId: d5222f35-ad6a-4269-8b4e-e846636e44d4 execution_controller jobs deleting: ts-exc-example-data-generator-job-5aee9a70-65ea (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.072Z]  INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: k8s._deleteObjByExId: d5222f35-ad6a-4269-8b4e-e846636e44d4 worker deployments has already been deleted (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.205Z] DEBUG: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: execution d5222f35-ad6a-4269-8b4e-e846636e44d4 finished, shutting down execution (assignment=cluster_master, module=execution_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.209Z]  INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: k8s._deleteObjByExId: d5222f35-ad6a-4269-8b4e-e846636e44d4 execution_controller jobs has already been deleted (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.212Z]  INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: k8s._deleteObjByExId: d5222f35-ad6a-4269-8b4e-e846636e44d4 worker deployments has already been deleted (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.297Z]  INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: client d5222f35-ad6a-4269-8b4e-e846636e44d4 disconnected { reason: 'client namespace disconnect' } (assignment=cluster_master, module=messaging:server, worker_id=WvOOtRs_)
busma13 commented 1 year ago

Starting my Minikube cluster with Kubernetes version 1.23.17 resolves the issue:

minikube start --memory 4096 --cpus 4 --kubernetes-version=v1.23.17

godber commented 1 year ago

This is the log I was hoping for; thanks, Peter:

[2023-10-04T22:34:15.289Z] DEBUG: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: waiting for pod matching: controller-uid=undefined (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
(repeats 10 times)
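
The `undefined` is the telling part. One plausible reading (an inference, not confirmed against the Teraslice code): Kubernetes 1.27 moved the Job selector label from `controller-uid` to `batch.kubernetes.io/controller-uid`, so code that reads the legacy key off a Job created on a newer cluster gets `undefined`, and the pod watch never matches anything. That would also explain why pinning Minikube to v1.23.17 works. A version-tolerant lookup might look like this sketch (using `@kubernetes/client-node` types for illustration; Teraslice's actual k8s client code may differ):

```typescript
// Sketch, not Teraslice's actual implementation. Uses
// @kubernetes/client-node types purely for illustration.
import type { V1Job } from '@kubernetes/client-node';

// Kubernetes >= 1.27 uses the prefixed label in job.spec.selector;
// older clusters (e.g. v1.23.17) use the bare legacy key.
const CONTROLLER_UID_KEYS = [
    'batch.kubernetes.io/controller-uid',
    'controller-uid',
] as const;

function getControllerUidSelector(job: V1Job): string {
    const labels = job.spec?.selector?.matchLabels ?? {};
    for (const key of CONTROLLER_UID_KEYS) {
        const uid = labels[key];
        // Return a selector built from whichever label key is present,
        // instead of silently producing "controller-uid=undefined".
        if (uid !== undefined) return `${key}=${uid}`;
    }
    throw new Error('Job is missing a controller-uid selector label');
}
```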
godber commented 1 year ago

Maybe we should replicate what we do for e2e tests here:

https://github.com/terascope/teraslice/blob/master/packages/scripts/src/helpers/test-runner/index.ts#L62-L65

Make a new type k8se2e, implement a runk8sE2eTest function similar to the runE2ETest function, then go from there.
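
Schematically, something like this sketch (illustrative only; the real dispatch in packages/scripts takes more options, runE2ETest exists today, and runk8sE2eTest is the proposed addition):

```typescript
// Illustrative only; the real runTests in packages/scripts is shaped
// differently. runE2ETest exists today; runk8sE2eTest is proposed.
type Suite = 'unit' | 'e2e' | 'k8se2e';
interface TestOptions { suite: Suite; /* other flags elided */ }

declare function runE2ETest(options: TestOptions): Promise<void>;
declare function runk8sE2eTest(options: TestOptions): Promise<void>;
declare function runUnitTests(options: TestOptions): Promise<void>;

async function runTests(options: TestOptions): Promise<void> {
    switch (options.suite) {
        case 'e2e':
            return runE2ETest(options);
        case 'k8se2e':
            // New branch: stand up KIND, deploy teraslice, then run
            // the same style of end-to-end assertions against it.
            return runk8sE2eTest(options);
        default:
            return runUnitTests(options);
    }
}
```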

godber commented 1 year ago

@jsnoble suggested parameterizing the existing e2e tests so that the existing Jest tests can be reused. This is a great suggestion, since there's no real need to implement a separate set of Teraslice tests. There may be subsets of tests that are "platform" specific (native clustering vs k8s clustering), so we'd need a way to omit the inapplicable ones in each case.

I suggested adding something like a platform option to the test options type. That could be used to start the services in the right environment (k8s vs Docker), launch Teraslice the right way, and omit platform-specific tests; see the sketch below.
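
One possible shape for that option (all names here are illustrative, not a final API):

```typescript
// Hypothetical option shape; not the actual TestOptions in the repo.
type Platform = 'native' | 'kubernetes';

interface TestOptions {
    suite: string;
    platform: Platform; // drives how services and teraslice are started
}

// A small helper lets Jest suites gate platform-specific specs:
const describeIf = (condition: boolean): typeof describe =>
    condition ? describe : describe.skip;

// Usage in a k8s-only spec file, assuming the runner exports the
// platform via an environment variable such as TEST_PLATFORM:
describeIf(process.env.TEST_PLATFORM === 'kubernetes')('k8s cluster state', () => {
    it('tracks pod-backed workers', () => {
        // platform-specific assertions go here
    });
});
```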

busma13 commented 1 year ago

Initial pass/fail of all e2e tests using k8s:

Test Suites: 4 failed, 1 skipped, 6 passed, 10 of 11 total

 PASS   e2e  test/cases/data/elasticsearch-bulk-spec.js (10.747 s)   
 PASS   e2e  test/cases/validation/job-spec.js (8.55 s)
 PASS   e2e  test/cases/cluster/job-state-spec.js (25.801 s)
 PASS   e2e  test/cases/cluster/api-spec.js (47.587 s)
 PASS   e2e  test/cases/data/reindex-spec.js (67.138 s)
 PASS   e2e  test/cases/assets/simple-spec.js (99.074 s)

 FAIL   e2e  test/cases/cluster/worker-allocation-spec.js (17.359 s)
 FAIL   e2e  test/cases/data/recovery-spec.js
 FAIL   e2e  test/cases/cluster/state-spec.js (12.324 s)
 FAIL   e2e  test/cases/kafka/kafka-spec.js (122.693 s)
godber commented 1 year ago

We'll want to get the k8s e2e tests running in GitHub Actions for Node 14 and Node 16, as well as against the supported versions of Elasticsearch/OpenSearch.

busma13 commented 1 year ago

ref: #3449

busma13 commented 1 year ago

ref: #3454 (second try)

godber commented 1 year ago

This is done.