apache / dolphinscheduler

Apache DolphinScheduler is the modern data orchestration platform. Agile to create high performance workflow with low-code
https://dolphinscheduler.apache.org/
Apache License 2.0
12.66k stars 4.57k forks source link

[DSIP-52][Task] Add single task integration test #16231

Closed pegasas closed 2 months ago

pegasas commented 2 months ago

Search before asking

Motivation

Now api-test has directly run api-test towards spring multiple embedded service, include but not limit to api/master/worker. To improve our task robustness and make it easy to integration test, I suggest we can split dependencies from WorkerTaskExecutor, which has use strategy pattern to manage task life cycle, only keep slf4j logger & taskexecutioncontext for running.

then we can easily directly run WorkerTaskExecutor as a runnable.

Design Detail

Keep WorkerTaskExecutor dependencies as small as possible. Stage #1: use TenantConfig instead of WorkerConfig for better ingestion. Stage #2: use Observer/Listener pattern to move messagesender(for retry), storageoperate(for task logging & remote logging) and workerRegistryClient(for getting master/alert host) into outside to decouple.

Compatibility, Deprecation, and Migration Plan

refactor & add CI

Test Plan

add CI towards WorkerTaskExecutor, which is a simple runnable.

Code of Conduct

SbloodyS commented 2 months ago

The api-test and e2e conducts end-to-end testing from the back end and the front end respectively. We shouldn't invade the original code too much just because of the tests. Code optimization and testing are two separate issues.

So I'm -1 on this. cc @ruanwenjun

pegasas commented 2 months ago

The api-test and e2e conducts end-to-end testing from the back end and the front end respectively. We shouldn't invade the original code too much just because of the tests. Code optimization and testing are two separate issues.

So I'm -1 on this. cc @ruanwenjun

just like https://github.com/apache/dolphinscheduler/pull/15861 In fact we do not invade the original code too much and just keep WorkerTaskExecutor as a runnable, and move dependencies using observer/lister pattern like quartz.

image

you will realize that remote-logging/alert code can be move to listener part, which will not be executed when we directly runningWorkerTaskExecutor directly and easy to extend if we wants to add more listener towards task life-cycle.

I think the cost is very less and the return is large.

SbloodyS commented 2 months ago

This PR https://github.com/apache/dolphinscheduler/pull/15861 is mostly bugfix, while adding tests for cluster deployment. Even without this test, the code remains the same

pegasas commented 2 months ago

This PR ##15861 is mostly bugfix, while adding tests for cluster deployment. Even without this test, the code remains the same

sorry to confuse, the key point of the PR is to add registry related integration to improve robustness.

https://github.com/apache/dolphinscheduler/pull/15861/files#diff-4221e7060cb5692b5e8656f8f092468895f4bc8804397e8de7cb4cfdf8d95024

SbloodyS commented 2 months ago

sorry to confuse, the key point of the PR is to add registry related integration to improve robustness.

https://github.com/apache/dolphinscheduler/pull/15861/files#diff-4221e7060cb5692b5e8656f8f092468895f4bc8804397e8de7cb4cfdf8d95024

Based on the modification of #16233, I don't think this makes any sense in terms of code optimization or unit testing.

pegasas commented 2 months ago

sorry to confuse, the key point of the PR is to add registry related integration to improve robustness. https://github.com/apache/dolphinscheduler/pull/15861/files#diff-4221e7060cb5692b5e8656f8f092468895f4bc8804397e8de7cb4cfdf8d95024

Based on the modification of #16233, I don't think this makes any sense in terms of code optimization or unit testing.

This is only first stage.

first stage we only kept tenantconfig instead of workerconfig, which will make it easyier to build only use enable/disable properties in it. second stage I will move out remote-logging/alert part using observer/listener pattern is another PR.

if you not bother, we can sync offline if you are not mind.

SbloodyS commented 2 months ago

Now api-test has directly run api-test towards spring multiple embedded service, include but not limit to api/master/worker. To improve our task robustness and make it easy to integration test

According to your description, you want to optimize api-test to make it more robust and convenient. But I didn't see anything relevant in either of the two stages you described, or in PR.

pegasas commented 2 months ago

Now api-test has directly run api-test towards spring multiple embedded service, include but not limit to api/master/worker. To improve our task robustness and make it easy to integration test

According to your description, you want to optimize api-test to make it more robust and convenient. But I didn't see anything relevant in either of the two stages you described, or in PR.

Thanks @SbloodyS and @ruanwenjun for guiding offline. It seems we should keep api-test standard to make each flow aligned with user's experience.