spring-cloud / spring-cloud-dataflow

A microservices-based Streaming and Batch data processing in Cloud Foundry and Kubernetes
https://dataflow.spring.io
Apache License 2.0
1.11k stars 580 forks

Destroying a task definition won't delete related execution records #4091

Closed. johnlinp closed 3 years ago

johnlinp commented 4 years ago

Description: When destroying a task definition, I expect the related execution records to be deleted too. However, it seems that's not the case.

Release versions: 2.6.0

Steps to reproduce:

  1. Create a task definition
    dataflow:>task create --name some-task --definition timestamp
    Created new task 'some-task'
  2. Launch the task
    dataflow:>task launch --name some-task
    Launched task 'some-task' with execution id 4
  3. See the task completed
    dataflow:>task list
    ╔═════════╤═══════════════╤═══════════╤═══════════╗
    ║Task Name│Task Definition│description│Task Status║
    ╠═════════╪═══════════════╪═══════════╪═══════════╣
    ║some-task│timestamp      │           │COMPLETE   ║
    ╚═════════╧═══════════════╧═══════════╧═══════════╝
  4. Destroy the task
    dataflow:>task destroy some-task
    Destroyed task 'some-task'
  5. Re-create the task with the same name
    dataflow:>task create --name some-task --definition timestamp
    Created new task 'some-task'
  6. See the task is already completed
    dataflow:>task list
    ╔═════════╤═══════════════╤═══════════╤═══════════╗
    ║Task Name│Task Definition│description│Task Status║
    ╠═════════╪═══════════════╪═══════════╪═══════════╣
    ║some-task│timestamp      │           │COMPLETE   ║
    ╚═════════╧═══════════════╧═══════════╧═══════════╝

sabbyanandan commented 4 years ago

Hi, @johnlinp. The behavior you're seeing is a deliberate design decision. The primary reason is to be able to traverse and explore the historical executions for auditing purposes. This is even more relevant for batch-job executions.

We have an open issue (contributions welcome!) to add an on-demand option to clean up the executions, either individually or as a whole. There are, of course, repercussions if someone cleans them up accidentally.

All that said, the fact that you're seeing a task status for a re-created task definition seems odd. @mminella: Perhaps the CD flow is wrongly connecting to the previously executed task's status?

mminella commented 4 years ago

All of the statuses are based on name, so if the name is recycled, then that is a risk. We could explore making names globally unique, but that would have wide-ranging impacts (CTR, properties, etc.).
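
To make the name-based lookup concrete, here is a minimal sketch in Python. It is a toy in-memory model, not Data Flow's actual code: the `ToyTaskStore` class, its methods, and the `"UNKNOWN"` placeholder status are all hypothetical names chosen for illustration. It only assumes what is described above, namely that execution records survive `task destroy` and that status is resolved by task name.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTaskStore:
    """Hypothetical in-memory stand-in for definition and execution storage."""

    definitions: dict = field(default_factory=dict)  # task name -> definition text
    executions: list = field(default_factory=list)   # (execution id, task name, status)

    def create(self, name, definition):
        self.definitions[name] = definition

    def launch(self, name):
        execution_id = len(self.executions) + 1
        # The execution record stores only the task name, not a definition identity.
        self.executions.append((execution_id, name, "COMPLETE"))
        return execution_id

    def destroy(self, name):
        # Only the definition is removed; execution history is kept for auditing.
        del self.definitions[name]

    def status(self, name):
        # Status is resolved by name: the latest execution with a matching name wins.
        matching = [s for (_, n, s) in self.executions if n == name]
        return matching[-1] if matching else "UNKNOWN"


store = ToyTaskStore()
store.create("some-task", "timestamp")
store.launch("some-task")
store.destroy("some-task")
store.create("some-task", "timestamp")  # same name, brand-new definition
recreated_status = store.status("some-task")
print(recreated_status)  # prints "COMPLETE", inherited from the old execution
```

Because the lookup key is the name alone, the re-created definition inherits the old execution's status. Keying on something unique per definition would avoid that, at the cost of the wider impacts mentioned above.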

sabbyanandan commented 3 years ago

Addressed via: https://github.com/spring-cloud/spring-cloud-dataflow/issues/3902.