datahub-project / datahub

The Metadata Platform for your Data Stack
https://datahubproject.io
Apache License 2.0

Cancelling a UI Ingestion does not really stop the running of ingestion queries #4765

Closed esselius closed 2 years ago

esselius commented 2 years ago

Describe the bug

When using the UI to perform Postgres metadata ingestion with SQL profiling, cancelling through the UI does not cancel running DB queries or stop new queries from being sent to the database.

Cancelling in the UI, killing the datahub-actions k8s pod, and manually cancelling the queries works, but shouldn't be necessary.

To Reproduce

Steps to reproduce the behavior:

  1. Set up postgres + sql profiling ingestion through UI with a slow DB
  2. Execute ingestion source
  3. Cancel ingestion
  4. See new queries still being run with:
SELECT pid, age(clock_timestamp(), query_start), usename, query
FROM pg_stat_activity
WHERE query != '<IDLE>'
  AND query NOT ILIKE '%pg_stat_activity%'
ORDER BY query_start desc;

Expected behavior

Cancelling ingestion aborts all ingestion queries, or at least stops new queries from being sent.

Datahub v0.8.33

jjoyce0510 commented 2 years ago

It should issue a kill to the ingestion process, which should listen for it. If it does not, the most likely cause is that the ingestion framework is doing some parallel work internally to issue queries. Does it complete the entire ingestion? Or does it just run a few queries and then die?
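To illustrate the "parallel stuff" hypothesis, here is a minimal Python sketch of how a worker pool can keep issuing queries after a cancel unless each task checks a shared cancellation flag. This is not DataHub's actual code: `run_profiling`, `honor_cancel`, and `cancel_after` are hypothetical names, the database call is simulated by appending to a list, and the cancel is simulated as arriving after a fixed number of queries.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_profiling(tables, honor_cancel, cancel_after):
    """Simulate a parallel profiler; return the tables that got a query.

    Hypothetical sketch: `cancel_after` stands in for a UI cancel (or kill
    signal) that arrives once that many queries have been issued.
    """
    cancel_event = threading.Event()
    issued = []
    lock = threading.Lock()

    # In a real process the event could be set from a signal handler, e.g.
    # signal.signal(signal.SIGTERM, lambda *_: cancel_event.set()).

    def profile(table):
        with lock:
            # Cooperative cancellation: skip the query once cancel is set.
            if honor_cancel and cancel_event.is_set():
                return False
            issued.append(table)  # stands in for sending a profiling query
            if len(issued) == cancel_after:
                cancel_event.set()  # simulated cancel request
        return True

    # Tasks already queued in the pool still run after the cancel; only the
    # check inside `profile` keeps them from issuing new queries.
    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(profile, tables))
    return issued


tables = [f"table_{i}" for i in range(20)]
ignored = run_profiling(tables, honor_cancel=False, cancel_after=3)
honored = run_profiling(tables, honor_cancel=True, cancel_after=3)
print(len(ignored), len(honored))
```

If the workers never look at the flag, every queued table still gets a query, which would match the symptom of new queries appearing after cancellation. Note that this only stops new queries; statements already running on the database would still need to be cancelled server-side.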

esselius commented 2 years ago

I let it keep running for 15 minutes before killing it. It does keep firing new queries rather than just waiting for the already started ones to finish.

I want to check whether it keeps submitting metadata after cancellation, but I'm not sure how.

jjoyce0510 commented 2 years ago

Hi @esselius - Are you still having the issue? We've not heard other such reports.

Given the time frame, I'm going to close the ticket due to inactivity.

Please feel free to reopen if this is still an issue for you.

Thanks, John