open-metadata / OpenMetadata

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
https://open-metadata.org
Apache License 2.0

Metadata Backup Memory Exhaustion Issue #13660

Closed Jason-Clark-FG closed 1 year ago

Jason-Clark-FG commented 1 year ago

Affected module

The error is in the metadata backup process, but it affects everything: the process consumes all available memory until the kernel's OOM killer (oom-kill) terminates it. In the meantime, containers become unhealthy, and some do not recover cleanly, especially the ingestion container.

Describe the bug

The metadata backup process consumes excessive memory, is eventually killed, and never completes. This happens even after truncating the client_event table.

To Reproduce

Expected behavior

The backup process runs to completion without consuming excessive memory, and the running containers remain healthy.

Version:

Additional context

There's been some discussion about it here: https://openmetadata.slack.com/archives/C02B6955S4S/p1696614254906349

Jason-Clark-FG commented 1 year ago

@pmbrull Thanks for the fix. I just cleared out and installed the latest Python modules under the user and ran the backup. It still runs out of memory and gets killed by the oom-kill process.

pip freeze --user > uninstall_list.txt;pip uninstall -y -r uninstall_list.txt
pip --disable-pip-version-check install --user --no-warn-script-location --upgrade openmetadata-ingestion[backup,mysql]~=1.2.0

Python Modules:

$ pip freeze --user
antlr4-python3-runtime==4.9.2
appdirs==1.4.4
avro==1.11.3
azure-core==1.29.1
azure-identity==1.15.0
azure-storage-blob==12.19.0
beautifulsoup4==4.12.2
boto3==1.29.1
botocore==1.32.1
cached-property==1.5.2
cachetools==5.3.2
collate-sqllineage==1.1.5
croniter==1.3.15
diff_cover==8.0.1
ecdsa==0.18.0
email-validator==2.1.0.post1
exceptiongroup==1.1.3
google==3.0.0
google-auth==2.23.4
greenlet==3.0.1
grpcio==1.59.2
grpcio-tools==1.59.2
idna==2.10
importlib-metadata==6.8.0
iniconfig==2.0.0
isodate==0.6.1
jmespath==1.0.1
memory-profiler==0.61.0
msal==1.25.0
msal-extensions==1.0.0
mypy-extensions==1.0.0
networkx==3.2.1
openmetadata-ingestion==1.2.1.1
packaging==23.2
pathspec==0.11.2
pluggy==1.3.0
portalocker==2.8.2
protobuf==4.25.0
psutil==5.9.6
pydantic==1.10.13
PyMySQL==1.1.0
pytest==7.4.3
python-dateutil==2.8.2
python-jose==3.3.0
PyYAML==6.0.1
regex==2023.10.3
requests-aws4auth==1.2.3
rsa==4.9
s3transfer==0.7.0
soupsieve==2.5
SQLAlchemy==1.4.50
sqlfluff==2.1.4
sqlparse==0.4.3
tabulate==0.9.0
tblib==3.0.0
toml==0.10.2
tomli==2.0.1
tqdm==4.66.1
typing-compat==0.1.0
typing-inspect==0.9.0
typing_extensions==4.5.0

I see that the number of rows per read is being limited. Do we need to clear each batch out of memory before the next read, or something along those lines?
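To illustrate the pattern the question is getting at, here is a minimal sketch of a batched dump that writes each batch out before fetching the next, so that only one batch is held in memory at a time. This is not the OpenMetadata backup code: `dump_table`, `demo`, `BATCH_SIZE`, and the `demo_entity` table are hypothetical names, and the standard-library `sqlite3` module stands in for MySQL so the example runs on its own.

```python
import sqlite3
import tempfile
from pathlib import Path

BATCH_SIZE = 1000  # hypothetical value, not the limit the real backup uses


def dump_table(conn, table, out_path, batch_size=BATCH_SIZE):
    """Write every row of `table` to `out_path`, one batch at a time."""
    cursor = conn.cursor()
    cursor.execute(f'SELECT * FROM "{table}"')
    written = 0
    with open(out_path, "w", encoding="utf-8") as out:
        while True:
            batch = cursor.fetchmany(batch_size)
            if not batch:
                break
            # Write the batch immediately instead of appending it to a
            # list that grows for the whole table.
            for row in batch:
                out.write(repr(row) + "\n")
            written += len(batch)
            # `batch` is rebound on the next iteration, so the rows just
            # written are no longer referenced and can be freed.
    cursor.close()
    return written


def demo():
    """Build a throwaway table and dump it in batches of 1000 rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE demo_entity (id INTEGER PRIMARY KEY, json TEXT)")
    conn.executemany(
        "INSERT INTO demo_entity (json) VALUES (?)", [("{}",)] * 2500
    )
    with tempfile.TemporaryDirectory() as tmp:
        out_path = Path(tmp) / "demo_entity.dump"
        count = dump_table(conn, "demo_entity", out_path)
        lines = out_path.read_text(encoding="utf-8").count("\n")
    conn.close()
    return count, lines
```

One caveat when carrying this over to MySQL: PyMySQL's default cursor buffers the entire result set on the client, so `fetchmany` alone would not bound memory there. An unbuffered cursor such as `pymysql.cursors.SSCursor` would likely be needed. Whether the backup's memory growth actually comes from accumulated rows is an assumption here, not something confirmed in this thread.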

TIA