Expected Behavior
The memory used by pyff is properly freed up after a request finishes.
Current Behavior
Each request that ends in an HTTP 500 error increases memory usage by roughly 300 MB, and that memory is not released afterwards.
Possible Solution
To alleviate the issue, the parsed tree needs to be cleared explicitly, as shown in the diff below.
diff --git i/src/pyff/api.py w/src/pyff/api.py
index 1050efb..2f17438 100644
--- i/src/pyff/api.py
+++ w/src/pyff/api.py
@@ -4,6 +4,7 @@ from datetime import datetime, timedelta
 from json import dumps
 from typing import Any, Dict, Generator, Iterable, List, Mapping, Optional, Tuple

+import lxml.etree
 import pkg_resources
 import pyramid.httpexceptions as exc
 import pytz
@@ -297,12 +298,18 @@ def process_handler(request: Request) -> Response:
     except ResourceException as ex:
         import traceback

+        if isinstance(r, (lxml.etree._Element, lxml.etree._ElementTree)):
+            r.clear()
+
         log.debug(traceback.format_exc())
         log.warning(f'Exception from processing pipeline: {ex}')
         raise exc.exception_response(409)
     except BaseException as ex:
         import traceback

+        if isinstance(r, (lxml.etree._Element, lxml.etree._ElementTree)):
+            r.clear()
+
         log.debug(traceback.format_exc())
         log.error(f'Exception from processing pipeline: {ex}')
         raise exc.exception_response(500)
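One detail worth checking in the diff above: in lxml, `clear()` is defined on elements, not on element trees, so the `r.clear()` call would fail when `r` is an `_ElementTree`. `r` may also be unbound if it is not initialised before the `try` block. A minimal sketch of a more defensive cleanup helper follows. The name `release_tree` is hypothetical and not part of pyff, and the import falls back to the standard library's ElementTree only so that the sketch runs without lxml installed.

```python
try:
    from lxml import etree  # the library pyff uses
except ImportError:  # fallback only so this sketch runs without lxml
    import xml.etree.ElementTree as etree


def release_tree(obj):
    """Best-effort release of a parsed XML tree.

    Hypothetical helper, not part of pyff. Returns True if an element
    was cleared, False if there was nothing suitable to clear.
    """
    if obj is None:
        return False
    # Tree objects expose getroot() but no clear(); elements expose clear().
    getroot = getattr(obj, "getroot", None)
    root = getroot() if callable(getroot) else obj
    # Only touch objects that look like XML elements, so that dicts, lists
    # or strings returned by a pipeline are left alone.
    if root is None or not hasattr(root, "tag"):
        return False
    root.clear()  # drops children, attributes and text of the root element
    return True


# Usage: initialise the result before the try block so that the cleanup
# code never refers to an unbound name.
result = None
try:
    result = etree.ElementTree(etree.fromstring("<md><e id='1'/><e id='2'/></md>"))
    raise RuntimeError("simulated pipeline failure")
except RuntimeError:
    cleared = release_tree(result)
    result = None  # drop the last reference as well
```

In `process_handler` the same call could be made from both `except` branches, or once from a `finally` block, assuming the pipeline result is assigned to a name that is initialised to `None` beforehand.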
Steps to Reproduce
In our case the XML files stored under tmp/dynamic total about 50 MB. pyff parses them with lxml into an in-memory tree, which appears to account for the high memory usage. Each request increases memory by roughly 300 MB, and that memory is not freed afterwards.
To reproduce the issue use the following pipeline file:
Code Version
2.1.2
Run pyff with caching disabled:
And run the following:
High memory consumption is most likely related to lxml not freeing up the memory properly.
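To check this hypothesis outside pyff, resident memory can be sampled before parsing, after parsing, and after releasing the tree. The following is a rough sketch, assuming Linux, since it reads `/proc/self/statm`. The helper names `rss_bytes` and `sample_parse_cycle` and the synthetic document are illustrative and not taken from pyff. `tracemalloc` is deliberately not used, because allocations made by libxml2 are generally not visible to it.

```python
import gc
import os

try:
    from lxml import etree  # the library pyff uses
except ImportError:  # fallback only so this sketch runs without lxml
    import xml.etree.ElementTree as etree


def rss_bytes():
    """Current resident set size in bytes, or None where /proc is unavailable."""
    try:
        with open("/proc/self/statm") as fh:
            resident_pages = int(fh.read().split()[1])
    except (OSError, IndexError, ValueError):
        return None
    return resident_pages * os.sysconf("SC_PAGE_SIZE")


def sample_parse_cycle(n_entities=20000):
    """Parse a synthetic document, release it, and return RSS samples."""
    body = "".join(f"<e id='{i}'><n>entity {i}</n></e>" for i in range(n_entities))
    xml = "<md>" + body + "</md>"

    before = rss_bytes()
    root = etree.fromstring(xml)
    count = len(root)  # number of direct children that were parsed
    after_parse = rss_bytes()

    root.clear()  # explicit release, as proposed in the diff
    del root
    gc.collect()
    after_release = rss_bytes()

    return {
        "entities": count,
        "before": before,
        "after_parse": after_parse,
        "after_release": after_release,
    }


samples = sample_parse_cycle()
print(samples)
```

If `after_release` stays close to `after_parse` over repeated calls and keeps growing, memory is being retained. A one-off gap is less conclusive, since the C allocator may hold on to freed pages instead of returning them to the operating system.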