Parse Streaming JSON from large HTTP Response Bodies

ilackarms commented 7 years ago

Response bodies for List objects returned by Kubernetes/OpenShift can be very large. The current implementation of Kubeclient::Client.get_entities loads the entire response body into memory before deserializing it.

This is a proposal to parse the objects in a list from the streaming HTTP response body in chunks, rather than waiting for the entire body to be read into memory. This is feasible because the nesting level of the list items within the JSON body can always be predicted.
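To illustrate the idea, here is a minimal sketch in Ruby using only the standard library. It is not the PoC code. `ListItemStreamer` is a hypothetical helper, and it assumes that the list items are the objects sitting directly inside an array that is itself a direct value of the root object (as with `items` in a Kubernetes list response). It tracks container nesting while chunks arrive, buffers only the item currently being read, and parses each item as soon as its closing brace is seen.

```ruby
require 'json'

# Hypothetical helper, not part of kubeclient. Assumes list items are objects
# located directly inside an array that is a direct value of the root object.
class ListItemStreamer
  ITEM_PARENT = ['{', '['].freeze # root object, then the list array

  def initialize(&on_item)
    @on_item = on_item
    @stack = []        # currently open containers, '{' or '['
    @in_string = false # true while inside a JSON string literal
    @escaped = false   # true right after a backslash inside a string
    @buffer = nil      # raw text of the item currently being read
  end

  # Feed one chunk of the response body; yields each completed item.
  def <<(chunk)
    # Work on bytes so multi-byte characters split across chunks are safe.
    chunk.b.each_char do |ch|
      @buffer << ch if @buffer
      if @in_string
        if @escaped
          @escaped = false
        elsif ch == '\\'
          @escaped = true
        elsif ch == '"'
          @in_string = false
        end
        next
      end
      case ch
      when '"'
        @in_string = true
      when '{', '['
        # An object opening directly inside the list array starts a new item.
        @buffer = ch.dup if ch == '{' && @stack == ITEM_PARENT
        @stack.push(ch)
      when '}', ']'
        @stack.pop
        # Back at the list array level with a buffer: the item is complete.
        if @buffer && @stack == ITEM_PARENT
          @on_item.call(JSON.parse(@buffer.force_encoding(Encoding::UTF_8)))
          @buffer = nil
        end
      end
    end
    self
  end
end

# Simulate a streamed response by feeding the body in 7-character chunks.
body = '{"kind":"NamespaceList","metadata":{"resourceVersion":"1"},"items":[' \
       '{"metadata":{"name":"default"}},{"metadata":{"name":"kube-{system}"}}]}'
names = []
streamer = ListItemStreamer.new { |item| names << item['metadata']['name'] }
body.scan(/.{1,7}/m) { |chunk| streamer << chunk }
```

With a real HTTP client, the chunks would come from the response instead of `scan`. With Net::HTTP, for example, that could be `response.read_body { |chunk| streamer << chunk }` inside a `request` block. Peak memory then depends on the size of the largest single item rather than on the size of the whole list.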

Changes used in the PoC are available here: https://github.com/abonas/kubeclient/compare/master...ilackarms:streamingjson

Testing the memory usage of the modified get_entities against varying quantities of Kubernetes API objects (in this instance, namespaces):

Results measured using memusg and this script

ilackarms commented 7 years ago

@simon3z adding you to the thread

simon3z commented 7 years ago

Thanks @ilackarms !

ilackarms commented 7 years ago

Here are updated measurements based on #254.

@agrare

ManageIQ / kubeclient

Parse Streaming JSON from large HTTP Response Bodies #250