suds-community / suds

Suds is a lightweight SOAP python client for consuming Web Services. A community fork of the jurko fork.
https://suds.readthedocs.io/
GNU Lesser General Public License v3.0

Performance on large wsdls #10

Open phillbaker opened 5 years ago

phillbaker commented 5 years ago

Originally opened at https://bitbucket.org/jurko/suds/issues/9/profiling-suds. It seems that suds' performance can be improved when loading large WSDLs.

Some examples of work to this end are:

chris-griffin commented 5 years ago

https://github.com/liboz/suds-lxml as well

ovnicraft commented 5 years ago

Of all the repos mentioned, what's the best option?

phillbaker commented 5 years ago

@ovnicraft I pulled several of these commits into a branch here: https://github.com/suds-community/suds/tree/perf-wsdls

However, during testing I didn't notice any dramatic improvements - would love some help evaluating and testing.

phillbaker commented 5 years ago

To add a bit more context, most of those commits seem related to performance in the request/response cycle. The main issue we're seeing is on client boot.

There isn't much documentation on the differences between the `cachingpolicy` values (introduced in https://github.com/suds-community/suds/commit/b3d6d183ddca74ee6e07d1c6d51cf88554bc540c). However, here is some basic tracing of client boot time (label, then elapsed seconds):

* With `cachingpolicy=0`

    c = Client(url,cachingpolicy=0) # warmed cache
    documents.parsed 6.985664367675781e-05
    open_imports 7.152557373046875e-06
    resolve 0.047647953033447266
    documents.parsed 3.910064697265625e-05
    documents.parsed 4.1961669921875e-05
    documents.parsed 3.695487976074219e-05
    documents.parsed 2.9087066650390625e-05
    build_schema 9.905397891998291
    set_wrapped 0.1327660083770752
    add_methods 0.039650917053222656
    self.fn 13.933005094528198
    Factory 3.600120544433594e-05
    ServiceSelector 6.198883056640625e-06
    ServiceDefinition 0.2581620216369629


* With `cachingpolicy=1`

    c = SudsClient(url,cachingpolicy=1) # first load
    document.loaded 0.0001628398895263672
    sax.parse 1.9437551498413086
    documents.parsed 9.799003601074219e-05
    open_imports 2.86102294921875e-06
    resolve 0.024981021881103516
    document.loaded 4.696846008300781e-05
    sax.parse 0.012939929962158203
    documents.parsed 2.6941299438476562e-05
    document.loaded 3.886222839355469e-05
    sax.parse 3.364743947982788
    documents.parsed 5.0067901611328125e-05
    document.loaded 5.698204040527344e-05
    sax.parse 0.9885931015014648
    documents.parsed 4.291534423828125e-05
    document.loaded 5.3882598876953125e-05
    sax.parse 0.013521909713745117
    documents.parsed 2.7894973754882812e-05
    build_schema 14.720332860946655
    set_wrapped 0.13639307022094727
    add_methods 0.03350090980529785
    self.fn 23.030166149139404
    Factory 5.602836608886719e-05
    ServiceSelector 5.0067901611328125e-06
    ServiceDefinition 0.21198701858520508

    c = Client(url,cachingpolicy=1) # warmed cache
    cache.get 8.078610181808472
    Factory 2.7894973754882812e-05
    ServiceSelector 5.0067901611328125e-06
    ServiceDefinition 0.24859380722045898
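For anyone who wants to collect traces in the same "label, elapsed seconds" shape, here is a minimal sketch of a timing helper. It is not the instrumentation that produced the numbers above; `traced` and the `sink` parameter are hypothetical names chosen for illustration, and the `time.sleep` call only stands in for whichever suds step is being measured.

```python
import time
from contextlib import contextmanager


@contextmanager
def traced(label, sink=print):
    """Report `label` and the wall-clock seconds spent inside the block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Emit even if the wrapped step raises, so partial traces are kept.
        sink("%s %s" % (label, time.perf_counter() - start))


# Hypothetical usage: collect lines in a list instead of printing them.
records = []
with traced("sleep", sink=records.append):
    time.sleep(0.01)  # stand-in for the step being measured

for line in records:
    print(line)
```

Wrapping each stage in its own `with traced(...)` block keeps the measurement code out of the measured code path, which makes it easier to remove afterwards.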


What's surprising is that bumping the pickle protocol version doesn't seem to have an effect. It appears to be set to Python 2's maximum value (protocol 2):

https://github.com/suds-community/suds/blob/6fb0a829337b5037a66c20aae6f89b41acd77e40/suds/cache.py#L312

On Python 3 using `pickle.HIGHEST_PROTOCOL`, with a warmed cache, this is a typical load:

    c = Client(url, cachingpolicy=1)
    cache.get 8.570900917053223
    Factory 3.1948089599609375e-05
    ServiceSelector 6.9141387939453125e-06
    ServiceDefinition 0.23836612701416016
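To check the protocol question in isolation, something like the following sketch could be used. It compares unpickling time for protocol 2 and `pickle.HIGHEST_PROTOCOL` on a synthetic object graph; `time_unpickle` and `sample` are hypothetical names, and the data is only a rough stand-in for a parsed WSDL, so results on real suds objects may well differ.

```python
import pickle
import time


def time_unpickle(obj, protocol, repeats=3):
    """Return the best-of-`repeats` seconds needed to unpickle `obj`."""
    data = pickle.dumps(obj, protocol=protocol)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        pickle.loads(data)
        best = min(best, time.perf_counter() - start)
    return best


# Synthetic stand-in (an assumption): many small nested containers.
sample = [
    {"name": "element%d" % i, "children": [str(j) for j in range(5)]}
    for i in range(20000)
]

timings = {p: time_unpickle(sample, p) for p in (2, pickle.HIGHEST_PROTOCOL)}
for protocol, seconds in timings.items():
    print("protocol %d: %.4f s" % (protocol, seconds))
```

If the two timings come out close, that would be consistent with the observation that the protocol version is not the bottleneck here.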


However, reducing garbage collection during the unpickling did seem to have a positive effect, dramatically reducing load time, along the lines of:

    import gc

    # disable garbage collector
    gc.disable()
    cache.get(...)
    # enable garbage collector again
    gc.enable()
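A slightly more defensive variant of the same idea, as a sketch: wrap the pause in a context manager so the collector is restored even if unpickling raises, and only re-enabled if it was enabled to begin with. `gc_paused` is a hypothetical helper name, and `pickle.loads` on a small payload stands in for the real `cache.get(...)` call.

```python
import gc
import pickle
from contextlib import contextmanager


@contextmanager
def gc_paused():
    """Disable the cyclic garbage collector inside the block, then restore it."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        # Restore the previous state rather than enabling unconditionally.
        if was_enabled:
            gc.enable()


# Stand-in for the cached WSDL object; the real call would be cache.get(...).
payload = pickle.dumps([{"id": i} for i in range(1000)], protocol=2)
with gc_paused():
    restored = pickle.loads(payload)
```

Only the cyclic collector is paused; reference counting still frees objects as usual, so the pause mainly avoids repeated collection passes while a large number of container objects is being allocated.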