This huge(!) diff rewrites the GML rendering to give better performance.
Writing 5000 records with 30 fields means some methods got called 150,000 times. Method calls are relatively expensive in Python, so anything that reduced the number of calls showed a clear improvement.
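To make the arithmetic concrete, here is a minimal sketch that counts per-field method calls. The `CountingRenderer` class and its method names are hypothetical and only illustrate the call pattern; they are not the code from this diff.

```python
class CountingRenderer:
    """Hypothetical renderer that counts how often the per-field method runs."""

    def __init__(self):
        self.calls = 0

    def render_field(self, record, name):
        # One Python-level method call per field of every record.
        self.calls += 1
        return f"<{name}>{record[name]}</{name}>"

    def render_record(self, record):
        return "".join(self.render_field(record, name) for name in record)


# 5000 records with 30 fields each, as in the description above.
fields = [f"field{i}" for i in range(30)]
records = [{name: i for name in fields} for i in range(5000)]

renderer = CountingRenderer()
for record in records:
    renderer.render_record(record)

print(renderer.calls)  # 5000 records * 30 fields = 150000
```

Any helper that `render_field` calls in turn multiplies that count again, which is why flattening the call tree pays off.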
A test using lxml as the writer actually turned out to be slower, not faster. This is likely due to the amount of namespace handling and the additional Python logic around the C-API calls that construct the object tree. In this project we already know exactly which strings need to be generated, which is hard to beat with a generic solution.
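As a rough illustration of "knowing exactly which strings need to be generated", the sketch below builds the tag strings once and only escapes the values per record. The field names, the `app` prefix and the `render_record` function are assumptions made for this example, not names taken from the project.

```python
from xml.sax.saxutils import escape

# Hypothetical field names and namespace prefix, for illustration only.
FIELD_NAMES = ["id", "status"]
PREFIX = "app"

# Build the opening and closing tags once, instead of once per record.
TAGS = [
    (name, f"<{PREFIX}:{name}>", f"</{PREFIX}:{name}>")
    for name in FIELD_NAMES
]


def render_record(record: dict) -> str:
    """Render one record by joining the prebuilt tags with escaped values."""
    parts = []
    for name, open_tag, close_tag in TAGS:
        parts.append(open_tag)
        parts.append(escape(str(record[name])))  # only the value varies
        parts.append(close_tag)
    return "".join(parts)


xml = render_record({"id": "0001", "status": "in use & occupied"})
print(xml)
```

A generic XML library has to resolve prefixes and build node objects for every element, while here the per-record work is limited to escaping values.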
The following improvements were made using django-silk profiling to find hotspots:
Using operator.attrgetter() instead of getattr() for value retrieval.
Reorganizing the GML rendering to use fewer method calls, including fewer calls to super().
Reducing isinstance() usage (it had more than 2,260,000 calls).
Writing directly to the buffer, instead of concatenating strings all the time.
Removing the custom buffer class, which was no longer needed.
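The points above can be sketched together in a few lines. This is a simplified illustration under stated assumptions, not the code from this diff: the `Building` record, the `FIELDS` table, the formatter functions and the use of `io.StringIO` as the buffer are all hypothetical stand-ins.

```python
import io
from dataclasses import dataclass
from datetime import date
from operator import attrgetter
from xml.sax.saxutils import escape


@dataclass
class Building:
    """Hypothetical record type, for illustration only."""
    id: str
    built: date
    owner_name: str


def format_date(value):
    return value.isoformat()


def format_text(value):
    return escape(str(value))


# Resolve the getter and the formatter once per field, so the inner loop
# needs no getattr() with a dynamic name and no isinstance() check per value.
FIELDS = [
    ("id", attrgetter("id"), format_text),
    ("built", attrgetter("built"), format_date),
    ("owner", attrgetter("owner_name"), format_text),
]


def write_records(records, output):
    """Write each record straight to the buffer instead of building one big string."""
    write = output.write  # bind the method once, outside the loops
    for record in records:
        write("<record>")
        for tag, getter, fmt in FIELDS:
            write(f"<{tag}>{fmt(getter(record))}</{tag}>")
        write("</record>\n")


output = io.StringIO()
write_records([Building("1", date(1990, 5, 1), "A & B")], output)
result = output.getvalue()
print(result)
```

The design choice is to move every decision that depends only on the field (which attribute to read, how to format it) out of the per-value loop, so the hot path is a handful of plain function calls and buffer writes.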
A performance check with ab -n10 -c1 on BAG panden shows:
For AB#103331