Many code paths never use the EXTRA_DATA and EXTRA_ELEMENT data that Ganglia returns, so stripping it speeds up XML parsing by about 50% on large data sets.
In production, gmetad was returning 26MB of XML data for the everything path. Filtering out this subset of data cut average parsing time from 2.1s to 1.1s, a nearly 50% improvement.
Inspecting the start_* functions shows whether they use the EXTRA_DATA information at all. Most of them do not, so the information can be safely stripped for those code paths.
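
As a rough illustration of the idea (not the filter used in this change), the sketch below removes every EXTRA_DATA block from the XML text before it is handed to a parser. The function name `strip_extra_data`, the regular expression, and the sample document are assumptions made for this example; the sample is only a simplified imitation of gmetad output.

```python
import re
import xml.etree.ElementTree as ET

# Matches a whole <EXTRA_DATA>...</EXTRA_DATA> block, including the
# EXTRA_ELEMENT tags nested inside it and any trailing whitespace.
EXTRA_DATA_RE = re.compile(r"<EXTRA_DATA>.*?</EXTRA_DATA>\s*", re.DOTALL)


def strip_extra_data(xml_text: str) -> str:
    """Return xml_text with every EXTRA_DATA block removed (hypothetical helper)."""
    return EXTRA_DATA_RE.sub("", xml_text)


# Simplified, made-up sample in the general shape of gmetad output.
SAMPLE = """<GANGLIA_XML VERSION="3.1.7" SOURCE="gmetad">
<HOST NAME="web01">
<METRIC NAME="load_one" VAL="0.10" TYPE="float">
<EXTRA_DATA>
<EXTRA_ELEMENT NAME="GROUP" VAL="load"/>
<EXTRA_ELEMENT NAME="TITLE" VAL="One Minute Load Average"/>
</EXTRA_DATA>
</METRIC>
</HOST>
</GANGLIA_XML>"""

stripped = strip_extra_data(SAMPLE)

# The stripped text is still well-formed XML and keeps the metric itself.
root = ET.fromstring(stripped)
metric = root.find("./HOST/METRIC")
print(metric.get("NAME"), metric.get("VAL"))
```

Filtering the raw text keeps the parser from ever seeing the unused elements, which is where the saving would come from. A code path whose start_* handler does read EXTRA_DATA would need to skip this step.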