mavak / trucov

True coverage tool for C / C++

Double performance by using boost.filesystem. #162

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Tru_utility does lots of string copying and conversion between std::string
and boost::filesystem::path.

A careful refactor of the code dealing with paths could double performance.

In particular Tru_utility::get_filename eats a few cycles.

Original issue reported on code.google.com by j.nick.terry@gmail.com on 2 Mar 2010 at 2:51

GoogleCodeExporter commented 9 years ago
Note: It will probably only double the performance of the code that converts
strings to file paths; it probably will not double overall performance.
Nonetheless, this should be looked into.

Yes, I am back..

Original comment by millerlyte87@gmail.com on 12 May 2010 at 5:46

GoogleCodeExporter commented 9 years ago
Great to see you back.

You're right, I only meant to say it would double the performance of the
pathing code, not the overall execution time. However, the pathing code does
eat up quite a bit of our CPU time. In particular, we spend 25% of the time in
malloc and new, and half of that time is coming from std::string functions.

Original comment by j.nick.terry@gmail.com on 14 May 2010 at 5:56

GoogleCodeExporter commented 9 years ago
I think that the boost filesystem is only half the battle. After looking at
some of the code, I've determined the following:

Spirit Parser
  a) String primitive
    1) Creates a string and calls push_back every time a character is read for the string (very slow).
    2) Assigns the global string from the primitive to a string object in the grammar object (one string copy, but the destination string is reused, so no further allocation is required; fast).
    3) Then passes the string in the grammar object by reference to the parser builder (fast).
Parser Builder
  b) store_record
    1) Is passed the strings from Spirit, demangles them, and then reassigns them to
       an internal string (one internal string allocation, slow).
    2) Strings for paths are then stored in the data structure, to be used later with the utility methods for paths, which often create boost::filesystem paths or pass by value (very very slow).

1st Half)
The Boost Spirit framework uses the string primitive to parse strings. It
starts by creating a temporary string and pushing the characters one at a
time. Thus, if the average string size is, say, 20 characters, the internal
string buffer will resize 3 or 4 times. Every string read from the gcno and
gcda files must go through this process.
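
To check the resize theory without a profiler, a small sketch like the one below can count how often a std::string changes capacity while it is filled one character at a time. The function name count_regrowths and the reserve_first switch are made up for illustration; the actual number of regrowths depends on the standard library in use.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts how many times the capacity of a std::string changes while `text`
// is appended one character at a time. `reserve_first` (hypothetical switch)
// reserves the full length up front so the two cases can be compared.
std::size_t count_regrowths(const std::string& text, bool reserve_first) {
    std::string out;
    if (reserve_first) {
        out.reserve(text.size());  // at most one allocation, done up front
    }
    std::size_t regrowths = 0;
    std::size_t last_capacity = out.capacity();
    for (char c : text) {
        out.push_back(c);
        if (out.capacity() != last_capacity) {  // the buffer was reallocated
            ++regrowths;
            last_capacity = out.capacity();
        }
    }
    assert(out == text);  // sanity check: nothing was lost along the way
    return regrowths;
}
```

Calling count_regrowths on typical gcno/gcda strings with reserve_first set to false and then true would show how many reallocations a single reserve call saves.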

1st Solution) Create a memory pool system where a list of free char buffers
is allocated at startup, and more are allocated in large groups when needed.
Everything read from the Spirit string primitive will use these pooled buffers
instead of strings.
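
A minimal sketch of such a pool, assuming fixed-size buffers, could look like this. CharBufferPool and all of its members are hypothetical names, not existing trucov code, and the sketch ignores strings longer than one buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical pool of fixed-size char buffers. Memory is requested from the
// heap one chunk at a time; each chunk is carved into several buffers.
class CharBufferPool {
public:
    CharBufferPool(std::size_t buffer_size, std::size_t buffers_per_chunk)
        : buffer_size_(buffer_size), buffers_per_chunk_(buffers_per_chunk) {
        assert(buffer_size > 0 && buffers_per_chunk > 0);
        grow();  // allocate the first group at startup
    }

    // Hands out a free buffer, allocating another chunk if none is left.
    char* acquire() {
        if (free_.empty()) grow();
        char* buffer = free_.back();
        free_.pop_back();
        return buffer;
    }

    // Puts a buffer obtained from acquire() back on the free list.
    void release(char* buffer) { free_.push_back(buffer); }

    std::size_t buffer_size() const { return buffer_size_; }
    std::size_t chunk_count() const { return chunks_.size(); }
    std::size_t free_count() const { return free_.size(); }

private:
    // One heap allocation serves buffers_per_chunk_ buffers.
    void grow() {
        chunks_.push_back(std::unique_ptr<char[]>(
            new char[buffer_size_ * buffers_per_chunk_]));
        char* base = chunks_.back().get();
        for (std::size_t i = 0; i < buffers_per_chunk_; ++i) {
            free_.push_back(base + i * buffer_size_);
        }
    }

    std::size_t buffer_size_;
    std::size_t buffers_per_chunk_;
    std::vector<std::unique_ptr<char[]>> chunks_;  // owns the memory
    std::vector<char*> free_;                      // buffers ready for reuse
};
```

The chunks are released when the pool is destroyed, so individual buffers only need to be handed back with release() when they should be reused earlier.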

2nd Half)
The more obvious half is the filesystem enhancement. There is way too much
string copying going on after paths have been stored in the data structure.

2nd Solution) In the ParserBuilder class, after the 1st solution has been 
implemented, Spirit will pass char buffers to us. From there we can allocate 
the Boost Filesystem paths directly from the char buffers and all the later 
string copies will become unnecessary. Then of course, remember to free the 
pooled buffers :)
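
Roughly, the idea is to build the path once from the char buffer and hand it around by const reference afterwards. The sketch below uses std::filesystem (C++17) as a stand-in for boost::filesystem so that it compiles without Boost; SourceRecord, make_record and filename_of are hypothetical names, not the real ParserBuilder interface.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;  // stand-in for boost::filesystem here

// Hypothetical record that stores the path object itself, not a std::string.
struct SourceRecord {
    fs::path source_path;
};

// Builds the path directly from the char buffer handed over by the parser,
// so no intermediate std::string is created by this code.
SourceRecord make_record(const char* buffer) {
    assert(buffer != nullptr);
    return SourceRecord{fs::path(buffer)};
}

// Later users take the record by const reference, so the stored path is
// not copied on each call; only the extracted file name is returned.
std::string filename_of(const SourceRecord& record) {
    return record.source_path.filename().string();
}
```

With this shape the conversion from characters to a path happens exactly once per record, which is the part the later string copies were repeating.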

--
I don't have the time profiler output like you, Nick. The 1st half is only a
theory; do you see any evidence from the profiler that supports this claim?
I.e., if you see a lot of time in string.push_back(), then the 1st claim is
likely to be true. I would like to know before starting.

Original comment by millerlyte87@gmail.com on 13 Sep 2010 at 12:48

GoogleCodeExporter commented 9 years ago

Original comment by millerlyte87@gmail.com on 13 Sep 2010 at 1:16

GoogleCodeExporter commented 9 years ago
I think that push_back on the std::string shouldn't be an issue. I'm not 100%
sure, but I think gcc uses the small-string optimization, which makes strings
under, say, 24 characters super cheap.
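
One way to settle the small-string question for a given toolchain is to test whether a string's characters live inside the string object itself. This is only a sketch: stored_inline is a made-up helper, and the result for short strings depends on the standard library and its version.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Returns true if the character data of `s` is stored inside the
// std::string object itself (small-string optimization), false if it
// lives in a separate buffer.
bool stored_inline(const std::string& s) {
    const char* object_begin = reinterpret_cast<const char*>(&s);
    const char* object_end = object_begin + sizeof(std::string);
    const char* data = s.data();
    assert(data != nullptr);  // data() always points at a valid buffer
    // std::less gives a well-defined ordering even for unrelated pointers.
    std::less<const char*> before;
    return !before(data, object_begin) && before(data, object_end);
}
```

If stored_inline reports false for a typical 20 character path on the compiler used to build trucov, the strings are heap allocated and the push_back theory deserves a closer look.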

I'll reprofile trucov when I get a chance (I'm too busy at the moment), but
from what I can remember the largest share of the time (~12%) is spent
allocating memory for std::string's copy constructor.
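
Until the profiler run is redone, the copy cost can be approximated with a counting allocator. The sketch below is an assumption-laden illustration: CountingAllocator, CountedString and the helper functions are hypothetical, and how many allocations a copy triggers depends on the std::string implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

static std::size_t g_allocations = 0;  // bumped on every allocate() call

// Minimal allocator that counts allocations and otherwise defers to
// std::allocator.
template <class T>
struct CountingAllocator {
    using value_type = T;
    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) {}
    T* allocate(std::size_t n) {
        ++g_allocations;
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n) {
        std::allocator<T>().deallocate(p, n);
    }
};
template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) {
    return true;
}
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) {
    return false;
}

using CountedString =
    std::basic_string<char, std::char_traits<char>, CountingAllocator<char>>;

// Two stand-ins for a path utility: one copies its argument, one does not.
std::size_t length_by_value(CountedString s) { return s.size(); }
std::size_t length_by_ref(const CountedString& s) { return s.size(); }

// Number of allocations caused by `calls` pass-by-value calls.
std::size_t allocations_by_value(const CountedString& path, int calls) {
    assert(calls >= 0);
    const std::size_t before = g_allocations;
    for (int i = 0; i < calls; ++i) length_by_value(path);
    return g_allocations - before;
}

// Number of allocations caused by `calls` pass-by-reference calls.
std::size_t allocations_by_ref(const CountedString& path, int calls) {
    assert(calls >= 0);
    const std::size_t before = g_allocations;
    for (int i = 0; i < calls; ++i) length_by_ref(path);
    return g_allocations - before;
}
```

Comparing the two counters for a path-sized string gives a quick estimate of how many allocations the pass-by-value utility methods add per call.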

Original comment by j.nick.terry@gmail.com on 15 Sep 2010 at 3:01