certik / yaml-cpp

Automatically exported from code.google.com/p/yaml-cpp
MIT License

RFE: Optimization for char* #177

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Many uses of the API pass const char* strings, typically string literals. 
It is unfortunate that these are wrapped in a std::string simply for 
name lookup! For games using a lot of YAML, this can affect start-up time. It 
would be very nice if there were an optimization that special-cased const char* 
lookups and avoided creating a std::string every time.
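
As a rough illustration of the request, here is a minimal sketch of a lookup that compares a const char* against stored std::string keys in place. `Entry` and `find_value` are hypothetical names made up for this example and are not part of yaml-cpp's API; the only library call relied on is `std::string::compare(const char*)`, which does not construct a temporary std::string.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a map's key/value storage; not yaml-cpp's types.
using Entry = std::pair<std::string, int>;

// Linear lookup that compares the C string against each stored key in place,
// so no temporary std::string is created for the key being looked up.
const int* find_value(const std::vector<Entry>& entries, const char* key) {
    for (const Entry& entry : entries) {
        if (entry.first.compare(key) == 0) {
            return &entry.second;
        }
    }
    return nullptr;  // key not present
}
```

A call such as `find_value(entries, "hp")` would then pass the literal straight through without any allocation for the key.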

Great library though :)

Original issue reported on code.google.com by domuradi...@gmail.com on 27 Nov 2012 at 3:45

GoogleCodeExporter commented 9 years ago
That's not a bad idea, but are you sure that this is the bottleneck? I'm 
guessing that there are two more likely culprits - (a) parsing time and (b) the 
fact that a map lookup is O(n) instead of O(log n).

Also, which version of the API are you using?

Original comment by jbe...@gmail.com on 27 Nov 2012 at 4:12

GoogleCodeExporter commented 9 years ago
Oh? I wasn't aware the map lookups were O(n). How is that so?
And sorry, I must admit I was speaking from conjecture :).

The fact that I use exclusively const char* lookup and it always creates a 
temporary string just irked me a bit. 

The performance is currently good enough for my game, lanarts, but I will 
eventually scale up to a lot more content.

I am using version 0.3.0.

Original comment by domuradi...@gmail.com on 27 Nov 2012 at 1:58

GoogleCodeExporter commented 9 years ago
Since map keys can be any type in YAML, the only sure way to look up a key is 
by checking against every key in the map. However, if you're looking up a 
string key, we could have a shortcut hashmap, so it could even be O(1). I've 
been meaning to do this for a while, but haven't gotten around to it.
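
To make the idea concrete, here is a minimal sketch of a map that keeps all entries in a list (so any key type can be found by a linear scan) plus a secondary hash index covering only the string keys. Everything here is an assumption for illustration: `Key`, `SketchMap`, `find_any`, and `find_string` are hypothetical names, the value type is simplified to int, and the code needs C++17 for `std::variant`. It is not how yaml-cpp stores its maps.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical key type covering a few scalar kinds. Real YAML keys can also
// be sequences or maps, which this sketch leaves out.
using Key = std::variant<std::string, long, double>;

class SketchMap {
public:
    // Assumes each key is inserted at most once.
    void insert(Key key, int value) {
        if (const std::string* s = std::get_if<std::string>(&key)) {
            string_index_[*s] = entries_.size();  // remember where this string key lives
        }
        entries_.emplace_back(std::move(key), value);
    }

    // General lookup: compares against every key, O(n).
    const int* find_any(const Key& key) const {
        for (const auto& entry : entries_) {
            if (entry.first == key) return &entry.second;
        }
        return nullptr;
    }

    // String lookup: goes through the hash index, O(1) on average.
    const int* find_string(const std::string& key) const {
        auto it = string_index_.find(key);
        return it == string_index_.end() ? nullptr : &entries_[it->second].second;
    }

private:
    std::vector<std::pair<Key, int>> entries_;
    std::unordered_map<std::string, std::size_t> string_index_;
};
```

Non-string keys still fall back to the linear scan in `find_any`, while the common case of string keys only touches the hash index.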

In general, creating a temporary string usually isn't a big deal, unless you're 
doing it a ton. If your string literal is less than 16 characters, then most 
compilers won't even allocate the new string on the heap, so it's relatively 
cheap.
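
Whether a short string avoids the heap depends on the standard library implementation, so it may be worth checking on the toolchain in use. The following is a rough sketch of one way to do that: `stored_inline` is a hypothetical helper that tests whether the character buffer lies inside the std::string object itself. It relies on comparing addresses as integers, which works on common platforms but is not guaranteed by the standard.

```cpp
#include <cstdint>
#include <string>

// Returns true if the string's character buffer lies inside the std::string
// object itself, which is what a short-string optimization would do.
bool stored_inline(const std::string& s) {
    const auto begin = reinterpret_cast<std::uintptr_t>(&s);
    const auto end = begin + sizeof(std::string);
    const auto data = reinterpret_cast<std::uintptr_t>(s.data());
    return data >= begin && data < end;
}
```

On an implementation with a short-string optimization, `stored_inline(std::string("hp"))` would be expected to return true; with a copy-on-write implementation that always keeps its buffer outside the object, it would return false.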

Original comment by jbe...@gmail.com on 27 Nov 2012 at 2:13

GoogleCodeExporter commented 9 years ago
If you use GCC (as I do), then the strings are copy-on-write, and don't use the 
short-string optimization. (Although almost all compilers *will* be forced to 
drop COW for C++11, GCC is generally known for its use of allocated COW 
strings).

I had no clue keys could be any type in YAML :) A hashmap with all string keys 
+ a different representation for all non-string keys would be great (I would 
reckon 99% of cases would only ever hit the hashmap).

Thanks for the consideration.

Original comment by domuradi...@gmail.com on 27 Nov 2012 at 2:26

GoogleCodeExporter commented 9 years ago
Interesting, I didn't know that gcc still uses COW.

I think this issue does make sense, so I'll look into it. I also opened Issue 
178, regarding a secondary map for string keys, in case you want to follow that.

Original comment by jbe...@gmail.com on 28 Nov 2012 at 12:20