majintao0131 / yaml-cpp

Automatically exported from code.google.com/p/yaml-cpp
MIT License
0 stars 0 forks source link

Map insertion has O(n^2) performance #174

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Consider the following program:
main()
{
  std::map<int, int> values;
  for(size_t i = 0; i < 8192; i++)
    values[i] = i;
  YAML::Node n;
  n = values;
}
On my machine, this takes roughly 40 seconds of execution time, which is not 
awesome. Granted, yaml may not be the best format for representing this data 
structure, but I don't think 8192 values is so large to be unreasonable.

This is due to the existence check in node::detail::impl.h, in the function  
template<typename Key> inline node& node_data::get(const Key& key, 
shared_memory_holder pMemory) lines 95 - 98 which loops though the entire map 
for each insertion, leading to O(N^2) performance.

Could this be done smarter? Especially in this case, where you are replacing 
the entire node (so you know there should be no value that already exists)

Original issue reported on code.google.com by abels...@gmail.com on 31 Oct 2012 at 5:24

GoogleCodeExporter commented 9 years ago
Interesting, I hadn't thought of this. We could add a Node::force_insert(K, V) 
which bypasses the check.

Original comment by jbe...@gmail.com on 31 Oct 2012 at 8:49

GoogleCodeExporter commented 9 years ago
Done, r880dee1aa3da.

I ran your example, and confirmed the ~40s initially, and now it runs virtually 
instantaneously (I didn't time it).

Original comment by jbe...@gmail.com on 1 Nov 2012 at 12:12

GoogleCodeExporter commented 9 years ago
Awesome! Thank you very much for a great library.

Original comment by abels...@gmail.com on 1 Nov 2012 at 8:13