GothicKit / ZenKit

A re-implementation of file formats used by the early 2000's ZenGin
http://zk.gothickit.dev/
MIT License
44 stars 10 forks source link

Dialogs parsing performance #28

Closed Try closed 1 year ago

Try commented 1 year ago

GameScript::loadDialogOU() become noticeably slower with phenix backend.

I haven't done proper profiling, but issue seem to be rooted in large amount of small allocations, while population hash-map. Lazy profiling(debug break a few times) shows, that this place can be an issue:

    bool archive_reader_binsafe::read_object_begin(archive_object& obj) {
        ...
        std::stringstream ss {line.substr(1, line.size() - 2)}; // substr creates a copy; and well stringstream known to be slow
        ss >> obj.object_name >> obj.class_name >> obj.version >> obj.index;
        return true;
    }
lmichaelis commented 1 year ago

Hm, that would mean that you should have observed slower load times everywhere, especially when loading the VOb tree. There might be optimisations that can be done in messages though.

Try commented 1 year ago

I've made a few experiment locally:

  1. removing std::stringstream - generally nice, but not critical improvement.

  2. replacing messages::_m_blocks_by_name to sort + binary search - work quite well

    
    // messages messages::parse(buffer& buf)
    ...
    std::sort(msgs.blocks.begin(), msgs.blocks.end(), [](const message_block& l, const message_block& r){
      return l.name<r.name;
      });
    return msgs;
    }

const message_block messages::block_by_name(std::string_view name) const { auto it = std::lower_bound(blocks.begin(), blocks.end(), name, [](const message_block& l, std::string_view name) { return l.name < name; }); if (it!=blocks.end()) return &it; return nullptr; }

lmichaelis commented 1 year ago

It certainly makes sense that not allocating potentially thousands of small std::strings would make loading faster. Are you sure that in-game performance is actually acceptable using binary search instead of std::unordered_map's O(1) access times?

Try commented 1 year ago

Yes, I've tested this solution locally. Also fetching block_by_name relatively rare case (roughly a 1 call per 3 seconds). Worst what can happen is spike at search call.

lmichaelis commented 1 year ago

Implemented :)