bobfang1992 / pytomlpp

A python wrapper for tomlplusplus
https://bobfang1992.github.io/pytomlpp/pytomlpp.html
MIT License
86 stars 11 forks source link

Performance advice #32

Closed bobfang1992 closed 4 years ago

bobfang1992 commented 4 years ago

Well if you have to do the profiling python-side I'm not sure how you could make it less intrusive, really. What you have here is going to be negligible when contrasted with the cost of the pybind11 machinery anyways, so I wouldn't worry about that too much.

RE performance of the conversion code itself, there's a few things I'd look at:

Copying objects during iteration

https://github.com/bobfang1992/pytomlpp/blob/fb56a985b5c720d6a2422fa4f54e0abcb792a13e/include/pytomlpp/pytomlpp.hpp#L158

What is the type of it here? Is it a trivial type like a pointer, or some object? If it's a non-trivial object, this is doing a copy on every iteration, which I imagine might have some impact given pybind's reference-counting semantics etc. Consider for (auto& it : list) instead.

Copying key-value pairs during iteration

https://github.com/bobfang1992/pytomlpp/blob/fb56a985b5c720d6a2422fa4f54e0abcb792a13e/include/pytomlpp/pytomlpp.hpp#L207-L209

Same as above, only this time you're not only potentially copying the KVP object itself, but the key and value objects themselves, and then explicitly copying them again immediately after to give them names. Given that this is C++17 you can actually solve that problem and clean up the syntax by using a structured binding instead, i.e. for (auto& [key, value] : object)

Copying when moving is an option

https://github.com/bobfang1992/pytomlpp/blob/fb56a985b5c720d6a2422fa4f54e0abcb792a13e/include/pytomlpp/pytomlpp.hpp#L104-L107

The code creates a const string by copying from the toml data, then copies again in the assignment. You could eliminate one of those copies by doing a std::move:

      const toml::value<std::string> *string_value = value->as_string();
      std::string string_v = string_value->get(); //non-const
      result[key] = std::move(string_v);

(if you made the function take the input as toml::table&& you could even skip the copy entirely and only do a move from the source table, though I don't know how/where toml_table_to_py_dict is called to comment on if that's an option for you)

Use visit and/or switch

More generally, you can restructure those functions to use a switch on the value type instead of calling value->type() repeatedly in an if/else, or even simpler, refactor it to use toml::node::visit() instead, which handles all the switch machinery for you internally (example here).

Originally posted by @marzer in https://github.com/bobfang1992/pytomlpp/pull/31#issuecomment-649464135

marzer commented 4 years ago

If I get some time I'd consider doing some of this for you, if you like. I don't know much about the pybind/pip workflow, though... How do you test during iteration? Do you just repeatedly invoke pip install . in between running scripts that import pytomlpp?

bobfang1992 commented 4 years ago

Hi, sorry I was a bit busy (just started a new job). Yeah essentially that's what I do. I normally do pip install --force . and then run pytest. @EpicWink what's your workflow?

EpicWink commented 4 years ago

For the C extension modules, that will need to be compiled each time. However for the Python code, if you install the package as editable (using pip install -e .), the updated code will automatically be used the next time you import the package in Python.

As PyTomlPP builds the package using setuptools, the easiest way to build (and re-build) the extension modules is to call python setup.py build_ext --inplace (only if you install the package as editable).

In all cases, I'd recommend installing into a virtual environment using virtualenv. You can create a new virtual environment and install the package normally (ie without -e) for clean tests, but I prefer to let CI do all the clean testing (via PRs or enabling CI on your current branch).

After that, yeah running pytest is straightforward. I usually enable full verbosity -vv, all reporting -ra, and report the slowest 5 tests --durations=5; these can be added to your environment configuration.

marzer commented 4 years ago

So, if I understand you correctly @EpicWink, it would look like this:

Before I begin:

pip install -e .

After I make changes in C++:

python setup.py build_ext --inplace

After I make changes in python:

(nothing explicit;
simply running a script with 'import pytomlpp` will automatically use the new code)
EpicWink commented 4 years ago

@marzer that's right

marzer commented 4 years ago

Going to have a play around with this today.

bobfang1992 commented 4 years ago

Thanks!