wlav / CPyCppyy

Other
23 stars 20 forks source link

Carry a copy of the string also in the `std::string_view` converter and consider statefulness in `InitializerListConverter` #14

Closed guitargeek closed 6 months ago

guitargeek commented 7 months ago

There are two possible fixes to the problem with string lifetimes and std::string_view arguments:

Copying the string is supposedly faster on average, at least in the ROOT usecase the strings that are passed around are not very long (RDataFrame filter and variable definitions).

Also, the InitializerListConverter is changed such that it creates separate converters for each element if the converters have state.

This PR fixes the following reproducer:

If they have state, a separate converter needs to be created for each
element in the initializer list.

Fixes the following reproducer:

```python
import cppyy

cppyy.cppdef("""
void foo(std::string_view s1, std::string_view s2)
{
   std::cout << s1 << std::endl;
   std::cout << s2 << std::endl;
   std::cout << std::endl;
}

void bar(std::initializer_list<std::string_view> sl)
{
   for (auto const& s : sl) {
      std::cout << s << std::endl;
   }
   std::cout << std::endl;
}
""")

cppyy.gbl.foo("hello", "world")
cppyy.gbl.bar(["hello", "world"])

Closes #13.

guitargeek commented 7 months ago

The same patch is also now part of my syncing PR in ROOT: https://github.com/root-project/root/pull/14507 If the current CI run over there fixes these excessive failures in the DataFrame tutorials, it means it worked.

edit: indeed, the RDataFrame tutorials pass now

guitargeek commented 7 months ago

Also: what is the most appropriate place for a unit test? I guess somewhere here? https://github.com/wlav/cppyy/tree/master/test

wlav commented 5 months ago

This one actually breaks two existing string_view tests, so in the future, it would be useful to run the standard cppyy tests and not just the ROOT ones, which have clearly far less coverage.

The problem is that in SetArg, the assumption is made that the pointed to voidp is an std::string*, but by changing the buffer type in the declaration to std::string while leaving the scope id to std::string_view in the implementation, it can be either std::string* or std::string_view* and no way out to tell which is which. When wrong, then crash.

guitargeek commented 5 months ago

Thank you very much for following up! Sorry, I didn't think of this case. It's more common to use std::string_view as argument types and not pass them around directly, that's why I didn't catch this