pybind / pybind11

Seamless operability between C++11 and Python
https://pybind11.readthedocs.io/
Other
15.1k stars 2.05k forks source link

fix: fully qualify usages of concat to protect against ADL #4955

Closed volkm closed 3 months ago

volkm commented 7 months ago

Description

Replaced calls to concat by using complete namespace pybind11::detail::concat in file cast.h.

We encountered issues in our library, because the concat function from modernjson was erroneously used instead of the correct one from pybind11:

~/stormpy/stormpy/build/temp.macosx-14-arm64-cpython-311/_deps/pybind11-src/include/pybind11/cast.h:1402:39: note: non-constexpr function 'concat<std::string, pybind11::detail::descr<3, nlohmann::basic_json<>>>' cannot be used in a constant expression
    static constexpr auto arg_names = concat(type_descr(make_caster<Args>::name)...);
                                      ^
~/storm/resources/3rdparty/modernjson/include/nlohmann/detail/string_concat.hpp:137:22: note: declared here
inline OutStringType concat(Args && ... args)

Adding the namespace in pybind fixed this issue.

See this issue for details.

Suggested changelog entry:

* Qualify ``py::detail::concat`` usage to avoid ADL selecting one from somewhere else, such as modernjson's concat.
rwgk commented 7 months ago

That looks like an easy fix, but

grep -r 'concat('
pybind11/cast.h:        = const_name("tuple[") + concat(make_caster<Ts>::name...) + const_name("]");
pybind11/cast.h:    static constexpr auto arg_names = concat(type_descr(make_caster<Args>::name)...);
pybind11/numpy.h:        concat(const_name<N>(), array_info<T>::extents), const_name<N>());
pybind11/eigen/tensor.h:        static constexpr auto value = concat(const_name(((void) Is, "?"))...);
pybind11/eigen/tensor.h:    static constexpr auto dimensions_descriptor = concat(const_name<Indices>()...);
pybind11/functional.h:                         const_name("Callable[[") + concat(make_caster<Args>::name...)
pybind11/stl.h:                         const_name("Union[") + detail::concat(make_caster<Ts>::name...)
pybind11/detail/descr.h:constexpr descr<0> concat() { return {}; }
pybind11/detail/descr.h:constexpr descr<N, Ts...> concat(const descr<N, Ts...> &descr) {
pybind11/detail/descr.h:constexpr auto concat(const descr<N, Ts...> &d, const Args &...args) {
pybind11/detail/descr.h:constexpr auto concat(const descr<N, Ts...> &d, const Args &...args)
pybind11/detail/descr.h:    -> decltype(std::declval<descr<N + 2, Ts...>>() + concat(args...)) {
pybind11/detail/descr.h:    return d + const_name(", ") + concat(args...);
pybind11/typing.h:        = const_name("tuple[") + concat(make_caster<Types>::name...) + const_name("]");
pybind11/typing.h:    static constexpr auto name = const_name("Callable[[") + concat(make_caster<Args>::name...)

I believe depending on what is used, any of those could run into a conflict with modernjson. Which leads me to think we should insert the pybind11::detail:: prefix everywhere.

But

I also briefly looked at the modernjson code:

https://github.com/nlohmann/json/blob/9cca280a4d0ccf0c08f47a99aa71d1b0e52f8d03/include/nlohmann/detail/string_concat.hpp#L141

I see all but one concat_into() there are guarded already by enable_if. It looks likely that it is feasible to also guard concat() there with enable_if, which would be more environmentally friendly in general. Was that discussed already?

volkm commented 7 months ago

I wanted to be sure that this is actually something worthwhile to add before making the changes everywhere. I can of course add the prefix to all occurrences of concat.

I did not discuss this issue yet in the modernjson project. It could indeed be an option to add the enable_if for concat in modernjson instead.

rwgk commented 7 months ago

I did not discuss this issue yet in the modernjson project. It could indeed be an option to add the enable_if for concat in modernjson instead.

I think that would be best.

I'd be OK to add the prefixes here, too, but what is likely to happen over time is that new code is added here without the prefix, and then there are probably hundreds of custom type_casters all over the world which potentially run into the same issue (although I don't know how widespread the use of modernjason is). Solving the problem at the root would avoid that.

henryiii commented 7 months ago

Normally functions in different namespaces (pybind11::detail and (nlohmann::detail) don't collide. I believe cast is special, because users implement casters by entering our private namespace. Why is it getting mixed with nlohman's detail namespace, though? That shouldn't be visible. https://github.com/pybind/pybind11_json hasn't had any issues AFAIK.

Also, I'd fully qualify them, by the way, with ::pybind1::detail.

henryiii commented 7 months ago

Do you have a little more of the error message?

rwgk commented 7 months ago

Also, I'd fully qualify them, by the way, with ::pybind1::detail.

+1

(I just looked, we have many ::pybind11 already.)

Why is it getting mixed with nlohman's detail namespace, though?

Good question. Initially I thought it's probably through argument dependent lookup, but at second look, that doesn't really make sense.

rwgk commented 3 months ago

@volkm This is an OK change but also very adhoc. Is this actually still needed? Did you get to understand these two questions?

From me (Nov 29):

I'd be OK to add the prefixes here, too, but what is likely to happen over time is that new code is added here without the prefix, and then there are probably hundreds of custom type_casters all over the world which potentially run into the same issue (although I don't know how widespread the use of modernjason is). Solving the problem at the root would avoid that.

From @henryiii (also Nov 29):

Why is it getting mixed with nlohman's detail namespace, though?

henryiii commented 3 months ago

I don't think this hurts, it's just general cleanup that happens to also (maybe) help someone? We should be qualifying our usage.

rwgk commented 3 months ago

I don't think this hurts, it's just general cleanup that happens to also (maybe) help someone? We should be qualifying our usage.

By that token we'd have to qualify everything.

I think it's better to not merge this PR unless

henryiii commented 3 months ago

We already qualify some other things. If it makes sense to qualify it, why not go ahead and qualify it? We don't call some random concat (as in the description), but only ours?

henryiii commented 3 months ago

(I'd also be okay with more qualified names too, unqualified names really should only be used when it's fine for a user to inject their own code, and we don't support that with concat)

rwgk commented 3 months ago

We already qualify some other things. If it makes sense to qualify it, why not go ahead and qualify it? We don't call some random concat (as in the description), but only ours?

My thinking: Fully qualifying only makes sense in macros. Or if there is a well-understood name lookup issue. Other than that I look at pybind11::detail:: as unwanted pollution (reduces readability), negating some of the benefits that namespace was invented for.

Also: Either do this systematically, or not at all. Making a point change for one random snowflake seems odd, especially since we don't even know if it's actually still needed, or why.

henryiii commented 3 months ago

Okay, I'll give up until more can be understood. @volkm is patching pybind11 at https://github.com/moves-rwth/stormpy/tree/6c983bc9395646ccbe04cbf7238d6f5aeb922946/resources, so I'd say the answer to "is it needed" is yes.

Personally, I'd be fine to add ::pybind11::detail:: on an as needed / best attempt basis to help avoid users applying patches to pybind11; adding it everywhere would provide a lot of pollution and be difficult to get right, but it's fine in a few places, and I expect it's just needed where type casing is involved. I suspect it's our fault, since we allow users to explicitly add to our pybind11::detail namespace though custom type casters. I bet this concat is leaking there, it's a common name.

henryiii commented 3 months ago

@volkm I really don't see how this could be leaking, and I'd really like to know how it does leak. Do you have the full error message, not just the snipped part with pybind11 in it? Is there a simple way to build stormpy / storm?

henryiii commented 3 months ago

This would also fix https://github.com/pybind/pybind11/issues/5072, FYI.

rwgk commented 3 months ago

This would also fix #5072, FYI.

Oh wow, cool! I cannot do this right now (maybe tomorrow if nobody gets to it before): add the #5072 test here. Then it'll LGTM.

henryiii commented 3 months ago
FAILED: tests/CMakeFiles/pybind11_tests.dir/test_custom_type_casters.cpp.o
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ -DPYBIND11_TEST_BOOST -DPYBIND11_TEST_EIGEN -Dpybind11_tests_EXPORTS -I/Users/henryschreiner/git/software/pybind11/include -isystem /usr/local/opt/python@3.11/Frameworks/Python.framework/Versions/3.11/include/python3.11 -isystem /Users/henryschreiner/git/software/pybind11/.nox/tests-3-11/tmp/_deps/eigen-src -isystem /usr/local/include -Os -DNDEBUG -std=gnu++11 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.4.sdk -mmacosx-version-min=14.2 -fPIC -fvisibility=hidden -Wall -Wextra -Wconversion -Wcast-qual -Wdeprecated -Wundef -Wnon-virtual-dtor -Werror -flto -MD -MT tests/CMakeFiles/pybind11_tests.dir/test_custom_type_casters.cpp.o -MF tests/CMakeFiles/pybind11_tests.dir/test_custom_type_casters.cpp.o.d -o tests/CMakeFiles/pybind11_tests.dir/test_custom_type_casters.cpp.o -c /Users/henryschreiner/git/software/pybind11/tests/test_custom_type_casters.cpp
In file included from /Users/henryschreiner/git/software/pybind11/tests/test_custom_type_casters.cpp:10:
In file included from /Users/henryschreiner/git/software/pybind11/tests/constructor_stats.h:68:
In file included from /Users/henryschreiner/git/software/pybind11/tests/pybind11_tests.h:3:
In file included from /Users/henryschreiner/git/software/pybind11/include/pybind11/eval.h:14:
In file included from /Users/henryschreiner/git/software/pybind11/include/pybind11/pybind11.h:13:
In file included from /Users/henryschreiner/git/software/pybind11/include/pybind11/detail/class.h:12:
In file included from /Users/henryschreiner/git/software/pybind11/include/pybind11/detail/../attr.h:14:
/Users/henryschreiner/git/software/pybind11/include/pybind11/cast.h:1477:27: error: constexpr variable cannot have non-literal type 'std::string const'
    static constexpr auto arg_names = concat(type_descr(make_caster<Args>::name)...);
                          ^
/Users/henryschreiner/git/software/pybind11/include/pybind11/pybind11.h:263:34: note: in instantiation of template class 'pybind11::detail::argument_loader<const ADL_issue::test &>' requested here
                sizeof...(Args), cast_in::args_pos >= 0, cast_in::has_kwargs),
                                 ^
/Users/henryschreiner/git/software/pybind11/include/pybind11/pybind11.h:145:9: note: in instantiation of function template specialization 'pybind11::cpp_function::initialize<(lambda at /Users/henryschreiner/git/software/pybind11/tests/test_custom_type_casters.cpp:225:25), void, const ADL_issue::test &, pybind11::name, pybind11::scope, pybind11::sibling>' requested here
        initialize(
        ^
/Users/henryschreiner/git/software/pybind11/include/pybind11/pybind11.h:1213:22: note: in instantiation of function template specialization 'pybind11::cpp_function::cpp_function<(lambda at /Users/henryschreiner/git/software/pybind11/tests/test_custom_type_casters.cpp:225:25), pybind11::name, pybind11::scope, pybind11::sibling, void>' requested here
        cpp_function func(std::forward<Func>(f),
                     ^
/Users/henryschreiner/git/software/pybind11/tests/test_custom_type_casters.cpp:225:7: note: in instantiation of function template specialization 'pybind11::module_::def<(lambda at /Users/henryschreiner/git/software/pybind11/tests/test_custom_type_casters.cpp:225:25)>' requested here
    m.def("_adl_issue", [](const ADL_issue::test&){});
      ^
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.4.sdk/usr/include/c++/v1/string:704:7: note: 'basic_string<char>' is not literal because it is not an aggregate and has no constexpr constructors other than copy or move constructors
class basic_string
      ^
In file included from /Users/henryschreiner/git/software/pybind11/tests/test_custom_type_casters.cpp:10:
In file included from /Users/henryschreiner/git/software/pybind11/tests/constructor_stats.h:68:
In file included from /Users/henryschreiner/git/software/pybind11/tests/pybind11_tests.h:3:
In file included from /Users/henryschreiner/git/software/pybind11/include/pybind11/eval.h:14:
/Users/henryschreiner/git/software/pybind11/include/pybind11/pybind11.h:263:41: error: incomplete definition of type 'pybind11::detail::argument_loader<const ADL_issue::test &>'
                sizeof...(Args), cast_in::args_pos >= 0, cast_in::has_kwargs),
                                 ~~~~~~~^~
/Users/henryschreiner/git/software/pybind11/include/pybind11/pybind11.h:145:9: note: in instantiation of function template specialization 'pybind11::cpp_function::initialize<(lambda at /Users/henryschreiner/git/software/pybind11/tests/test_custom_type_casters.cpp:225:25), void, const ADL_issue::test &, pybind11::name, pybind11::scope, pybind11::sibling>' requested here
        initialize(
        ^
/Users/henryschreiner/git/software/pybind11/include/pybind11/pybind11.h:1213:22: note: in instantiation of function template specialization 'pybind11::cpp_function::cpp_function<(lambda at /Users/henryschreiner/git/software/pybind11/tests/test_custom_type_casters.cpp:225:25), pybind11::name, pybind11::scope, pybind11::sibling, void>' requested here
        cpp_function func(std::forward<Func>(f),
                     ^
/Users/henryschreiner/git/software/pybind11/tests/test_custom_type_casters.cpp:225:7: note: in instantiation of function template specialization 'pybind11::module_::def<(lambda at /Users/henryschreiner/git/software/pybind11/tests/test_custom_type_casters.cpp:225:25)>' requested here
    m.def("_adl_issue", [](const ADL_issue::test&){});
      ^
2 errors generated.

And passes with the patch. :)

rwgk commented 3 months ago

I wonder if there is a more robust and general solution, so that we don't have to remember to add prefixes when writing new code. I found this:

https://indii.org/blog/avoiding-argument-dependent-lookup-in-cpp/

I guess the "culprit type" in our case is pybind11::detail::descr<N, Ts...>. But we want our concat to be found (a::f in that article), not the other (b::f in the article). Can we bend that trick to do the right thing for us?

henryiii commented 3 months ago

When you know the namespace and don't want ADL, the general solution is to fully qualify it. Tricks are out there for the other direction, but if you know the namespace, just qualify it. It's simpler than tricks and easy on the compiler.

rwgk commented 3 months ago

What I was hoping for is a setup that doesn't leave us playing a whack-a-mole game (new code missing the prefix).

But I'm fine if we just get to a consistent state at this moment, by systematically adding the prefix for all concat(make_caster<>::name). It looks pretty easy, based on what I posted under https://github.com/pybind/pybind11/pull/4955#pullrequestreview-1963532758:

functional.h, stl.h, typing.h

henryiii commented 3 months ago

I'm okay to add those, though I wasn't able to get them to trigger easily. I suspect a few other of our functions are susceptible, but concat just happens to be a pretty common name. Ideally I think we should fully qualify wherever ADL might cause a lookup we know we don't want, but that's not something to work on now. I'm not sure if there is or even could be a static check for this, as ADL is useful or even required in many cases (like std::cout << "hi", etc). If there was some way to annotate a function so that static checkers could tell you never wanted to replace it via ADL, perhaps.

rwgk commented 3 months ago

I suspect a few other of our functions are susceptible

I think so, too, but there is only so much time in a day. :-/

I wasn't able to get them to trigger easily.

Yeah... I was afraid that could be the case.

On balance, pollution vs precaution, systematically prefixing concat(make_caster<>::name) seems like a good tradeoff to me: probably helpful, just a few places, doesn't take a lot of our time.

henryiii commented 3 months ago

Ah, the reason the tuple[] one can't trigger is that it's doing ADL on the argument, which is std::tuple. So if std::concat was ever added, that would break. In fact, looks like there's an std::experimental::concat: https://en.cppreference.com/w/cpp/experimental/simd/concat

My test code for reference:

m.def("_adl_issue_tuple", [](const std::tuple<const ADL_issue::test> &) {});