[BUG]: Unable to show stdout output of a binary module inside Jupyter notebook

Required prerequisites

[X] Make sure you've read the documentation. Your issue may be addressed there.
[X] Search the issue tracker and Discussions to verify that this hasn't already been reported. +1 or comment there if it has.
[X] Consider asking first in the Gitter chat room or in a Discussion.

What version (or hash if on master) of pybind11 are you using?

2.10.0

Problem description

We wrote a C/CUDA-based optics simulator, mcx (https://github.com/fangq/mcx), and compiled it to a binary Python module, pmcx (https://pypi.org/project/pmcx/), using pybind11 as the interface.

Recently, while preparing some tutorials as Jupyter notebooks on Google Colab, I found that the output of the printf() statements inside my C modules is not shown in the notebook. Only the std::cout statements inside the Python wrappers are printed. Both the stdout and std::cout messages appear when the same code runs in a command-line window.

I am wondering if there is a way to let pybind11 redirect stdout from the C code to the Jupyter notebook?

I saw that #1005 and #1009 discussed something along those lines. They solved the std::cout redirect, but I feel that support for Jupyter notebooks is perhaps still incomplete?

If this is not a bug in pybind11, is there anything I need to add to my C units to allow this output to be printed? Our interfaces already use py::call_guard<py::scoped_ostream_redirect, py::scoped_estream_redirect>(), so I am not sure what else is missing.
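For reference, here is a possible Python-side workaround, sketched under assumptions rather than taken from pybind11 or pmcx. It temporarily points file descriptor 1 at a temporary file, runs the call, and then replays whatever was written through sys.stdout, which is the stream the notebook displays. The names capture_c_stdout and _flush_c_stdio are hypothetical helpers, and the fflush call through ctypes assumes a POSIX-like platform where the extension module shares the process libc.

```python
import ctypes
import os
import sys
import tempfile
from contextlib import contextmanager


def _flush_c_stdio():
    """Best-effort flush of libc's stdio buffers (assumes a POSIX-like platform)."""
    try:
        ctypes.CDLL(None).fflush(None)  # fflush(NULL) flushes all open output streams
    except (OSError, AttributeError, TypeError):
        pass  # e.g. on Windows; buffered printf() output may then be missed


@contextmanager
def capture_c_stdout(echo=True):
    """Hypothetical helper: capture everything written to fd 1 inside the block."""
    captured = []
    sys.stdout.flush()
    _flush_c_stdio()
    saved_fd = os.dup(1)  # remember where fd 1 pointed
    with tempfile.TemporaryFile() as tmp:
        os.dup2(tmp.fileno(), 1)  # fd 1 now writes into the temp file
        try:
            yield captured
        finally:
            sys.stdout.flush()
            _flush_c_stdio()
            os.dup2(saved_fd, 1)  # restore the original fd 1
            os.close(saved_fd)
            tmp.seek(0)
            captured.append(tmp.read().decode(errors="replace"))
            if echo and captured[0]:
                # Replay through Python's sys.stdout so the notebook shows it
                print(captured[0], end="")


# Stand-in for printf() in C code: write straight to fd 1, bypassing sys.stdout
with capture_c_stdout() as out:
    os.write(1, b"written straight to fd 1\n")

# Intended use with the module from this report (not executed here):
#     with capture_c_stdout():
#         pmcx.gpuinfo()
```

This only moves the output after the call returns, so long-running calls would show their log at the end rather than as it is produced.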

Reproducible example code

We have a CUDA version that can only run on NVIDIA GPUs. For easy testing, here is an OpenCL version (very similar to the CUDA version) called pmcxcl (https://pypi.org/project/pmcxcl/). As long as you have a reasonably new graphics driver, you should have OpenCL support; on Linux, you can always install pocl to get CPU-based OpenCL.

To install the module, you can run

python3 -m pip install numpy pmcxcl pmcx

You can run the code below inside a terminal (all messages are printed) or inside a Jupyter notebook (only the std::cout messages are shown)

import pmcx
pmcx.gpuinfo()

In a terminal window, I can see the full log

>>> import pmcx
>>> pmcx.gpuinfo()
+ =============================   GPU Information  ================================
+ Device 1 of 1:        NVIDIA GeForce RTX 4090
+ Compute Capability:   8.9
+ Global Memory:        25385107456 B
+ Constant Memory:  65536 B
+ Shared Memory:        49152 B
+ Registers:        65536
+ Clock Speed:      2.54 GHz
+ Number of SMs:        128
+ Number of Cores:  8192
+ Auto-thread:      524288
+ Auto-block:       64
[{'name': 'NVIDIA GeForce RTX 4090', 'id': 1, 'devcount': 1, 'major': 8, 'minor': 9, 'globalmem': 25385107456, 'constmem': 65536, 'sharedmem': 49152, 'regcount': 65536, 'clock': 2535000, 'sm': 128, 'core': 8192, 'autoblock': 64, 'autothread': 524288, 'maxgate': 0}]

However, inside a Jupyter notebook (you can create one on Google Colab and set the backend to run on a T4 GPU), the top part of the output (shown in green), which was printed by fprintf(stdout ...) in the C code, does not appear.

Is this a regression? Put the last known working version here if it is.

Not a regression
