Open Flamefire opened 4 years ago
Even worse: Score-P copys the C-String again 😉 . However, I am not a fan of mixing C-String with C++ Strings. And sticking to C only would mean to implement a hashtable, what is even more problematic in terms of performance.
Moreover, if you are worried about the performance, #5 should be addressed first. Approx 85% of the code are written in python. Each function enter or exit causes a lot of python action, see https://github.com/score-p/scorep_binding_python/blob/f9fbd7d932f945cac2b679fcd24cc8771b40f211/scorep/instrumenters/scorep_profile.py#L76 ff.
I think I'll reconsider this issue once #5 is fixed. However, I'll leave it open in the meantime. If you think it is necessary to fix this earlier, you are welcome to do a PR respecting the above-stated concerns.
Best,
Andreas
However, I am not a fan of mixing C-String with C++ Strings. And sticking to C only would mean to implement a hashtable, what is even more problematic in terms of performance.
I see nothing wrong with using what the language offers. C-Strings are part of C++ and there are C++ standard library functions for dealing with them i.e. the entire header std::
, so "blessed") So using a std::string
although you actually need a C-Cstring just for the sake of good looks would be the wrong reasoning IMO.
I think I'll reconsider this issue once #5 is fixed. However, I'll leave it open in the meantime.
Sure. I've just seen this and wanted to document it so it doesn't get forgotten.
I see nothing wrong with using what the language offers. ...
The actual solution seems to be string_view:
https://en.cppreference.com/w/cpp/string/basic_string_view
together with a try_emplace
to a std::unorderd_map<string>
.
However, this would require C++17.
Best,
Andreas
Not really:
A typical implementation holds only two members: a pointer to constant CharT and a size.
So you'll at least pay for iterating over the string once to determine the size. So only keeping the C-String until you need a std::string
is the least wasteful solution.
Edit: Additionally there is no c_str()
to get back the C-String you need
Edit: Additionally there is no c_str() to get back the C-String you need
but a data()
but a data()
... which is not guaranteed to contain a NULL terminated string.
Which is never guaranteed with an array. But as it is only a view to a NUL terminated string, thats not an issue.
BTW more fun is this:
s# (str, read-only bytes-like object) [const char *, int or Py_ssize_t]
Like s*, except that it doesn’t accept mutable objects. The result is stored into two C variables, the first one a pointer to a C string, the second one its length. The string may contain embedded null > bytes. Unicode objects are converted to C strings using 'utf-8' encoding.
then you'll get the length for free. Moreover, it's more flexible.
Best, Andreas
Which is never guaranteed with an array.
Er what? A const char*
is (usually) a C-String which is a NULL-terminated sequence of chars and what is needed to pass to Score-P
But as it is only a view to a NUL terminated string, thats not an issue.
So you want to store your C-String in a struct storing a pointer and size, then pass that to a function, in that function expect the pointer to be a C-String and ignore the size using that struct effectively as a convoluted way of transferring a C-String.
Before this gets more spammy I'll stop. Important info is in first post and https://github.com/score-p/scorep_binding_python/issues/67#issuecomment-627164640.
The code for s#
else if (PyUnicode_Check(arg)) {
Py_ssize_t len;
sarg = PyUnicode_AsUTF8AndSize(arg, &len);
if (sarg == NULL)
return converterr(CONV_UNICODE,
arg, msgbuf, bufsize);
*p = sarg;
STORE_SIZE(len);
}
And the code for s
else if (PyUnicode_Check(arg)) {
sarg = PyUnicode_AsUTF8AndSize(arg, &len);
if (sarg == NULL)
return converterr(CONV_UNICODE,
arg, msgbuf, bufsize);
if (strlen(sarg) != (size_t)len) {
PyErr_SetString(PyExc_ValueError, "embedded null character");
RETURN_ERR_OCCURRED;
}
*p = sarg;
}
So s
does an additional comparison of the returned String to be sure, that there is no NUL character in UTF8 coded Strings. We should be able to avoid this at least for function and module names, as python does not allow NUL as part of Identifiers (see https://www.python.org/dev/peps/pep-3131/#specification-of-language-changes and https://docs.python.org/3.7/reference/lexical_analysis.html#identifiers )
The same applies to filenames on UNIX, (see https://unix.stackexchange.com/questions/38055/utf-8-filenames ). So we could avoid this as well.
My proposal would be to use a simple, slim wrapper around a C-String:
class CString{
CString(const char* s, size_t len); // Could be guarded by assert for tests
CString(const char[] s); // With template to deduce size from raw strings (trivial)
bool operator==(const CString&);
bool operator!=(const CString&);
size_t find(char) const;
const char* c_str() const;
size_t size() const;
}
This is very similar to string_view
but focuses on/guarantees NULL-terminated strings. The implementation is incredibly simple so verifiable by just looking at it. Nothing really happening there. And the comparison operators save call sites from using the (IMO strange) C string comparison functions. Some interop functions with std::string
could be added too as needed (e.g fast conversion, direct comparison, ...) but those are also very simple.
This then allows accepting the Python values with known size and passing them to functions (like region_begin) that currently take std::string
but only pass them as c_str()
down to Score-P and maybe compare them or take substrings.
If you agree with this design I'll create a PR with an implementation of this proposal
Sounds reasonable. Please go ahead.
Best,
Andreas
For some numbers: Replacing the std::string
s by const char*
where it is beneficial results in these numbers:
0.3920-0.4455, mean: 0.4115, median: 0.4073
0.9087-0.9889, mean: 0.9460, median: 0.9460
1.4548-1.5868, mean: 1.5129, median: 1.5108
0.3861-0.4372, mean: 0.4071, median: 0.4066
0.9111-1.0089, mean: 0.9551, median: 0.9547
1.4719-1.5940, mean: 1.5263, median: 1.5262
Top is with C-Strings for 100k, 300k and 500k iterations of the simplefunc benchmark (min, max, mean and median over 51 invocations) and bottom is current master using std::string
. All using the cProfile instrumenter
So there is some effect although not much. The small case is even slower for some reason but I think this is an artifact from the small runtime.
I'd expect the effect to be more visible when memory pressure is higher especially as the benchmark uses only 1 function hence the memory allocator might return always the last used chunk which is faster than "real-world" use cases of random length functions
After merging #105 you might want to try again, and use s#
. As the creation of region_names is only done, when necessary, this might save a lot of runtime.
As the tracer should keep its own impact as low as possible, keeping an eye out for unnecessary memory allocations is vital.
Example: https://github.com/score-p/scorep_binding_python/blob/f9fbd7d932f945cac2b679fcd24cc8771b40f211/src/scorep.cpp#L23 called from https://github.com/score-p/scorep_binding_python/blob/f9fbd7d932f945cac2b679fcd24cc8771b40f211/src/scorep.cpp#L167
module
andfile_name
are initially NULL-terminated C-Stringsstd::string
s very likely allocation heap memory and copying the contentsfile_name
is then never used as astd::string
but converted back to a C-String: https://github.com/score-p/scorep_binding_python/blob/f9fbd7d932f945cac2b679fcd24cc8771b40f211/src/scorep.cpp#L33 So all work is wasted completelymodule
only a substring (prefix) is used (BTW: usingsubstr
instead of the ctor would likely read easier) Using resize instead would avoid an allocation and copy and using the find and substr on the original C-String would make it even cheaperI'm aware this is a Python tracer so relative performance impact is not as large as for e.g. C++ programs but especially allocations can have a noticeable impact. So as un-C++ it is, I'd recommend using raw C-Strings for those 2 and similar for other functions