Closed Try closed 1 year ago
Generally, I wouldn't be opposed to using sscanf
since it makes really easy to understand what's being parsed and a 40% performance improvement, especially in read_object_begin
would be very impressive. We would not be able to use sscanf_s
, though, since that's non-standard. Something like
char class_name[128];
char object_name[128];
sscanf(line.c_str(), "[%127s %127s %hd %d]", object_name, class_name, &obj.version, &obj.index);
could work, however. I'll test that out.
So I've done some benchmarking. I can't quite reproduce your 40% performance improvement. Maybe it's an issue with one of the standard libraries. I am still getting about 10-15% better performance so it's probably a good idea to implement it anyways but I'm kinda confused about this difference. Here are some of my benchmark results (sadly I don't have a proper benchmark suite yet so I can't really provide you with an easy way of reproducing this):
Compiler | File | Mode | Iterations | Min (ms) | Max (ms) | Avg (ms) | Cumulative (s) | Approx. Avg. Difference |
---|---|---|---|---|---|---|---|---|
Clang | ||||||||
Clang 14.0.6 | NEWWORLD.ZEN | Debug | 10 | 5944 | 6067 | 5982 | 59 | 1% (67ms) |
Clang 14.0.6 | NEWWORLD.ZEN | Release | 100 | 222 | 293 | 227 | 22 | 18% (51ms) |
Clang 14.0.6 | OU.BIN | Debug | 50 | 796 | 824 | 802 | 40 | 5% (46ms) |
Clang 14.0.6 | OU.BIN | Release | 200 | 47 | 55 | 48 | 9 | 29% (20ms) |
Clang 14.0.6 | OU.CSL | Debug | 10 | 3697 | 3733 | 3707 | 37 | 11% (498ms) |
Clang 14.0.6 | OU.CSL | Release | 200 | 167 | 183 | 175 | 35 | 15% (33ms) |
GCC | ||||||||
GCC 12.2.0 | NEWWORLD.ZEN | Debug | 10 | 4162 | 4260 | 4189 | 41 | 3% (141ms) |
GCC 12.2.0 | NEWWORLD.ZEN | Release | 100 | 209 | 236 | 210 | 21 | 18% (49ms) |
GCC 12.2.0 | OU.BIN | Debug | 50 | 657 | 753 | 670 | 33 | 6% (50ms) |
GCC 12.2.0 | OU.BIN | Release | 200 | 43 | 48 | 43 | 8 | 34% (23ms) |
GCC 12.2.0 | OU.CSL | Debug | 10 | 2877 | 2938 | 2888 | 28 | 7% (224ms) |
GCC 12.2.0 | OU.CSL | Release | 200 | 143 | 194 | 154 | 30 | 15% (28ms) |
Compiler | File | Mode | Iterations | Min (ms) | Max (ms) | Avg (ms) | Cumulative (s) |
---|---|---|---|---|---|---|---|
Clang | |||||||
Clang 14.0.6 | NEWWORLD.ZEN | Debug | 10 | 6028 | 6157 | 6049 | 60 |
Clang 14.0.6 | NEWWORLD.ZEN | Release | 100 | 248 | 278 | 259 | 25 |
Clang 14.0.6 | OU.BIN | Debug | 50 | 843 | 867 | 848 | 42 |
Clang 14.0.6 | OU.BIN | Release | 200 | 68 | 73 | 68 | 13 |
Clang 14.0.6 | OU.CSL | Debug | 10 | 4167 | 4384 | 4205 | 42 |
Clang 14.0.6 | OU.CSL | Release | 200 | 198 | 221 | 208 | 41 |
GCC | |||||||
GCC 12.2.0 | NEWWORLD.ZEN | Debug | 10 | 4309 | 4397 | 4330 | 43 |
GCC 12.2.0 | NEWWORLD.ZEN | Release | 100 | 232 | 258 | 233 | 23 |
GCC 12.2.0 | OU.BIN | Debug | 50 | 710 | 779 | 720 | 36 |
GCC 12.2.0 | OU.BIN | Release | 200 | 65 | 71 | 66 | 13 |
GCC 12.2.0 | OU.CSL | Debug | 10 | 3075 | 3176 | 3112 | 31 |
GCC 12.2.0 | OU.CSL | Release | 200 | 174 | 197 | 182 | 36 |
This are the changes I've made (I've applied the fix to both archive_reader_ascii
as well as archive_reader_binsafe
):
Index: source/archive/archive_ascii.cc
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/source/archive/archive_ascii.cc b/source/archive/archive_ascii.cc
--- a/source/archive/archive_ascii.cc (revision 21dbecd6dc6890510a8218d0aa0868b6cbdb0b14)
+++ b/source/archive/archive_ascii.cc (date 1668629105666)
@@ -27,15 +27,21 @@
return false;
input.mark();
+
auto line = input.get_line();
+ char class_name[128];
+ char object_name[128];
- if (!line.starts_with('[') || !line.ends_with(']') || line.length() <= 2) {
+ auto parsed_elements =
+ std::sscanf(line.c_str(), "[%127s %127s %hu %u]", object_name, class_name, &obj.version, &obj.index);
+
+ if (parsed_elements != 4) {
input.reset();
return false;
}
- std::stringstream ss {line.substr(1, line.size() - 2)};
- ss >> obj.object_name >> obj.class_name >> obj.version >> obj.index;
+ obj.object_name = object_name;
+ obj.class_name = class_name;
return true;
}
Index: source/archive/archive_binsafe.cc
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/source/archive/archive_binsafe.cc b/source/archive/archive_binsafe.cc
--- a/source/archive/archive_binsafe.cc (revision 21dbecd6dc6890510a8218d0aa0868b6cbdb0b14)
+++ b/source/archive/archive_binsafe.cc (date 1668629090506)
@@ -41,13 +41,19 @@
}
auto line = input.get_string(input.get_ushort());
- if (!line.starts_with('[') || !line.ends_with(']') || line.length() <= 2) {
+ char class_name[128];
+ char object_name[128];
+
+ auto parsed_elements =
+ std::sscanf(line.c_str(), "[%127s %127s %hu %u]", object_name, class_name, &obj.version, &obj.index);
+
+ if (parsed_elements != 4) {
input.reset();
return false;
}
- std::stringstream ss {line.substr(1, line.size() - 2)};
- ss >> obj.object_name >> obj.class_name >> obj.version >> obj.index;
+ obj.object_name = object_name;
+ obj.class_name = class_name;
return true;
}
I can't quite reproduce your 40% performance improvement.
Windows10, mingw, debug. Also running stuff on laptop definitely not helping Today I've attempted to measure again - performance is all over the place unfortunately(sometimes better, sometimes worse).
While on that I have a question about read_string
design:
If you have buffer with mmap
or real memory behind the hood, would it make more sense to return a string_view
from buffer::get_string
? That might be helpful for code that doesn't store string, but parses it instead.
PS: Reason why debug performance of load-time has value to me, is to iterate quicker on a game development.
If you have buffer with mmap or real memory behind the hood, would it make more sense to return a string_view from buffer::get_string?
The idea behind buffer
is, that you can supply your own backend (ie. a std::ifstream
). This design necessitates copying data instead of just referring to it through a pointer. It just happens to be the case that by default, only heap
and mmap
backings are available:
In addition to that, copying the data into a std::string
makes it much safer to use and you can't accidentally refer to unmapped memory after the buffer deallocates. Before thinking about doing something like this, I'd like to explore other areas to possibly improve on.
Reason why debug performance of load-time has value to me, is to iterate quicker on a game development.
I agree. Performance is important. I'd like to avoid sacrificing memory safety for performance wherever possible though.
Followup to: https://github.com/lmichaelis/phoenix/issues/28 I've tested
std::stringstream
performance today. Change:Benchmark (world = oldworld.zen)
While change is rough one, and hopefully there is a better way than
sscanf
, as prototype looking quite good: