Closed darksylinc closed 8 months ago
This is my fault. Just for reference, I never expected it to be used with zero. I eventually put the code in RenderingDevice::compute_list_dispatch_threads, which validates zero arguments as an error, but it seems I left early pieces of the code floating around that were never moved over to this function, and I forgot about them.
PR fixing this would be very welcome!
I have some time to spare right now, I can take care of it if that makes everyone happy
Fine by me.
@lawnjelly has suggested using a named helper, which I agree makes it much more obvious what the code is trying to do (rather than hoping the reader is familiar with the formula):
#define GODOT_ROUND_UP(N, S) ((((N) + (S) - 1) / (S)) * (S))
My only remark is that a function would perhaps be better, because a macro would accept floats with no warnings and would not produce the right result (since it's meant for signed/unsigned integers).
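To illustrate the concern, here is a minimal sketch. The function name round_up_to_multiple and its static_assert are hypothetical and are not Godot's actual code. The macro compiles for float arguments and quietly returns an unrounded value, while a template function can reject non-integer types at compile time.

```cpp
#include <cassert>
#include <type_traits>

// Macro as quoted above: rounds N up to the next multiple of S (integers only).
#define GODOT_ROUND_UP(N, S) ((((N) + (S) - 1) / (S)) * (S))

// Hypothetical function equivalent (name is an assumption for illustration).
// The static_assert rejects float/double arguments at compile time.
template <typename T>
constexpr T round_up_to_multiple(T p_value, T p_step) {
	static_assert(std::is_integral<T>::value, "round_up_to_multiple expects an integer type");
	return ((p_value + p_step - 1) / p_step) * p_step;
}

inline void macro_vs_function_examples() {
	// With integers both forms round 10 up to the next multiple of 8.
	assert(GODOT_ROUND_UP(10, 8) == 16);
	assert(round_up_to_multiple(10, 8) == 16);
	// With floats the macro compiles without complaint, but the division no
	// longer truncates: (10 + 8 - 1) / 8 * 8 is simply 17.
	assert(GODOT_ROUND_UP(10.0f, 8.0f) == 17.0f);
	// round_up_to_multiple(10.0f, 8.0f); // would fail to compile
}
```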
I assume this function would be added somewhere in math_funcs.h?
How should this function be named? divide_round_up(int numerator, int denominator)?
Yeah that does seem reasonable
In floating point math this operation is usually known as ceil after division, but I personally prefer round_up
Whenever the code wants to round up, it performs x = (x - 1) / y + 1 when it should be doing x = (x + y - 1) / y
Won't x = (x + y - 1) / y cause overflow when, for example, both x and y are somewhere between max_int / 2 and max_int?
Yes, but this is an implementation detail better discussed on the PR. Integer math is nearly always susceptible to overflow.
Well of course you can always find values for which it overflows, but we can at least mitigate the issue in the typical case where we have small unsigned values for example. Besides, having a function makes the code way more readable anyway so in my opinion there's no reason not to.
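For reference, one way to sidestep the intermediate overflow discussed above is to add a remainder check to the plain quotient. This is only a sketch under stated assumptions: the names divide_round_up_safe and divide_round_up_naive are hypothetical, the inputs are unsigned 32-bit with a non-zero denominator, and it is not necessarily what the PR does.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: ceil(p_num / p_den) for unsigned values, p_den > 0.
// There is no intermediate sum, so nothing can wrap around.
constexpr uint32_t divide_round_up_safe(uint32_t p_num, uint32_t p_den) {
	return p_num / p_den + (p_num % p_den != 0 ? 1u : 0u);
}

// The formula from the issue, for comparison.
// (p_num + p_den - 1) wraps modulo 2^32 when the inputs are large.
constexpr uint32_t divide_round_up_naive(uint32_t p_num, uint32_t p_den) {
	return (p_num + p_den - 1) / p_den;
}

inline void overflow_examples() {
	// Small values: both forms agree, including for zero.
	assert(divide_round_up_safe(0u, 8u) == 0u);
	assert(divide_round_up_naive(0u, 8u) == 0u);
	assert(divide_round_up_safe(9u, 8u) == 2u);
	assert(divide_round_up_naive(9u, 8u) == 2u);

	// Large values: the naive sum wraps around and the result is wrong.
	const uint32_t big = std::numeric_limits<uint32_t>::max() - 1;
	assert(divide_round_up_safe(big, big) == 1u);
	assert(divide_round_up_naive(big, big) == 0u);
}
```

The trade-off is an extra modulo, which compilers typically fold into the same division instruction on common targets.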
I just submitted PR #81277 for this.
Regarding:
Possibly count = ((to - from - 1) / incr) + 1; & count = ((from - to - 1) / -incr) + 1 in gdscript_utility_functions.cpp
I analyzed what's happening and decided to leave that as is.
The current code deals with a lot of nuances specific to range( from, to, incr ), and modifying it would not only risk introducing bugs; it would probably also make it harder to understand what's going on.
The PR and this bug report are about positive numbers that need to be divided and rounded up. range( from, to, incr ) deals with a lot more than that.
Update: I totally missed PR #80390, even though I checked yesterday whether this had already been handled :cry:
Godot version
4.2.x master [16a93563bfd3b02ca0a8f6df2026f3a3217f5571]
System information
Godot v4.2.dev (262d1eaa6) - Ubuntu 20.04.6 LTS (Focal Fossa) - X11 - Vulkan (Forward+) - dedicated AMD Radeon RX 6800 XT - AMD Ryzen 9 5900X 12-Core Processor (24 Threads)
Issue description
Based on my findings on #80356, I noticed this pattern repeats throughout the code in order to perform integer division that rounds up:
- compute_list_dispatch(p_list, (p_x_threads - 1) / cl->state.local_group_size[0] + 1, (p_y_threads - 1) / cl->state.local_group_size[1] + 1, (p_z_threads - 1) / cl->state.local_group_size[2] + 1); in rendering_device_vulkan.cpp
- (multimesh->instances - 1) / MULTIMESH_DIRTY_REGION_SIZE + 1 in mesh_storage.cpp (GLES3 and Vulkan)
- count = ((to - from - 1) / incr) + 1; & count = ((from - to - 1) / -incr) + 1 in gdscript_utility_functions.cpp
- Vector3i group_size((atlas_size.x - 1) / 8 + 1, (atlas_size.y - 1) / 8 + 1, 1) in lightmapper_rd.cpp (repeats twice)
- bitmask.resize((((p_size.width * p_size.height) - 1) / 8) + 1) in bit_map.cpp
- cluster_screen_size.width = (p_screen_size.width - 1) / cluster_size + 1; & cluster_screen_size.height = (p_screen_size.height - 1) / cluster_size + 1; in cluster_builder_rd.cpp
- push_constant.render_element_count_div_32 = render_element_count > 0 ? (render_element_count - 1) / 32 + 1 : 0; (the code is safe but the check for 0 is not required)
- int x_groups = (p_size.x - 1) / 8 + 1; & int y_groups = (p_size.y - 1) / 8 + 1; in copy_effects.cpp
- uint32_t cluster_screen_width = (p_settings.rb_size.x - 1) / cluster_size + 1; & uint32_t cluster_screen_height = (p_settings.rb_size.y - 1) / cluster_size + 1; in fog.cpp
- RD::get_singleton()->compute_list_dispatch(compute_list, (rect.size.x - 1) / 8 + 1, (rect.size.y - 1) / 8 + 1, 1); in gi.cpp
- compute_list_dispatch(compute_list, (rect.size.x - 1) / 8 + 1, (rect.size.y - 1) / 8 + 1, 1); in gi.cpp
- cluster_screen_width = (p_screen_size.width - 1) / p_render_data->cluster_size + 1; & cluster_screen_height = (p_screen_size.height - 1) / p_render_data->cluster_size + 1; in render_forward_clustered.cpp
Whenever the code wants to round up, it performs
x = (x - 1) / y + 1
when it should be doing
x = (x + y - 1) / y
When the input is 0, the original code is wrong for both unsigned (produces a very large value) and signed (produces 1 instead of 0) types.
The corrected version handles 0 properly: for x = 0 it yields (y - 1) / y = 0.
e.g.
Mathematically they're the same:
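In real-number arithmetic the identity follows from splitting the fraction. This is a short derivation for reference, not taken from the original report:

$$
\frac{x + y - 1}{y} = \frac{x - 1}{y} + \frac{y}{y} = \frac{x - 1}{y} + 1
$$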
However, in terms of computer science, they're not.
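The difference shows up with truncating integer division when x is 0. Below is a minimal sketch with 32-bit types. The names round_up_old and round_up_new are hypothetical and used only for illustration, and a 32-bit int is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Pattern currently repeated in the code base.
template <typename T>
constexpr T round_up_old(T x, T y) {
	return (x - 1) / y + 1;
}

// Pattern proposed in this issue.
template <typename T>
constexpr T round_up_new(T x, T y) {
	return (x + y - 1) / y;
}

inline void zero_input_examples() {
	// For positive inputs both patterns agree.
	assert(round_up_old<int32_t>(9, 8) == 2);
	assert(round_up_new<int32_t>(9, 8) == 2);
	assert(round_up_old<int32_t>(8, 8) == 1);
	assert(round_up_new<int32_t>(8, 8) == 1);

	// Signed zero: -1 / 8 truncates toward zero, so the old pattern yields 1.
	assert(round_up_old<int32_t>(0, 8) == 1);
	assert(round_up_new<int32_t>(0, 8) == 0);

	// Unsigned zero: 0u - 1 wraps to 4294967295, so the old pattern yields
	// 4294967295 / 8 + 1 = 536870912.
	assert(round_up_old<uint32_t>(0u, 8u) == 536870912u);
	assert(round_up_new<uint32_t>(0u, 8u) == 0u);
}
```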
If no one objects I'll submit a PR next weekend.
Steps to reproduce
None, this is a chronic problem spread throughout the code.
See #80286 for an example of a repro.
Minimal reproduction project
See #80286