Open Axel-Reactor opened 2 months ago
I tried editing the shader and there doesn't seem to be a clear cause and effect. If I remove different pieces of code it passes. It seems like there is a certain threshold of imageStore/imageLoad calls that causes it to fail.
I compiled the validation layers in debug, and it's calling throwOverflowError in robin hood here:
// we don't retry, fail if overflowing
// don't need to check max num elements
if (0 == mMaxNumElementsAllowed && !try_increase_info()) {
throwOverflowError();
}
try_increase_info fails because mInfoInc is 2:
bool try_increase_info() {
ROBIN_HOOD_LOG("mInfoInc=" << mInfoInc << ", numElements=" << mNumElements
<< ", maxNumElementsAllowed="
<< calcMaxNumElementsAllowed(mMask + 1))
if (mInfoInc <= 2) {
// need to be > 2 so that shift works (otherwise undefined behavior!)
return false;
}
I have no idea, is this a bug in the hash map implementation?
This actually goes away if I replace the robin hood set with STL in this case, which is extremely upsetting:
thanks for looking into this (spirv-val throws no errors)
result_ids?
spirv-val --scalar-block-layout --target-env vulkan1.3 C:\Users\...\shader.spv returns no errors
if possible could you try
// Try to add to the output set
- if (!result_ids.insert(worklist_id).second) {
- continue; // If we already saw this id, we don't want to walk it again.
+ if (result_ids.contains(worklist_id)) {
+ continue;
+ } else {
+ result_ids.insert(worklist_id);
}
without knowing any internals of how robin hood works, my only thought is whether there is an issue when we keep trying to insert duplicate entries
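For reference, the two forms in the diff above should behave identically on a correct set, so the change can only matter if the container itself misbehaves on duplicate inserts. A minimal sketch of that equivalence, using std::unordered_set as a stand-in (the function names are made up for illustration and are not part of the validation layers; find() is used in place of contains() so it builds before C++20):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Variant A: rely on the bool returned by insert() to detect duplicates.
std::size_t count_first_visits_insert(const std::vector<uint32_t>& ids) {
    std::unordered_set<uint32_t> seen;
    std::size_t visited = 0;
    for (uint32_t id : ids) {
        if (!seen.insert(id).second) {
            continue;  // already seen, do not walk it again
        }
        ++visited;
    }
    return visited;
}

// Variant B: explicit lookup first, insert only when the id is absent.
std::size_t count_first_visits_lookup(const std::vector<uint32_t>& ids) {
    std::unordered_set<uint32_t> seen;
    std::size_t visited = 0;
    for (uint32_t id : ids) {
        if (seen.find(id) != seen.end()) {
            continue;
        }
        seen.insert(id);
        ++visited;
    }
    return visited;
}

// Hypothetical helper: both variants must report the same number of first visits.
void check_variants_agree(const std::vector<uint32_t>& ids) {
    assert(count_first_visits_insert(ids) == count_first_visits_lookup(ids));
}
```

Variant B does one extra hash lookup per new id, so it is only useful here as a diagnostic.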
Same issue, still crashes on the insert. State of the hash map:
It consistently crashes with 689 entries
I'm very confident I have thrown large shaders with over 700 entries for this. I assume you are on Windows 11?
The best thing I can do without the SPIR-V is make sure a large enough shader does not crash at 689 entries
Yes, Windows 11, but I don't see how that's relevant? Let me check whether this happens with stripped SPIR-V; I can probably give that to you.
Alright, here is the obfuscated SPIR-V, crashes in the same way for me: raygen_rs-0x694a1322c182da48.zip
so quick update, I was able to reproduce the crash... I found removing the 10,000 line OpSource fixed it, so now I think this is not an issue with the hashmap, but with how we might be storing the OpSource for such a large shader
edit - actually just going spirv-dis and then right away going spirv-as fixes it ...
if I go spirv-dis --raw-id and then spirv-as --preserve-numeric-ids it will still crash as normal
another update, I wrote a test capturing the IDs
#include <array>
TEST_F(VkPositiveLayerTest, RobinHood) {
vvl::unordered_set<uint32_t> result_ids;
std::array<uint32_t, 704> ids = { /* dumped out */ };
for (auto id : ids) {
if (!result_ids.insert(id).second) {
}
}
}
and it works fine, then I tried going
- vvl::unordered_set<uint32_t> worklist;
+ std::unordered_set<uint32_t> worklist;
and it worked... something is going on having 2 robin hood uint32_t hashes going together in the same scope, trying to figure out why this is the case
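To make the "two sets in the same scope" pattern concrete, here is a rough sketch of a worklist traversal that drains one set while filling another. It is only modeled on what the snippets in this thread suggest and is not the actual GetAccessibleIds code: RefGraph and collect_reachable are hypothetical names, and std::unordered_set stands in for vvl::unordered_set.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the module's reference graph: id -> ids it references.
using RefGraph = std::unordered_map<uint32_t, std::vector<uint32_t>>;

// Walk everything reachable from `entry`, using a set as the worklist and a
// second set for the result, both alive in the same scope.
std::unordered_set<uint32_t> collect_reachable(const RefGraph& graph, uint32_t entry) {
    std::unordered_set<uint32_t> result_ids;
    std::unordered_set<uint32_t> worklist;
    worklist.insert(entry);

    while (!worklist.empty()) {
        // Pop an arbitrary id from the worklist.
        auto it = worklist.begin();
        const uint32_t id = *it;
        worklist.erase(it);

        // Try to add to the output set; skip ids that were already walked.
        if (!result_ids.insert(id).second) {
            continue;
        }

        // Queue every id this one references (ids without edges are leaves).
        auto found = graph.find(id);
        if (found == graph.end()) {
            continue;
        }
        for (uint32_t next : found->second) {
            worklist.insert(next);
        }
    }
    return result_ids;
}
```

Swapping the type of only one of the two sets in a harness like this is one way to narrow down which container the overflow depends on.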
Environment:
Describe the Issue
I'm hitting a crash with a compute shader in spirv::EntryPoint::GetAccessibleIds.
I know this is probably not terribly helpful without the SPIR-V, but I'm not at liberty to provide that. Maybe someone can take a guess. Happy to provide more info if needed.
One thing I noticed is that the hash map is resizing (rehashPowerOfTwo). Maybe that's a hint. Although this is only a uint32 set, so I don't see how that could even get corrupted.