Closed djdeath closed 5 years ago
Ping @bjoeris since this is the init state optimisation code. Is there anything to align regions in the intervals to the nearest 4-byte boundary? It would need to be sensitive to the type of the region so that copies are done instead of fills where we can't fill a whole DWORD.
It does look like there are missing alignment checks here. I will investigate.
https://github.com/baldurk/renderdoc/pull/1511 changes the behaviour here to fall back on vkCmdCopyBuffer when the interval being cleared is not aligned. However, I do not have a capture to verify this fix.
I'll give it a go shortly.
Unfortunately I'm now hitting an assert :
Thread 15 "QThread" received signal SIGTRAP, Trace/breakpoint trap.
[Switching to Thread 0x7fffbbfff700 (LWP 9556)]
raise (sig=<optimised out>) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007ffff4e13dd7 in raise (sig=<optimised out>) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007ffff68b556f in VulkanResourceManager::Serialise_DeviceMemoryRefs<ReadSerialiser>(ReadSerialiser&, std::vector<MemRefInterval, std::allocator<MemRefInterval> >&)
(this=0x7fffb4007eb0, ser=..., data=std::vector of length 1376, capacity 1376 = {...}) at /home/djdeath/mesa/src/renderdoc/renderdoc/driver/vulkan/vk_manager.cpp:436
#2 0x00007ffff6708279 in WrappedVulkan::ProcessChunk(ReadSerialiser&, VulkanChunk) (this=0x7fffb40088f0, ser=..., chunk=VulkanChunk::DeviceMemoryRefs)
at /home/djdeath/mesa/src/renderdoc/renderdoc/driver/vulkan/vk_core.cpp:2959
#3 0x00007ffff67030a0 in WrappedVulkan::ReadLogInitialisation(RDCFile*, bool) (this=0x7fffb40088f0, rdc=0x7fffb4003a10, storeStructuredBuffers=false)
at /home/djdeath/mesa/src/renderdoc/renderdoc/driver/vulkan/vk_core.cpp:2117
#4 0x00007ffff68c9631 in VulkanReplay::ReadLogInitialisation(RDCFile*, bool) (this=0x7fffb40089e0, rdc=0x7fffb4003a10, storeStructuredBuffers=false)
at /home/djdeath/mesa/src/renderdoc/renderdoc/driver/vulkan/vk_replay.cpp:192
#5 0x00007ffff719c8d4 in ReplayController::PostCreateInit(IReplayDriver*, RDCFile*) (this=0x7fffb4004de0, device=0x7fffb40089e0, rdc=0x7fffb4003a10)
at /home/djdeath/mesa/src/renderdoc/renderdoc/replay/replay_controller.cpp:2072
#6 0x00007ffff719c14d in ReplayController::CreateDevice(RDCFile*, ReplayOptions const&) (this=0x7fffb4004de0, rdc=0x7fffb4003a10, opts=...)
at /home/djdeath/mesa/src/renderdoc/renderdoc/replay/replay_controller.cpp:2045
#7 0x00007ffff716bd7c in CaptureFile::OpenCapture(ReplayOptions const&, std::function<void (float)>) (this=0x7fffb4003600, opts=..., progress=...)
at /home/djdeath/mesa/src/renderdoc/renderdoc/replay/capture_file.cpp:370
#8 0x0000555555e1e741 in ReplayManager::run(int, QString const&, ReplayOptions const&, std::function<void (float)>)
(this=0x7fffffffcdd0, proxyRenderer=-1, capturefile=..., opts=..., progress=...) at ../../qrenderdoc/Code/ReplayManager.cpp:437
#9 0x0000555555e1b940 in ReplayManager::<lambda()>::operator()(void) const (__closure=0x7fffbc009cf0) at ../../qrenderdoc/Code/ReplayManager.cpp:55
#10 0x0000555555e1f604 in std::_Function_handler<void(), ReplayManager::OpenCapture(const QString&, const ReplayOptions&, RENDERDOC_ProgressCallback)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/8/bits/std_function.h:297
#11 0x0000555555e22304 in std::function<void ()>::operator()() const (this=0x7fffbc00a860) at /usr/include/c++/8/bits/std_function.h:687
#12 0x0000555555e21cca in LambdaThread::process() (this=0x7fffbc00a850) at ../../qrenderdoc/Code/QRDUtils.h:280
#13 0x0000555555e238a7 in QtPrivate::FunctorCall<QtPrivate::IndexesList<>, QtPrivate::List<>, void, void (LambdaThread::*)()>::call(void (LambdaThread::*)(), LambdaThread*, void**)
(f=(void (LambdaThread::*)(class LambdaThread * const)) 0x555555e21cae <LambdaThread::process()>, o=0x7fffbc00a850, arg=0x7fffbbffee40)
at /usr/include/x86_64-linux-gnu/qt5/QtCore/qobjectdefs_impl.h:152
#14 0x0000555555e2359f in QtPrivate::FunctionPointer<void (LambdaThread::*)()>::call<QtPrivate::List<>, void>(void (LambdaThread::*)(), LambdaThread*, void**)
(f=(void (LambdaThread::*)(class LambdaThread * const)) 0x555555e21cae <LambdaThread::process()>, o=0x7fffbc00a850, arg=0x7fffbbffee40)
at /usr/include/x86_64-linux-gnu/qt5/QtCore/qobjectdefs_impl.h:185
#15 0x0000555555e22fe8 in QtPrivate::QSlotObject<void (LambdaThread::*)(), QtPrivate::List<>, void>::impl(int, QtPrivate::QSlotObjectBase*, QObject*, void**, bool*)
(which=1, this_=0x7fffbc0093d0, r=0x7fffbc00a850, a=0x7fffbbffee40, ret=0x0) at /usr/include/x86_64-linux-gnu/qt5/QtCore/qobjectdefs_impl.h:414
#16 0x00007ffff5151563 in QMetaObject::activate(QObject*, int, int, void**) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#17 0x00007ffff4f70247 in QThread::started(QThread::QPrivateSignal) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#18 0x00007ffff4f725fb in () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#19 0x00007ffff4e09182 in start_thread (arg=<optimised out>) at pthread_create.c:486
#20 0x00007ffff49e6b1f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Do I need to take a new capture maybe?
You shouldn't need a new capture. It looks like the iterator walking is going wrong somewhere.
Are you able to share the capture with me privately? I can look at what it contains and see if I can see where it's falling over. It's hard for @bjoeris or myself to test rigorously without an actual repro case. My artificial test showed the problem but not all possible edge cases.
I think I can see something that would cause a problem - in the list of MemRefInterval
, the last referenced region of ResourceId 144 is misaligned at 1246894. The loop below that looks for any remaining references overlapping that dword needs to avoid rolling over into the next resource or hitting the end. So around here:
I think that should look more like:
for(; it_data->start < nextDWord && it_data != data.end() && it_data->memory == mem; ++it_data)
overlapRef = ComposeFrameRefsDisjoint(overlapRef, it_data->refType);
Can you try that patch?
Yep, that fixes it. Thanks!
Yep, I definitely missed the boundary checks there. I think the it_data != data.end()
clause should come first to make sure that that it_data->start
isn't an invalid dereference.
Good catch, yes! Should I make that change locally or do you want to PR it? I don't mind either way.
You can just make the change locally.
Description
Replaying a trace of gfxbench ends up calling vkCmdFillBuffer with non multiple of 4 offset & size. Spec requires both of those parameters to be multiple of 4.
Backtrace :
Repro steps
Just replaying a .rdc trace file. I can provide the file (99Mb).
Environment
Driver Mesa Intel Anv.