mmp / pbrt-v4

Source code to pbrt, the ray tracer described in the forthcoming 4th edition of the "Physically Based Rendering: From Theory to Implementation" book.
https://pbrt.org
Apache License 2.0

Random crash on CPU rendering #396

Closed zhaoyangyang316 closed 5 months ago

zhaoyangyang316 commented 8 months ago

I'm getting some weird crashes during CPU rendering on Windows 11 w/ VS2023.

They happen randomly during rendering (not on every run), and one of two exceptions is thrown, also seemingly at random:

  1. Access violation reading location 0xFFFFFFFFFFFFFFFF.
  2. "Unhandled exception at 0x00007FF6221BFAA5 in pbrt.exe: Stack cookie instrumentation code detected a stack-based buffer overrun"

The scenes I tested are BMW scene and LTE-orb scene (with any path tracer: e.g., volpath).

I was able to trace the crashes in the debugger down to two lines (they seem to happen less often in debug runs, so there may be more elsewhere):

  1. In pbrt/util/sampling.h: line 1689: Float v00 = lookup<Dimension>(m_data.data(), index, size, param_weight),
  2. In pbrt/util/pstd.h: line 114: values[i++] = val;
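
For context, a write of the form `values[i++] = val;` is the kind of statement a stack cookie check can flag when it runs past the end of a fixed-size buffer. The following is a minimal sketch, assuming the write happens while a fixed-size array is being filled from an initializer list. `FixedArray` is a hypothetical type used only for illustration, not pbrt's actual code, and the size check is an added guard rather than something the report says pbrt does.

```cpp
#include <cstddef>
#include <initializer_list>
#include <stdexcept>

// Hypothetical fixed-size array, for illustration only (not pbrt's pstd code).
template <typename T, std::size_t N>
class FixedArray {
  public:
    FixedArray(std::initializer_list<T> v) {
        // Without this guard, a list longer than N would write past the end
        // of values[], which is what a stack cookie check is meant to catch.
        if (v.size() > N)
            throw std::length_error("initializer list longer than FixedArray");
        std::size_t i = 0;
        for (const T &val : v)
            values[i++] = val;  // same write pattern as the reported line
    }

    const T &operator[](std::size_t i) const {
        if (i >= N)
            throw std::out_of_range("FixedArray index out of range");
        return values[i];
    }

    static constexpr std::size_t size() { return N; }

  private:
    T values[N] = {};
};
```

If a guard like this never fires while the crash still occurs, that would point away from a logic error at this line and toward memory being corrupted from elsewhere.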

I also tested on macOS with Apple silicon; the bug only happens on Windows.

Has anyone encountered this bug?

Update:

mmp commented 8 months ago

Those are unusual and I haven't heard of them before. In general pbrt should always be executing deterministically (and so shouldn't have random crashes) like that. My guess would be that it is a hardware or memory issue. You might try running a memory test?
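
As a rough illustration of what a memory test does, here is a minimal user-space sketch: it writes a pattern and its complement across a buffer and counts any words that read back differently. The function name `CountPatternMismatches` and its parameters are hypothetical. A program like this only touches memory the OS happens to hand it, so it is not a substitute for a dedicated boot-time memory tester.

```cpp
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <vector>

// Writes a position-dependent pattern (then its complement) over numWords
// 64-bit words and returns how many words read back with a different value.
std::size_t CountPatternMismatches(std::size_t numWords, std::uint64_t pattern) {
    std::vector<std::uint64_t> buf(numWords);
    // volatile keeps the compiler from optimizing away the write/read pairs.
    volatile std::uint64_t *p = buf.data();
    std::size_t mismatches = 0;
    for (std::uint64_t pat : {pattern, ~pattern}) {
        for (std::size_t i = 0; i < numWords; ++i)
            p[i] = pat ^ static_cast<std::uint64_t>(i);
        for (std::size_t i = 0; i < numWords; ++i)
            if (p[i] != (pat ^ static_cast<std::uint64_t>(i)))
                ++mismatches;
    }
    return mismatches;
}
```

On healthy hardware the count should be zero. A nonzero count from a buffer much larger than the CPU caches would be a reason to run a proper hardware test.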

zhaoyangyang316 commented 8 months ago

> Those are unusual and I haven't heard of them before. In general pbrt should always be executing deterministically (and so shouldn't have random crashes) like that. My guess would be that it is a hardware or memory issue. You might try running a memory test?

Yes, that's the confusing part for me as well. I tried another Windows machine, and everything runs smoothly there. My hardware is fairly new; the crashes may also be related to Windows Defender.

mmp commented 5 months ago

Closing this since it looks like a hardware/memory failure and others haven't reported anything similar.