shinjiogaki / bvh

BVH STAR in Japanese
80 stars 7 forks source link

possible datarace on concurrent bbox expand #3

Open OlegBliznuk opened 2 years ago

OlegBliznuk commented 2 years ago

Hello, I have randomly underexpanded bboxes if i use bvh_binary.cpp: 73 std::for_each(std::execution::par, leaves.begin(), leaves.end(), [&](glm::uvec3 &leaf) these two screenshots from two different launches of the same build image image

however if i change the execution policy to sequential (std::execution::seq) everything is ok, the problem mainly manifests itself on low-poly inputs. Resulting tree is valid in both cases as i can see, the problem only in bbox calculation. I ran this on 112 threads workstation.

Also my code for test intersection if any, counts hits and outputs as brightness. Could the problem be on my side ?

inline int TEST_intersect_bvh_ray(int  ndIdx, const  TEST :: LBVH& acc, const triangle* tris, const vec3 ro, const vec3 rd)
    {
        const auto * nd = &acc.Nodes[ndIdx];
        int nbHits = 0;
        float tnear, tfar;
        if (intersect(aabb(&nd->Box.Min.x, &nd->Box.Max.x), ro, rd, tnear, tfar) == false)
            return 0;

        if ((nd->L & (1))) // is leaf ?
        {
            vec3 uvw;
            float  tmax = _FLT_MAX;
            if (intersect_triangle_ray(tris[(nd->L - 1) / 2], ro, rd, uvw, tmax))
                nbHits++;
        }
        else
        {
            nbHits += TEST_intersect_bvh_ray(nd->L / 2, acc, tris, ro, rd);
        }

        if ((nd->R & (1))) // is leaf ?
        {
            vec3 uvw;
            float  tmax = _FLT_MAX;
            if (intersect_triangle_ray(tris[(nd->R - 1) / 2], ro, rd, uvw, tmax))
                nbHits++;
        }
        else
        {
            nbHits += TEST_intersect_bvh_ray(nd->R / 2, acc, tris, ro, rd);
        }

        return nbHits;
    }
OlegBliznuk commented 2 years ago

I think Nodes[parent].Box.Expand(aabb); should be moved above the other_bounds exchange (in both IF branches) where we detect who exactly from two arrived children/threads is the first and will stop, so we get explicit sync and there is no case with data race like: thread L(second arrived) updates the parent bx thread L reads back the parent bx BUT at this moment thread R (first arrived and going to exit soon) still hasn't updated the parent box and so L has incompleted box data. At least this resolved the problem for me