tobbebex / GPipe-Core

Core library of new GPipe, encapsulating OpenGl and providing a type safe minimal library
158 stars 30 forks source link

Segfault/memory corruption after reading textures #78

Open xaphiriron opened 3 years ago

xaphiriron commented 3 years ago

I have the following shader code:

type CameraUniform os = Buffer os (Uniform (V4 (B4 Float)))

pickingShader :: (ContextHandler ctx, MonadIO m, MonadException m)
    => V2 Int -> CameraUniform os -> CameraUniform os
    -> ContextT ctx os m (CompiledShader os
        ( PrimitiveArray Triangles (B3 Float, B Word32)
        , (Image (Format Depth), Image (Format RWord))
        ))
pickingShader windowSize cameraBuffer modelview = compileShader $ do
    camera <- getUniform $ \_ -> (cameraBuffer, 0) -- perspective+camera matrix
    view <- getUniform $ \_ -> (modelview, 0) -- model matrix

    primitiveStream <- toPrimitiveStream $ \(v, _) -> v
    fragmentStream <-
        fmap (withRasterizedInfo $ \frag r -> (frag, z $ rasterizedFragCoord r))
        . rasterize (\_ -> (Front, viewport windowSize, DepthRange 0 1))
        . fmap (\(pt, pickId) -> ((camera !*! view) !* w pt 1 , pickId) )
            $ primitiveStream
    drawDepth (\(_, (depthImg, _)) -> (NoBlending, depthImg, DepthOption Less True))
        fragmentStream
        (drawColor $ \(_, (_, colorImg)) -> (colorImg, True, False))

-- write the depth + color map to the two textures, and additionally check to see if there's a picking hit
runPickingImage :: (ContextHandler ctx, MonadIO m, MonadException m, MonadAsyncException m)
    => Texture2D os (Format Depth) -> Texture2D os (Format RWord)
    -> CompiledShader os
        ( PrimitiveArray p (B3 Float, B Word32)
        , (Image (Format Depth), Image (Format RWord))
        )
    -> PrimitiveTopology p -> Buffer os (B3 Float, B Word32) -> V2 Int
    -> ContextT ctx os m (Maybe Word32)
runPickingImage depthTexture colorTexture shader topology vertices cursorPos = do
    render $ do
        depthImage <- getTexture2DImage depthTexture 0
        colorImage <- getTexture2DImage colorTexture 0
        verts <- newVertexArray vertices
        clearImageColor colorImage maxBound
        clearImageDepth depthImage 1
        shader (toPrimitiveArray topology verts, (depthImage, colorImage))
    checkPickingImage colorTexture cursorPos

-- check to see if there's a picking hit without recalculating the picking cache
checkPickingImage :: (ContextHandler ctx, MonadIO m, MonadException m, MonadAsyncException m)
    => Texture2D os (Format RWord) -> V2 Int -> ContextT ctx os m (Maybe Word32)
checkPickingImage colorTexture cursorPos = do
    hit <- readTexture2D colorTexture 0 cursorPos (V2 1 1) (\_ h -> pure h) maxBound
    return $ if hit == maxBound
        then Nothing
        else Just hit

And then the following code is run in the main loop. This is kind of messy since i was in the process of implementing and testing the picking code -- depthTexture and idTexture are both made using newTexture2D earlier, and are the size of the window.

    updatedPick <- case res of
        Nothing -> return Nothing
        Just (_, (_, buffer)) -> if dirty $ info ^. _assets . _camVar
            then do
                liftIO $ putStrLn $ "doing picking recalc (at " <> show (info ^. _assets . _camVar . _lastCursorPos) <> ")"
                -- this forces the picking render to happen with an identity modelview matrix, which is probably fine but will sometimes require special generation of picking coordinates
                shader <- info ^. _assets . _storedPickingShader #$# identity
                pick <- runPickingImage
                    (info ^. _assets . _pickingData . _depthTexture)
                    (info ^. _assets . _pickingData . _idTexture)
                    shader
                    TriangleList
                    buffer
                    (info ^. _assets . _camVar . _lastCursorPos)
                return $ Just pick
            -- buffers haven't changed and camera hasn't changed position, but the cursor pos might have changed, so look up the cached picking texture
            else do
                liftIO $ putStrLn $ "doing picking lookup (at " <> show (info ^. _assets . _camVar . _lastCursorPos) <> ")"
                -- don't try to pick OOB if the cursor is out of the window (which idk if can even happen)
                if inRect (V2 0 0) (floor <$> windowSize) (info ^. _assets . _camVar . _lastCursorPos)
                    then Just <$> checkPickingImage
                        (info ^. _assets . _pickingData . _idTexture)
                        (info ^. _assets . _camVar . _lastCursorPos)
                    else pure Nothing

The code flow here is that when the vertex buffer for the picking function (res) is updated, the first branch of the if is run, and then after that the second branch of the if is run (until the camera moves). Usually it crashes on the initial runPickingImage call, although once I think it made it to a second go-around to just call checkPickingImage. I've gotten segfaults, malloc(): unaligned tcache chunk detected, and corrupted size vs. prev_size so far.

If I comment things out so that runPickingImage doesn't call checkPickingImage, and checkPickingImage isn't called in the main loop, the program doesn't crash. checkPickingImage itself seems to run fine and return a coherent value, but doing some putStrLn debugging makes it seem like the program crashes on the next render action after the texture read. That might just be the next memory allocation; I don't really know.

This seems similar to #48 in that their issue seemed to have something to do with texture reads also; I've tried looking at RenderDoc but I have no clue how to record a frame trace if the program also crashes on that frame.

xaphiriron commented 3 years ago

Looking deeper, the readTexture2D code (and presumably all of the readTexture* functions) is incorrect and will crash or cause memory corruption when the texture isn't read from 0,0 to the texture bounds.

The code as it stands in Graphics.GPipe.Internal.Texture:

readTexture2D t@(Texture2D texn _ ml) l (V2 x y) (V2 w h) f s
    | l < 0 || l >= ml = error "readTexture2D, level out of bounds"
    | x < 0 || x >= mx = error "readTexture2D, x out of bounds"
    | w < 0 || x+w > mx = error "readTexture2D, w out of bounds"
    | y < 0 || y >= my = error "readTexture2D, y out of bounds"
    | h < 0 || y+h > my = error "readTexture2D, h out of bounds"
    | otherwise =
                 let b = makeBuffer undefined undefined 0 :: Buffer os b
                     f' ptr a off = f a =<< liftIO (peekPixel (undefined :: b) (ptr `plusPtr` off))
                 in bracket
                   (liftNonWinContextIO $ do
                     ptr <- mallocBytes $ w*h*bufElementSize b
                     setGlPixelStoreRange x y 0 w h
                     useTexSync texn GL_TEXTURE_2D
                     glGetTexImage GL_TEXTURE_2D (fromIntegral l) (getGlColorFormat (undefined :: c) (undefined :: b)) (getGlType (undefined :: b)) ptr
                     return ptr)
                   (liftIO . free)
                   (\ptr -> foldM (f' ptr) s [0,bufElementSize b..w*h*bufElementSize b -1])
    where V2 mx my = texture2DSizes t !! l

I changed it to the following:

readTexture2D t@(Texture2D texn _ ml) l (V2 x y) (V2 w h) f s
    | l < 0 || l >= ml = error "readTexture2D, level out of bounds"
    | x < 0 || x >= mx = error "readTexture2D, x out of bounds"
    | w < 0 || x+w > mx = error "readTexture2D, w out of bounds"
    | y < 0 || y >= my = error "readTexture2D, y out of bounds"
    | h < 0 || y+h > my = error "readTexture2D, h out of bounds"
    | otherwise =
                 let b = makeBuffer undefined undefined 0 :: Buffer os b
                     f' ptr a off = f a =<< liftIO (peekPixel (undefined :: b) (ptr `plusPtr` off))
                 in bracket
                   (liftNonWinContextIO $ do
                     ptr <- mallocBytes $ mx*my*bufElementSize b
                     setGlPixelStoreRange 0 0 0 mx my
                     useTexSync texn GL_TEXTURE_2D
                     glGetTexImage GL_TEXTURE_2D (fromIntegral l) (getGlColorFormat (undefined :: c) (undefined :: b)) (getGlType (undefined :: b)) ptr
                     return ptr)
                   (liftIO . free)
                   (\ptr -> foldM (f' ptr) s
                     $ fmap ((bufElementSize b *) . ((mx * y + x) +))
                       $ (+) <$> [0,w..mx*(h-1)] <*> [0..w-1])
    where V2 mx my = texture2DSizes t !! l

I'm not super pleased with how that offset list is constructed, but it seems to correctly index the requested texels. This accesses the texels from the bottom-left of the sampled quad.

There might be some way to change this to not allocate the entire texture's worth of memory to do whatever sampling is required, but if there is it's beyond me in the moment.

mrehayden1 commented 2 weeks ago

I think there was a misunderstanding about how glPixelStorei GL_PACK_ROW_LENGTH is meant to be used (and possibly other parameters) when texture reading functions were written, as per the docs.

So far, I haven't been able to find an alternative way to read slices of textures into Ptrs without allocating memory for the entire texture to be able to make this work efficiently.

There are ways to read sub-textures/individual pixels from frame buffers, but they have been abstracted away.

Instead I patched gpipe with this which still allocates the whole texture in memory but reads a single pixel by computing the correct offset from the Ptr.

readPixelTexture2D t@(Texture2D texn _ ml) l (V2 x y)
    | l < 0 || l >= ml = error "readPixelTexture2D, level out of bounds"
    | x < 0 || x >= mx = error "readPixelTexture2D, x out of bounds"
    | y < 0 || y >= my = error "readPixelTexture2D, y out of bounds"
    | otherwise =
        let b = makeBuffer undefined undefined 0 :: Buffer os b
            elementSize = bufElementSize b
            off = (x + y * mx) * elementSize
        in bracket
          (liftNonWinContextIO $ do
            ptr <- mallocBytes $ mx*my*elementSize
            setGlPixelStoreRange 0 0 0 mx my
            useTexSync texn GL_TEXTURE_2D
            glGetTexImage GL_TEXTURE_2D (fromIntegral l) (getGlColorFormat (undefined :: c) (undefined :: b)) (getGlType (undefined :: b)) ptr
            return ptr)
          (liftIO . free)
          (\ptr -> liftIO . peekPixel (undefined :: b) $ ptr `plusPtr` off)
    where V2 mx my = texture2DSizes t !! l

Edit 29/10/2024: This method is very slow. We could do with being able to read directly from FBOs and also added support for asynchronous pixel transfer with PBO.