axmolengine / axmol

Axmol Engine – A Multi-platform Engine for Desktop, XBOX (UWP) and Mobile games. (A fork of Cocos2d-x-4.0)
https://axmol.dev
MIT License
868 stars 195 forks source link

Crashes due to std::string_view use-after-free issues #1002

Closed rh101 closed 1 year ago

rh101 commented 1 year ago

There are several bugs that seem to appear on one platform but not another, and the only thing I can pin-point it to is certain usage of std::string_view.

For example, this function will throw an error when the memory sanitizer is enabled, due to the fact that frameName is invalidated after the call to _spriteFrames.erase(frameName);. The frame name after that point is no longer valid, so in normal operation, it would never find that frame in the _spriteFrameToSpriteSheetMap.

bool SpriteFrameCache::eraseFrame(std::string_view frameName)
{
    _spriteFrames.erase(frameName);  // drop SpriteFrame
    const auto itFrame = _spriteFrameToSpriteSheetMap.find(frameName);
    if (itFrame != _spriteFrameToSpriteSheetMap.end())
    {
        auto& spriteSheet = itFrame->second;
        spriteSheet->full = false;
        spriteSheet->frames.erase(frameName);

        if (spriteSheet->frames.empty())
        {
            _spriteSheets.erase(spriteSheet->path);
        }

        _spriteFrameToSpriteSheetMap.erase(itFrame);  // update index frame->plist

        return true;
    }
    return false;
}

One possible way to fix this is to add a temporary std::string instance of that string_view, such as:

bool SpriteFrameCache::eraseFrame(std::string_view frameName)
{
    const auto tmpFrameName = std::string(frameName);
    _spriteFrames.erase(tmpFrameName);  // drop SpriteFrame
    const auto itFrame = _spriteFrameToSpriteSheetMap.find(tmpFrameName);
    if (itFrame != _spriteFrameToSpriteSheetMap.end())
    {
        auto& spriteSheet = itFrame->second;
        spriteSheet->full = false;
        spriteSheet->frames.erase(tmpFrameName);

        if (spriteSheet->frames.empty())
        {
            _spriteSheets.erase(spriteSheet->path);
        }

        _spriteFrameToSpriteSheetMap.erase(itFrame);  // update index frame->plist

        return true;
    }
    return false;
}

This will fix the issue, but it's not the only place where this seems to be a problem.

Now, on iOS, there is a bigger problem, and this one can reproduced very easily in both the iOS simulator and the real device. If Director::getInstance->getRenderer()->render() is called after a getting a RenderTexture::saveToFile(), then it will crash with invalid memory access to the filename being used. If the address sanitizer is enabled, it will show a use-after-free error in the same location as the iOS crash. The std::string_view is freed prior to the onSaveToFile callback, which is what causes the use-after-free error.

void RenderTexture::onSaveToFile(std::string_view filename, bool isRGBA, bool forceNonPMA)
{
    auto callbackFunc = [&, filename, isRGBA, forceNonPMA](RefPtr<Image> image) {
        if (image)
        {
            if (forceNonPMA && image->hasPremultipliedAlpha())
            {
                image->reversePremultipliedAlpha();
            }
            image->saveToFile(filename, !isRGBA); //<<<<<< filename is NOT valid here: use-after-free error.
        }
        if (_saveFileCallback)
        {
            _saveFileCallback(this, filename);
        }
    };
    newImage(callbackFunc);
}

This can be seen in the cpp-tests `Node::Render Texture, test 1, when "Save Image PMA" or "Save Image Non-PMA" is clicked.

If the call to Director::getInstance->getRenderer()->render() is removed in the button handlers, then there is no crash, as the filename seems to be valid, and the image is saved successfully. For example:

void RenderTextureSave::saveImageWithPremultipliedAlpha(ax::Ref* sender)
{
    static int counter = 0;

    char png[20];
    sprintf(png, "image-pma-%d.png", counter);

    auto callback = [&](RenderTexture* rt, std::string_view path) {
        auto sprite = Sprite::create(path);
        addChild(sprite);
        sprite->setScale(0.3f);
        sprite->setPosition(Vec2(40.0f, 40.0f));
        sprite->setRotation(counter * 3);
        _target->release();
    };

    _target->retain();
    _target->saveToFile(png, Image::Format::PNG, true, callback);
    // Add this function to avoid crash if we switch to a new scene.
    //Director::getInstance()->getRenderer()->render(); //<=== THIS LINE COMMENTED OUT FIXES THE ISSUE
    AXLOG("Image saved %s", png);

    counter++;
}

I don't quite understand why this is happening, or what difference a call to Director::getInstance->getRenderer()->render() would make. I cannot reproduce this on Windows (haven't tested Android yet).

Does anyone have any idea why this would be happening on iOS?

crazyhappygame commented 1 year ago

I think that problem is in line:

void RenderTexture::onSaveToFile(std::string_view filename, bool isRGBA, bool forceNonPMA)
{
    auto callbackFunc = [&, filename, isRGBA, forceNonPMA](RefPtr<Image> image) {

std::string_view filename is captured inside lambda by value. If std::string owning string_view is destroyed - we have undefined behaviour

Personally I think that filename should be passed to onSaveToFile as const std::string&

Antother option would be to copy string_view to string

void RenderTexture::onSaveToFile(std::string_view filenameView, bool isRGBA, bool forceNonPMA)
{
    std::string filename = std::string(filenameView);
    auto callbackFunc    = [&, filename, isRGBA, forceNonPMA](RefPtr<Image> image) {

@halx99 This is critical issue. I think we should revisit usage of std::string_view in entire axmol code. I think that all places were std::string_view is captured by lambda are potential bugs

Please check e.g.: https://abseil.io/tips/1

Because the string_view does not own its data, any strings pointed to by the string_view (just like a const char*) must have a known lifespan, and must outlast the string_view itself. This means that using string_view for storage is often questionable: you need some proof that the underlying data will outlive the string_view.

rh101 commented 1 year ago

std::string_view filename is captured inside lambda by value. If std::string owning string_view is destroyed - we have undefined behaviour

Yes, that is exactly why it's crashing, and the fact that it isn't happening on Windows was a little confusing, but now you mentioned the key words, "undefined behaviour". That seems to be what is happening here.

We'll see what opinion @halx99 has regarding the most appropriate way to fix this, and go from there, because all usage of std::string_view needs to be checked.

crazyhappygame commented 1 year ago

@rh101 Do I understand it correctly that this bug is reproducible 100% with "cpp-tests" and "Node::Render Texture, test 1, when "Save Image PMA" or "Save Image Non-PMA" ?

rh101 commented 1 year ago

@rh101 Do I understand it correctly that this bug is reproducible 100% with "cpp-tests" and "Node::Render Texture, test 1, when "Save Image PMA" or "Save Image Non-PMA" ?

Yes, for both iOS simulator and physical iPhone device builds. I couldn't reproduce the issue with a Windows x64 build.

Also tested a new project, adding the following code added to the init() of the default HelloWorldScene.cpp file that is created:

auto* button = ui::Button::create();
button->setTitleText("SAVE IMAGE");
button->setTitleFontSize(64);
button->setPositionNormalized(Vec2(0.5f, 0.5f));
button->addClickEventListener([&](Ref* sender) {
    auto* rt = RenderTexture::create(100, 100, PixelFormat::RGBA8);
    rt->beginWithClear(0, 0, 0, 255);
    this->visit();
    rt->end();
    rt->saveToFile("testfile.png", true, [](RenderTexture*, std::string_view) {
    });
    Director::getInstance()->getRenderer()->render();
});
addChild(button, 1000);

Run an iOS build, click the "SAVE IMAGE", and it'll crash when attempting to use the filename. If you comment out the line Director::getInstance()->getRenderer()->render();, then for some reason it doesn't crash (at least it didn't for me).

crazyhappygame commented 1 year ago

@rh101 @halx99 I think that it should be possible to run from Github Action cpp-tests executable on Windows and MacOS as last step after build. With that we can detect all crashes similar to this and e.g.: running below one

cpp-test --run-all-tests

https://github.com/axmolengine/axmol/issues/908 Not sure about Linux hosts

crazyhappygame commented 1 year ago

About

Director::getInstance()->getRenderer()->render();

My guess is that with this line we force to call callback immediately and with that we decrease chance of "undefined behavior"

rh101 commented 1 year ago

My guess is that with this line we force to call callback immediately and with that we decrease chance of "undefined behavior"

That seems to be the case.

halx99 commented 1 year ago

I make a fix commit for above issues: https://github.com/axmolengine/axmol/commit/28eb9e1886d037082543dd5fc125b2e4e63ed553

rh101 commented 1 year ago

Excellent, thank you!

crazyhappygame commented 1 year ago

@halx99 Thank you for https://github.com/axmolengine/axmol/commit/28eb9e1886d037082543dd5fc125b2e4e63ed553 I left one question.

I see that you fixed problems described in this ticket but what about other places? Is it possible that we have similar bugs in other places?

rh101 commented 1 year ago

The specific issues listed have been fixed, so closing this.