open-watcom / open-watcom-v2

Open Watcom V2.0 - Source code repository, Wiki, Latest Binary build, Archived builds including all installers for download.
Other
958 stars 157 forks source link

Stack damage on quiting the block: RedBlack tree sometimes returns a wrong data through std::set or std::map (DOS4GW or CauseWay, also, Windows and Linux too!) #391

Open Wohlstand opened 6 years ago

Wohlstand commented 6 years ago

I have my own project which is works fine on GCC, Clang, DJGPP and MSVC. I have tried to port it to OpenWattcom and build it for DOS (I have tried to use both DOS4GW and CauseWay). It gives an invalid data while using std::map and std::set to insert/search/remove. Because of invalid data (which was an array index), a crash happens later on attempt to use it as array index.

P.S. I have tried to implement some sort of workarounds to use std::vector as base for a fake std::set and std::map implementations to check out what will happen, and I have found that crash still happen even with avoiding of rbtree usage...

P.P.S. I have checked out my project for possible memory leaks via Valgrind, and it have found nothing while long running.

pchapin commented 6 years ago

Can you provide a simple test case that illustrates incorrect results?

Wohlstand commented 6 years ago

Tiny test for std::set I made, unfortunately, doesn't reproduces crash or invalid data nowhere

#include <stdio.h>
#include <map>
#include <set>
#include <stdlib.h>

typedef std::set<int> IntS;

int main()
{
    IntS test;
    IntS tmp;
    int i = 0;
    IntS::iterator it;

    char rotator[] = "-\\|/";

    for(int x = 0; x < 5000; x++)
    {
        int randLimit = rand() % 200000;
        int at = rand() % 50000;
        int count = rand() % 30;
        while(count == 0) // Avoid infinite loop!!!
            count = rand() % 30;

        test.clear();
        tmp.clear();

        printf("%c %d (%d) at: %d; count: %d        \r", rotator[x % 4], x, randLimit, at, count);
        fflush(stdout);

        for(int i = 0; i < randLimit; i++)
            test.insert(i);
        for(int i = at; i < randLimit; i += count)
        {
            tmp.insert(*(test.find(i)));
            test.erase(i);
        }
        test.clear();
        for(it = tmp.begin(); it != tmp.end(); it++)
        {
            test.insert(*it);
        }
        tmp.clear();

        for(i = at, it = test.begin(); it != test.end(); i += count)
        {
            //printf("%d == %d\n", i, *it);
            if(i != *it)
            {
                printf("\nYEAOUCH (1), IT'S PAINFUL! %d is not equal to %d!!!\n\n", *it, i);
                return 1;
            }
            if(i % 2)
            {
                //IntS::iterator c = it;
                //c++;
                test.erase(it++);
                //it = c;
            }
            else
            {
                tmp.insert(*it);
                ++it;
            }
        }

        //printf("-----------\n\n");

        for(IntS::iterator it = test.begin(), jt = tmp.begin(); it != test.end(); it++, jt++)
        {
            if(*it != *jt)
            {
                printf("\nYEAOUCH (2), IT'S PAINFUL! %d is not equal to %d!!!\n\n", *it, *jt);
                return 1;
            }
        }
    }

    printf("\nEverything is ok :3\n\n");

    return 0;
}

However, I have tried to debug my app on Linux or on Windows by disabling SDL linking which I wasn't built via OpenWatcom (I made the WAV dumping instead of play) and I have found a weird freeze caused by infinite loop here: default which happens probably here: https://github.com/Wohlstand/libADLMIDI/blob/master/src/adlmidi_midiplay.cpp#L221

or here while iteration: https://github.com/Wohlstand/libADLMIDI/blob/master/src/adlmidi_midiplay.cpp#L242

Here I made std::set where I doing insert and erase

I'll make a separated branch with including the Watcom IDE project which you will be able to run and debug

EDIT: Just now I have created a branch where I have included an native IDE project (You can fnid it in a projects/watcom folder): https://github.com/Wohlstand/libADLMIDI/tree/openwatcom-debug I trying to debug the workflow myself in parallel

The building utility must generate the WAV file from MIDI file that have been passed as argument of utility.

P.S. On Linux and on Windows the std::fflush(stdout) call causes by unknown reason the SIGSTOP signal 😞 (reason why I have masked it by the macro for the Watcom)

EDIT2: I have tried to trace content of that std::set container from various places and I have found that damage is coming on exiting from for(size_t i = 0; i < anyOther.size(); i++)'s loop and a placed print showing me a totally different data

Wohlstand commented 6 years ago

I did a small test: Recently I have been added some debug prints to illustrate inside of std::vector and std::set on doing various actions. I also modified _rbtree.h to print addresses of "n" and "n->left" to figure out why infinite loop is happens: default So, you see that loop happens because the chain of pointers is a ring: default

rgayoso commented 6 years ago

I also had all kinds of strange problems with set and map, with large objects on it (stlport). I found that the stack space was not committed. I always used 'wlink st=1m' but that was not enough. Everything worked ok when I added 'commit st=1m' It seems that NT automatic stack expansion over already reserved stack area doesn't work.

wlink sys nt op q,c,st=1m,h=1m commit st=1m

Ricardo

Wohlstand commented 6 years ago

Thanks for a tip, @rgayoso , ill try to do that, however, I using owcc as it partially matches GCC's syntax.

jmalak commented 6 years ago

Please could you check PE header what is there? OW linker should setup stack space to 1MB and it should be commited.

Wohlstand commented 6 years ago

Okay, I have been tried to use 1m and then 15m stacks, but result is still be same: I did the debug output and I see the damage happens after quit from one of blocks: default

I.e.:

std::set<size_t> markAsOn; //it's empty at here
loop
{
    if(...)
    {
        //Insert some into markAsOn
        loop
        {
            if(...)
            {
                //Some of elements may be erased here, but in my case no one element was erased
            }
        }           
    }
    //Content of markAsOn is still be same
}
//Damage of markAsOn happen here. Printing of content is not matches previous one.

The generated Win32 EXE is: http://wohlsoft.ru/docs/_files_for_posts/Misc/OpenWattcomDebug/playmidi-debug-crashful.zip

EDIT: @jmalak, I have been checked the PE header: Yeah, it commits 15 MB as I have requested: default

By default header is: default 1) Reserve 1 MB 2) Commit 64 KB 3) Heap reserve size 8 KB

rgayoso commented 6 years ago

Did you do both the reserve and commit?

Wlink ….. op st=1m commit st=1m

I had all kinds of strange errors using set and map, that disappeared when storing ‘pointers to objects’ instead of objects.

After several hours of debugging, map and set use some automatic variables of the type of its objects.

The ‘commit st=1m’ fixed everything. If not present, only 64k is committed.

From: Vitaly Novichkov [mailto:notifications@github.com] Sent: Tuesday, December 5, 2017 4:19 PM To: open-watcom/open-watcom-v2 open-watcom-v2@noreply.github.com Cc: rgayoso reg@alfanuclear.com; Mention mention@noreply.github.com Subject: Re: [open-watcom/open-watcom-v2] RedBlack tree sometimes returns a wrong data through std::set or std::map (DOS4GW or CauseWay, also, Windows and Linux too!) (#391)

Okay, I have been tried to use 1m and then 15m stacks, but result is still be same: I did the debug output and I see the damage happens after quit from one of blocks: https://user-images.githubusercontent.com/6751442/33625956-4b62cd0c-da0a-11e7-8e2b-fb83f0bed3ac.png

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/open-watcom/open-watcom-v2/issues/391#issuecomment-349411403 , or mute the thread https://github.com/notifications/unsubscribe-auth/AbSMp41ISFb6WcNJtmWC93KzsZkrDLGyks5s9ZcvgaJpZM4QuzZC . https://github.com/notifications/beacon/AbSMp6pqivEM-1NasN1cD4vYUXL_9Xdbks5s9ZcvgaJpZM4QuzZC.gif

Wohlstand commented 6 years ago

Did you do both the reserve and commit?

Yes, I did that, and this wasn't fixed my issue yeah

The ‘commit st=1m’ fixed everything. If not present, only 64k is committed.

This also wasn't helped me: only changed result number I got on damaged memory:

This I got when I used op st=??m: default

and This I got when I using only "commit st=??m": default

And, obviously, as damaged value was used as array index, I got "access violation" crash (aka SIGSEGV) default

The complete command line of wlink: default

P.S. I forgot to tell that to build my project you need for modified "_rbtree.h" and "vector" headers I recently commited to here to make them better match a standard.

P.P.S. Most of working code stored in .LIB file I built separately (in the same project, but as different component).

jmalak commented 6 years ago

I will check thread stack allocation if it is properly growing and fix if necessary.

Wohlstand commented 6 years ago

Recently after long time of waiting and watching, I have tried to debug my program again with most fresh stuff from ow-travis, however, because of this issue I'm no more able to compile Windows and Linux binaries, and I only able to compile DOS32. So, I can't use debugger, except the case I'll run it under DosBox with a fear that DosBox itself will also crash...

Wohlstand commented 6 years ago

Okay, I made the workaround to finally get the building back: I disabled using templates are causing compiler to be crashed. However, the crash of built executable now happens in another place, but it still happen!