accellera-official / systemc

SystemC Reference Implementation
https://systemc.org/overview/systemc/
Apache License 2.0
476 stars 152 forks source link

problems with RAII in C++ >= 11 #28

Open Xeverous opened 2 years ago

Xeverous commented 2 years ago

I have a project in work where we manage some objects using fifo and because systemc API does not support move operations there is a problem with expensive and/or move-only objects.

Currently the issue is resolved by very ugly code like:

// producer place
fifo.write(new T(...));

// consumer place
auto ptr = fifo.read();
// use ptr...
delete ptr;

I would like to avoid pointer/lifetime issues yet I know that API is designed only with C++98/03 in mind. So I tried to use smart pointers which should keep any resources correctly managed.

I have made a reproducible example based on http://learnsystemc.com/basic/channel_fifo which showcases leaks - 4 objects remain even though I used only smart pointers:

#include <systemc>
#include <memory>
#include <utility>
using namespace sc_core;
using std::cout;

class blabber
{
public:
    blabber(int x): x(x)
    {
        cout << "ctor: " << x << "\n";
        ++instances;
    }

    blabber(const blabber& other)
    : x(other.x)
    {
        cout << "copy ctor: " << x << "\n";
        ++instances;
    }

    blabber(blabber&& other) noexcept
    : x(std::exchange(other.x, -1))
    {
        cout << "move ctor: " << x << "\n";
        ++instances;
    }

    blabber& operator=(blabber other) noexcept
    {
        std::swap(x, other.x);
        cout << "assignment: " << x << "\n";
        return *this;
    }

    ~blabber()
    {
        cout << "dtor: " << x << "\n";
        --instances;
    }

    operator int() { return x; }

    static int get_num_instances() { return instances; }

private:
    int x = -1;
    static int instances;
};

int blabber::instances = 0;

SC_MODULE(FIFO) {
    sc_fifo<std::shared_ptr<blabber>> f1, f2, f3;
    SC_CTOR(FIFO) : f1(2), f2(2), f3(2) { // fifo with size 2
        SC_THREAD(generator1);
        SC_THREAD(consumer1);

        SC_THREAD(generator2);
        SC_THREAD(consumer2);

        SC_THREAD(generator3);
        SC_THREAD(consumer3);
    }
    void generator1() { // blocking write
        int v = 10;
        while (true) {
            f1.write(std::make_shared<blabber>(v)); // same as f = v, which is not recommended.
            std::cout << sc_time_stamp() << ": generator1 writes " << v++ << std::endl;
            wait(1, SC_SEC); // write every 1 s
        }
    }
    void consumer1() { // blocking read
        while (true) {
            auto ptr = f1.read();
            std::cout << sc_time_stamp() << ": consumer1 reads " << *ptr << std::endl;
            wait(3, SC_SEC); // read every 3 s, fifo will fill up soon
        }
    }
    void generator2() { // non-blocking write
        int v = 20;
        while (true) {
            auto ptr = std::make_shared<blabber>(v);
            while (f2.nb_write(ptr) == false ) { // nb write until succeeded
                wait(f2.data_read_event()); // if not successful, wait for data read (a fifo slot becomes available)
            }
            std::cout << sc_time_stamp() << ": generator2 writes " << v++ << std::endl;
            wait(1, SC_SEC); // write every 1 s
        }
    }
    void consumer2() { // non-blocking read
        while (true) {
            std::shared_ptr<blabber> ptr;
            while (f2.nb_read(ptr) == false) {
                wait(f2.data_written_event());
            }
            std::cout << sc_time_stamp() << ": consumer2 reads " << *ptr << std::endl;
            wait(3, SC_SEC); // read every 3 s, fifo will fill up soon
        }
    }
    void generator3() { // free/available slots before/after write
        int v = 30;
        while (true) {
            std::cout << sc_time_stamp() << ": generator3, before write, #free/#available=" << f3.num_free() << "/" << f3.num_available() << std::endl;
            f3.write(std::make_shared<blabber>(v++));
            std::cout << sc_time_stamp() << ": generator3, after write, #free/#available=" << f3.num_free() << "/" << f3.num_available() << std::endl;
            wait(1, SC_SEC);
        }
    }
    void consumer3() { // free/available slots before/after read
        while (true) {
            std::cout << sc_time_stamp() << ": consumer3, before read, #free/#available=" << f3.num_free() << "/" << f3.num_available() << std::endl;
            auto ptr = f3.read();
            std::cout << sc_time_stamp() << ": consumer3, after read, #free/#available=" << f3.num_free() << "/" << f3.num_available() << std::endl;
            wait(3, SC_SEC); // read every 3 s, fifo will fill up soon
        }
    }
};

int sc_main(int, char*[]) {
    {
        FIFO fifo("fifo");
        sc_start(10, SC_SEC);
    }

    cout << "instances: " << blabber::get_num_instances() << "\n";
    return 0;
}

execution log:


        SystemC 2.3.4_pub_rev_20191203-Accellera --- Mar  7 2022 16:00:29
        Copyright (c) 1996-2019 by all Contributors,
        ALL RIGHTS RESERVED
ctor: 10
0 s: generator1 writes 10
ctor: 20
0 s: generator2 writes 20
0 s: generator3, before write, #free/#available=2/0
ctor: 30
0 s: generator3, after write, #free/#available=1/0
0 s: consumer3, before read, #free/#available=1/0
0 s: consumer1 reads 10
0 s: consumer2 reads 20
0 s: consumer3, after read, #free/#available=1/0
ctor: 11
1 s: generator1 writes 11
ctor: 21
1 s: generator2 writes 21
1 s: generator3, before write, #free/#available=2/0
ctor: 31
1 s: generator3, after write, #free/#available=1/0
ctor: 12
2 s: generator1 writes 12
ctor: 22
2 s: generator2 writes 22
2 s: generator3, before write, #free/#available=1/1
ctor: 32
2 s: generator3, after write, #free/#available=0/1
dtor: 30
3 s: consumer3, before read, #free/#available=0/2
3 s: consumer3, after read, #free/#available=0/1
3 s: generator3, before write, #free/#available=0/1
ctor: 33
ctor: 23
ctor: 13
dtor: 10
3 s: consumer1 reads 11
dtor: 20
3 s: consumer2 reads 21
3 s: generator3, after write, #free/#available=0/1
3 s: generator1 writes 13
3 s: generator2 writes 23
4 s: generator3, before write, #free/#available=0/2
ctor: 34
ctor: 14
ctor: 24
dtor: 11
6 s: consumer1 reads 12
dtor: 31
6 s: consumer3, before read, #free/#available=0/2
6 s: consumer3, after read, #free/#available=0/1
dtor: 21
6 s: consumer2 reads 22
6 s: generator1 writes 14
6 s: generator3, after write, #free/#available=0/1
6 s: generator2 writes 24
ctor: 15
7 s: generator3, before write, #free/#available=0/2
ctor: 35
ctor: 25
dtor: 32
9 s: consumer3, before read, #free/#available=0/2
9 s: consumer3, after read, #free/#available=0/1
dtor: 12
9 s: consumer1 reads 13
dtor: 22
9 s: consumer2 reads 23
9 s: generator3, after write, #free/#available=0/1
9 s: generator1 writes 15
9 s: generator2 writes 25
dtor: 35
dtor: 34
dtor: 24
dtor: 15
dtor: 14
instances: 4

Using GCC Ubuntu 9.4.0-1ubuntu1~20.04.1

lanphon commented 1 year ago

I've reproduced the same error on macOS with clang. I guess you also used the same systemc-build with quickthread, which could explain why there are still 4 objects left. The deep reason is how SystemC implements the coroutine. With quick-thread, each sc_thread would have its own call-stack, allocating from the heap memory directly. When simulation is done, all the allocated memory part would be freed naturally. But if there is still some smart pointer on the callstack, this memory free cannot know it, which causing the leak. For this example, it's cleared that if an object has a smart pointer referred, and if the smart pointer is on the stack, it will leak. For example, consumer1/2/3 both has a ptr stored on the stack part, and the last object referred by those stacks(13/23/33) will be leaked. On the other hand, the producer2 would also hold a smart pointer for the object created, the value of which is 25, leading to leak finally. So in the end, 4 objcts are leaked. This leaking is hard to resolve with quickthread. Maybe you can try systemc with pthread solution, with a performance drop as the cost. But I think the leak will only happens when simulation is stopped iin the end, so this may not be a problem from this perspective, since OS would help to de-allocated all the leaked resource finally.

Xeverous commented 1 year ago

I see 3 possible solutions:

1) (ideal) just support C++11 and move semantics

2) use "a more correct" implementation (I don't know much about the inner workings of SystemC so it can be just wishful thinking, on the other hand it would be great if it was just a bug)

3) (probably high cost or impossible given API constraints) rewrite the implementation to fully support C++ semantics. I'm suspicious something shady is happening inside because the execution itself is single-threaded AND it can perform arbitrary jumps without use of exceptions - just like in my example, the library is able to exit an infinite loop. I have no knowledge of quickthread but the potential usage of some call stack shenanigans might be the reason for the library's magic ability to exit "unexitable" code and to break lifetime semantics. I'm still surprised that the specification allows (or forces) an API that has to tamper with the call stack. I would like to see at least a justification why it's designed this way (magic jumps) and not something like while (sc_core::is_running()).

lanphon commented 1 year ago

The magic ability to jump without exception is implemented with assemble code, instead of using existing API, in quickthread cases. The concept of this, known as coroutine, has just been introduced into C++ for C++20(and C++20 defines a stackless coroutine). I wonder if the Systemc could be rewrite with standard solution if possible, but it seems cost too much.

Xeverous commented 1 year ago

I do business with her as well, but she has asked me to keep those dealings private for now...

I'm more concerned with the fact that SystemC specification (which defines an API for implementers) has never been updated even to reflect C++11. The design of the API still stays on 90/03 level.