cyrus-and / gdb-dashboard

Modular visual interface for GDB in Python
MIT License

OverflowError: int too big to convert #272

Closed hijkzzz closed 2 years ago

hijkzzz commented 2 years ago

ENV: Ubuntu 20.04 + GDB 9.2

The following error appears while I'm debugging:

─── Output/messages ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
testing::RunInferTest (argc=0, argv=0x7ffe355717d0) at tests/unitTests/driver.cpp:876
876     {
Traceback (most recent call last):
  File "/lib/x86_64-linux-gnu/../../share/gcc/python/libstdcxx/v6/printers.py", line 903, in to_string
    return ptr.lazy_string (length = length)
OverflowError: int too big to convert
cyrus-and commented 2 years ago

Does it happen without the dashboard too?

hijkzzz commented 2 years ago

Does it happen without the dashboard too?

No. When I use 'dashboard -enabled off', everything works fine.

cyrus-and commented 2 years ago

Weird, since the error is in that Python file (which is not the dashboard). Anyway, can you tell me how to reproduce this?

hijkzzz commented 2 years ago

I used `gdb-dashboard` to debug TensorRT, so this is not easy to reproduce (we can't open-source the code).

cyrus-and commented 2 years ago

The dashboard simply prints some values, and that libstdcxx printer is used to do that. So the real test is not just to debug the app with dashboard -enabled off; rather, you have to go to the same spot where the error arises and print the culprit variable or argument.

hijkzzz commented 2 years ago

This error is caused by the Variables module. When I use 'dashboard variables' to disable this module, everything works fine.

Then I enable it:

>>> dashboard variables
Traceback (most recent call last):
  File "/lib/x86_64-linux-gnu/../../share/gcc/python/libstdcxx/v6/printers.py", line 903, in to_string
    return ptr.lazy_string (length = length)
OverflowError: int too big to convert

The error occurs in this function:

bool RunInferTest(int32_t const argc, char const** argv)
{
    std::shared_ptr<Options> opts{new Options};
    std::shared_ptr<Capabilities> caps{new Capabilities};
    std::unordered_set<std::string> msgs;

    auto ret = cudaFree(nullptr);
    if (ret != cudaSuccess)
    {
        std::cout << "CUDA initialization failed with error code " << ret << ": " << cudaGetErrorString(ret)
                  << " Please make sure cuda is setup correctly." << std::endl;
        return false;
    }

    dumpCmdline(argc, argv);

    bool useSeed = false;

Log of Variables:

─── Variables ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
arg argc = 0, argv = 0x7fffe0930730: 0 '\000'
loc opts = std::shared_ptr<testing::Options> (use count 904578768, weak count 21859) = {get() = 0x556435ec8a50}, caps = <error reading variable: Cannot access memory at address 0xd>, msgs = std::unordered_set with 2 elements<error reading variable: Cannot access memory at address 0xa83e40f…, ret = cudaSuccess, useSeed = 147, hwCtx = {gpuProps = {mModelSpec = {major = -527234912,minor = 32767,maxCoreClockRate = 771200677,maxMemoryCl…, allTests = std::vector of length 430, capacity 2392 = {[0] = std::unique_ptr<testing::Test> = {get() = 0xffff77…, filters = std::vector of length -90277778638, capacity 45046222438 = {[0] = {exclude = 36,pat = ,subtestMask =…, numSelectedTests = 1, generator = {static multiplier = <optimized out>,static increment = <optimized out>,static modulus = <optimized …, seedDistribution = {_M_param = {_M_a = 2,_M_b = 0}}, holder = std::unique_ptr<nvinfer1::IBuilder> = {get() = 0x7fffe0930c08}, status = 224

I'm not sure yet which variable is causing this error.

cyrus-and commented 2 years ago

Yes, now disable that module, or the dashboard altogether, and try to manually print all the variables and arguments from the GDB prompt, e.g., print argc, print argv, print opts, etc.
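
That manual check can also be scripted. The following is a minimal sketch, not part of the dashboard: `report_each` and `frame_variables` are hypothetical helpers. Each name is announced before its value is rendered, so a traceback printed by a pretty printer lands directly under the variable that triggered it. The GDB-specific part relies on `gdb.selected_frame()`, `Frame.block()`, and `Symbol.value()` from GDB's Python API, and it only runs when the file is sourced inside GDB with the program stopped.

```python
def report_each(variables, render, emit=print):
    """Render variables one at a time, announcing each name first.

    Returns the names whose rendering raised a Python exception.
    """
    failed = []
    for name, value in variables:
        emit(f"--- {name} ---")
        try:
            emit(render(value))
        except Exception as exc:  # keep going so every variable is tried
            failed.append(name)
            emit(f"<{type(exc).__name__}: {exc}>")
    return failed


def frame_variables(frame):
    """Yield (name, gdb.Value) for the arguments and locals of a GDB frame."""
    block = frame.block()
    while block is not None:
        for symbol in block:
            if symbol.is_argument or symbol.is_variable:
                yield symbol.name, symbol.value(frame)
        if block.function is not None:  # reached the function's outermost block
            break
        block = block.superblock


try:
    import gdb  # only importable inside a GDB session
except ImportError:
    gdb = None

if gdb is not None:
    frame = gdb.selected_frame()
    report_each(frame_variables(frame), str)
```

Outside GDB the last block is skipped, so the helper can be tried with ordinary Python values first.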

hijkzzz commented 2 years ago

I found that filters is reported as a very long vector:

    using GTestFilterVec = std::vector<GTestFilterEntry>; 

    // Set up filtering patterns
    GTestFilterVec filters;
    try
    {
        if (!opts->testListFile.empty())
        {
            filters = parseTestListFile(opts->testListFile);
        }
        else
        {
            filters = parseGtestFilter(opts->gtestFilter);
        }
    }

print filters.size()
$21 = -91053350784

Does the int overflow for extremely long vectors? How can I fix this problem?

I noticed that VSCode cpptools truncates this vector to show only the first 1000 items.
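
On the overflow question: a negative size such as -91053350784 cannot be a real element count. One plausible reading, stated here as an assumption rather than something confirmed in this thread, is that the object's internal pointers hold garbage (for example because it has not been constructed yet at that point), so any length derived from them is meaningless and may not fit a fixed-width integer. The traceback points at a `lazy_string (length = length)` call, so the offending length is presumably that of a string inside the data rather than the vector size itself, but the mechanism would be the same. The pointer values and element size below are hypothetical; `int.to_bytes` is used only because it raises the same `OverflowError` message when a Python integer does not fit a given width.

```python
def length_from_pointers(start, finish, element_size):
    """Element count implied by a begin/end pointer pair."""
    return (finish - start) // element_size


def fits_signed(value, bits=32):
    """Return True if value is representable as a signed integer of `bits` bits."""
    try:
        value.to_bytes(bits // 8, "little", signed=True)
        return True
    except OverflowError:  # message: "int too big to convert"
        return False


# Hypothetical garbage: `finish` lies below `start`, so the length is negative.
start = 0x7FFF_0000_0000
finish = start - 48 * 1_000
print(length_from_pointers(start, finish, 48))  # -1000

print(fits_signed(1000))           # True
print(fits_signed(-91053350784))   # False: the size reported above needs more than 32 bits
```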

hijkzzz commented 2 years ago

Hi, I reproduced this error (in the VSCode cpptools terminal):

-exec print filters
$2 = std::vector of length -90233658511, capacity 45087910289 = {[0] = {
    exclude = 36,
Traceback (most recent call last):
  File "/lib/x86_64-linux-gnu/../../share/gcc/python/libstdcxx/v6/printers.py", line 903, in to_string
    return ptr.lazy_string (length = length)
OverflowError: int too big to convert

It seems like a truncation here would speed things up a lot and avoid this kind of error.

cyrus-and commented 2 years ago

If you're allowed to edit that file, yes, patching it seems the obvious solution. Also you might want to check whether this is a known issue, or if you're using the latest version.

hijkzzz commented 2 years ago

Yes, I just installed this GDB plugin today. Where can I truncate the output for a long array or vector (e.g., size = 1000000000000000) in the Variables module?

cyrus-and commented 2 years ago

Also you might want to check whether this is a known issue, or if you're using the latest version.

The error is not in my code; I was talking about the C++ pretty printer. You might want to patch that.
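
As an illustration of what such a patch could look like, the sketch below clamps a length before it is handed to a call like `ptr.lazy_string (length = length)`, the line shown in the traceback. `clamp_length` and its default limit of 200 are hypothetical and not taken from libstdc++. The surrounding printer code is not shown in this thread, so treat this as a possible shape for a local workaround, not as the upstream fix.

```python
def clamp_length(length, limit=200):
    """Clamp a possibly garbage length to the range [0, limit].

    Assumption: a negative length is treated as "nothing to show", which
    differs from passing -1 to lazy_string (read up to the first null).
    """
    if length < 0:
        return 0
    return min(length, limit)


# Hypothetical use inside a printer's to_string():
#     return ptr.lazy_string(length=clamp_length(length))
print(clamp_length(12))              # 12
print(clamp_length(10**15))          # 200
print(clamp_length(-91053350784))    # 0
```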

hijkzzz commented 2 years ago

But truncating extremely long vectors in your code seems like a good thing (VSCode cpptools does this). I found that the above error did not stop GDB from running, but it was very, very slow, as your code would have to print the long vector each time.

cyrus-and commented 2 years ago

There's already a feature that does that (dashboard -style max_value_length), but it operates after a value is received and hence, I think, after the error. The dashboard is not aware of what a vector is; this is IMHO something to be handled at the pretty printer level.

Additionally, if a value resembles an array, then there are some native GDB options that you can use and that are honoured by the dashboard, like set print elements, see here. The fact that those are not taken into consideration when printing those vectors once again suggests that this is an issue with the pretty printer.
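
To illustrate why a limit applied after the fact cannot help here, the following sketch models the order of operations. `truncate` and `render_then_truncate` are hypothetical stand-ins, not the dashboard's actual code: the value is fully rendered first, which is where a pretty printer would fail or spend its time, and only then shortened.

```python
def truncate(text, max_length, ellipsis="…"):
    """Shorten already-rendered text to at most max_length characters."""
    if max_length <= 0 or len(text) <= max_length:
        return text
    return text[: max_length - len(ellipsis)] + ellipsis


def render_then_truncate(value, render, max_length):
    """Render first, truncate second: the limit never reaches the renderer."""
    rendered = render(value)  # full cost (and any error) happens here
    return truncate(rendered, max_length)


# The renderer still walks all 100000 items before the result is cut to 20 characters.
print(render_then_truncate(range(100_000), lambda v: str(list(v)), 20))
```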

hijkzzz commented 2 years ago

Thanks

cyrus-and commented 2 years ago

So I'm closing this as there's nothing for me to fix. Feel free to comment if you come up with something. Thank you.