HISKP-LQCD / sLapH-contractions

Stochastic LapH contraction program
GNU General Public License v3.0
3 stars 3 forks source link

on some machines, std::runtime_error does not print "what()" #101

Open kostrzewa opened 5 years ago

kostrzewa commented 5 years ago

On some machines (we have at least one example), std::runtime_error does not print its what() field when thrown (and not caught).

This, of course, makes locating the exception basically impossible (when not running in a debugger).

Possible solutions

1) create our own exception class which inherits from std::runtime_error and ALWAYS prints upon construction 1) catch all exceptions and handle them 1) don't use exceptions for things which are downright errors

what do you think @martin-ueding ?

kostrzewa commented 5 years ago

@martin-ueding Is there something that I am missing? I.e., is there a way to force compilers to always print the what() field before crapping out?

kostrzewa commented 5 years ago

Maybe we could move to this as a general way of handling runtime errors: https://stackoverflow.com/a/47618868 ?

kostrzewa commented 5 years ago

I just added the following exception classes to my fork of CVC and they work exactly as expected:

#include <stdexcept>
#include <string>
#include <iostream>

namespace cvc {

class runtime_error : public std::runtime_error {
  public:
    runtime_error( const std::string& what_arg, const std::string& function_name ) :
      std::runtime_error(what_arg)
    {
      std::cout << "[cvc::runtime_error]: exception in " << 
        function_name << " : " <<
        what_arg << std::endl; 
    }
};

class invalid_argument : public std::invalid_argument {
  public:
    invalid_argument( const std::string& what_arg, const std::string& function_name ) :
      std::invalid_argument(what_arg)
    {
      std::cout << "[cvc::invalid_argument]: exception in " <<
        function_name << " : " <<
        what_arg << std::endl;
    }
};

} // namespace(cvc)
martin-ueding commented 5 years ago

create our own exception class which inherits from std::runtime_error and ALWAYS prints upon construction

That does not really help with all the other exceptions.

don't use exceptions for things which are downright errors

I always flinch when I read “don't use exceptions” and think of nightmares like this guy not using exceptions, then realizing that he cannot propagate errors from the ctor, then stops using ctors and creates an int init() function, has partially constructed objects and just makes it worse with every step.

Okay, if we call std::abort() right after, I'd be fine. But I hate error codes with a passion. So if we have the following stop and use it just like in R, I can live with that.

void stop(std::string const &what) {
  std::cerr << what << std::endl;
  std::abort();
}

catch all exceptions and handle them

What about just this here? Appears like best of both worlds:

int main(int const argc, char const* const* const argv) {
  try {
    real_main(argc, argv);
  } except(std::exception const& e) {
    std::cerr << e.what() << std::endl;
    std::rethrow_exception(e);
  }
}

I haven't tested this, but something along these lines. Briefly catching everything, printing the what do the console and then re-throwing it such that one can catch it in a debugger.

References:

kostrzewa commented 5 years ago

I always flinch when I read “don't use exceptions”

Exception handling is great to gracefully and centrally handle undefined behaviour, but you need careful exception handling code, which we don't have, at all.

Throwing unhandled exceptions is the worst practice, in my opinion, because it results in issues such as this one where it becomes impossible to figure out where the problem occurred (without a debugger). Having terminated would have been much better. Relying on a debugger is not good for us in general because the issues that we face may occur on machines where running in the debugger causes significant human overhead or is just plain impossible (when interactive sessions don't work, for example, as has been the case for years on several of the machines that we've been using...)

The other issue with exceptions is their poor interaction with parallelism, but there is at least some work underway to have at least rudimentary support for exception handling in a parallel way (i.e., being able to figure out which task suffered an exception and being able to handle these centrally or locally, whatever is appropriate) ( https://arxiv.org/abs/1804.04481 )

That does not really help with all the other exceptions.

Almost all exceptions in the code should be runtime or invalid argument ones, so one would really just create two own exception classes. But I see your point if one doesn't want to implement own exception classes anyway...

What about just this here? Appears like best of both worlds:

Yes, that's probably a reasonable "cheap" solution which does not require proper exception handling which we simply can't afford to manage with the developer resources that we have. One should just make damn sure that the "what" strings include the function or object name such that things can be traced back to the source.

But I hate error codes with a passion.

Agreed, although the "what" string is nothing else but a fancy error code and the exception type is much like error codes of old.

martin-ueding commented 5 years ago

I have to think about this more, I thought that I had figured it out but now you challenged it 🤔.

Throwing unhandled exceptions is nice in Python because the interpreter will just catch all leftovers and print the stack trace, which is just really nice. In C++ I do not have that, but since I can use the debugger I get the same information with reasonable effort on my workstation. I get the full call stack and do not have to retrieve this information manually. And I can just enter the stack frames and look at all the variables. If we would just print something and exit with like std::exit(1), I would find it harder to point the debugger to the problematic part, therefore I like the uncaught exception because I do not need to set a breakpoint.

I haven't researched it, but I think that emitting a stack trace and then exiting the code would be the easiest. But inserting a breakpoint would not be exactly the same, because some low level function might only fail for some instances.

What would you think of printing the stack trace and re-throwing the exception?

kostrzewa commented 5 years ago

What would you think of printing the stack trace and re-throwing the exception?

Good luck doing that when many things end up inlined, as they should be for performance. That's one of the problems with exceptions in C++ specifically. Similarly, when you want to be exception-safe, you used to have to make everything copy-constructable. Luckily, move semantics have taken care of this now, but the code becomes more complicated as a result (std::move & co are not what I would call easy to use)

If we would just print something and exit with like std::exit(1), I would find it harder to point the debugger to the problematic part, therefore I like the uncaught exception because I do not need to set a breakpoint.

Agreed. One could add info like __LINE__ and __FILE__, but that's not portable (CVC uses this quite effectively, as long as one is on a compiler that knows the macros, good fun rewriting it all when a compiler comes along that does not know about them...).

Anyway, I like the idea of exceptions, but in practice I haven't seen a scientific code yet that handles them properly simply because of the effort involved and the type of motivation that scientific codes are written with (the goal is usually not to develop a system for solving problems, but to solve a particular problem, publish a paper or two and maybe, just maybe, be able to reuse the ideas and perhaps even the code at a later stage to solve other problems.)

And then there are horrors like the C++ HDF5 lib, which apparently uses exceptions for absolutely everything, even if it isn't exceptional?!

kostrzewa commented 5 years ago

One final thing that I wanted to say is about the situations in which we throw exceptions: I would say that the usefulness of exceptions is that one can handle them in a program and react to an error state.

However, in most of the exceptions that we throw here (or that I throw in my CVC fork), the intended "catcher" is not the exception handling code but the user, who has to read and understand that their input was incorrect and that, as a result, some particular part of the program failed to do what they wanted it to do.

The actual importance is thus in the error message and how it relates back to the input performed by the user.

This is of course not always the case and we do have exceptions which guard against programming errors or system problems, but in most cases we are handling inconsistent, incomplete or otherwise erroneous input....

martin-ueding commented 5 years ago

Good luck doing that when many things end up inlined, as they should be for performance.

My current approach is to have a debug and release build and just re-run the issue with the debug build. For the contraction code and the test cases this works reasonably quickly.

std::move & co are not what I would call easy to use

There is a nice talk by Scott Meyers who stresses that “move does not move” and by now I understand that it basically is rvalue_cast, but it takes a while.

but that's not portable

One could just use a macro for that, the assert works the same way. This way one can make it easily adaptable, though.

motivation that scientific codes are written with

The stuff that the previous PhD student did in their limited knowledge was all convoluted nonsense, so I am going to rewrite it all because I have so much more klout? And a year later I realize that the thing was not that easy after all and produce a dead end spaghetti code contraption.

However, in most of the exceptions that we throw here (or that I throw in my CVC fork), the intended "catcher" is not the exception handling code but the user, who has to read and understand that their input was incorrect and that, as a result, some particular part of the program failed to do what they wanted it to do.

Perfect, thanks for pointing this out. I got a bit lost in my fuzzy overlap of user and programmer. You are right! Take for instance the empty results in the operator*. This happens either due to a bug or more likely because there are just not enough random vectors. I eventually beefed up the exception's what in order to make sure that the user can resolve this issue without having to ask one of the developers.

Then I have this suggestion: We have some stop macro, which foremost prints a user oriented error message, like the note about insufficient random vectors. Then by virtue of being a macro also emits __FILE__ and __LINE__ and the git commit has and the git build status in a stanza “developer information”. The user can then either fix it themselves or go to one of the developers and bring some helpful information along.