STEllAR-GROUP / hpx

The C++ Standard Library for Parallelism and Concurrency
https://hpx.stellar-group.org
Boost Software License 1.0

Exception 'what' strings are lost when exceptions from decode_parcel are reported #304

Closed brycelelbach closed 12 years ago

brycelelbach commented 12 years ago

[reported by blelbach] [Trac time Fri Feb 3 20:27:43 2012] 2f6ef8319fd3192da671f9eca9d4da96253866b9, Boost 1.47.0, GCC 4.6.2, debug

This issue can be reproduced by running neutron_star distributed across two localities, with the following options:


Locality 1:

/path/to/neutron_star -0 -l2 --options-file /path/to/idstar --debug-hpx-log='file(hpx.0.log)'

Locality 2:

/path/to/neutron_star -1 -l2 --debug-hpx-log='file(hpx.1.log)'

A parcel deserialization error occurs, but the exception reported on stderr is incorrect: the what message says only "std::exception".


Locality 1 stdout/stderr:

[20:15:25]:wash@vega:/home/wash/hpx/gcc-4.6.2-debug$ bin/neutron_star -t4 -0 -l2 --options-file ../examples/neutron_star/idstar --debug-hpx-log='file(hpx.0.log)'

<snip>

[what]: std::exception
[function]: 
[file]: 
[line]: 0
[version]: V0.8.0-trunk (AGAS: V2.1), SVN: 6927:6928M
[boost]: V1.47.0
[build-type]: debug
[date]: Feb  3 2012 19:57:18
[platform]: linux
[compiler]: GNU C++ version 4.6.2
[stdlib]: GNU libstdc++ version 20120120

Aborted

Locality 2 stdout/stderr:

[20:15:43]:wash@vega:/home/wash/hpx/gcc-4.6.2-debug$ bin/neutron_star -1 -l2 --debug-hpx-log='file(hpx.1.log)'

<snip>

[what]: std::exception
[version]: V0.8.0-trunk (AGAS: V2.1), SVN: 6927:6928M
[boost]: V1.47.0
[build-type]: debug
[date]: Feb  3 2012 19:57:18
[platform]: linux
[compiler]: GNU C++ version 4.6.2
[stdlib]: GNU libstdc++ version 20120120

Aborted

Note that the two reports differ: locality 1 prints empty [function], [file] and [line] fields, while locality 2 omits them entirely.


Correct exception message in logs:

(T00000002/0000000001336450.04/----------------) P00000002/----------------.-- 20:16.29.792 [00000000000af0fb]   <error>  [PT] decode_parcel: caught std::exception: integer cannot be represented
(T00000002/0000000001336450.04/----------------) P00000002/----------------.-- 20:16.29.792 [00000000000af11b]  <always> [ERR] report_exception_and_terminate: unhandled exception: 
[what]: std::exception
[version]: V0.8.0-trunk (AGAS: V2.1), SVN: 6927:6928M
[boost]: V1.47.0
[build-type]: debug
[date]: Feb  3 2012 19:57:18
[platform]: linux
[compiler]: GNU C++ version 4.6.2
[stdlib]: GNU libstdc++ version 20120120

The logs lead me to believe that report_exception_and_terminate might be the culprit, but I haven't investigated this.

Initially I thought this was an exception serialization problem, but I believe I have patched that (75b130c6a79130488d1dc0a41ea1df2a533378a7), and this bug still occurs. The logs show the correct what message at the point where decode_parcel catches the exception.

brycelelbach commented 12 years ago

[comment by blelbach] [Trac time Tue Feb 28 06:32:34 2012] This turned out to be a slicing issue (i.e., we were implicitly copying the exception as a plain std::exception with boost::current_exception).
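
For readers unfamiliar with slicing, here is a minimal sketch of the general effect using only the standard library. It is not the HPX code and not the actual fix. The decode_error type and both function names are hypothetical, chosen only to mirror the message seen in the logs. Copying a caught exception into a std::exception value keeps only the base subobject, so the derived what() override is lost. Capturing it with std::current_exception and rethrowing preserves the dynamic type.

```cpp
#include <exception>
#include <string>
#include <utility>

// Hypothetical stand-in for the error raised while decoding a parcel.
class decode_error : public std::exception
{
    std::string msg_;

public:
    explicit decode_error(std::string msg) : msg_(std::move(msg)) {}

    char const* what() const noexcept override { return msg_.c_str(); }
};

// Sliced path: the caught exception is copied into a std::exception value.
std::string what_after_slicing()
{
    try {
        throw decode_error("integer cannot be represented");
    }
    catch (std::exception const& e) {
        // Copies only the std::exception base subobject; msg_ and the
        // derived what() override are dropped.
        std::exception sliced(e);
        // Implementation-defined default text, not the original message.
        return sliced.what();
    }
}

// Type-preserving path: capture the in-flight exception and rethrow it.
std::string what_after_exception_ptr()
{
    std::exception_ptr p;
    try {
        throw decode_error("integer cannot be represented");
    }
    catch (...) {
        p = std::current_exception();   // refers to the full decode_error
    }

    try {
        std::rethrow_exception(p);
    }
    catch (std::exception const& e) {
        return e.what();                // dispatches to decode_error::what()
    }
    return std::string();               // not reached
}
```

With libstdc++, the default std::exception::what() text is "std::exception", which matches the [what] field in the reports above. Other standard libraries may return a different default string.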

brycelelbach commented 12 years ago

[comment by blelbach] [Trac time Tue Feb 28 13:33:51 2012] r7199