epam / Indigo

Universal cheminformatics toolkit, utilities and database search tools
http://lifescience.opensource.epam.com
Apache License 2.0

Terminating with uncaught exception of type std::bad_cast: std::bad_cast #190

Open khyurri opened 4 years ago

khyurri commented 4 years ago

OS: macOS
Indigo Version: 1.4.0-beta.r0-ga8773211 mac10.7

How to reproduce:

Using rxnfiles/legio-amides.rdf from indigo tests:

for mol in session.iterateRDFile("rxnfiles/legio-amides.rdf"):
    print(mol)
    exact_matcher = bingo_conn.searchExact(mol, "")

Result:

2020-09-29 14:54:46 [INFO] 0 working on rxnfiles/legio-amides.rdf
<indigo.IndigoObject object at 0x7feb30178f90>
<indigo.IndigoObject object at 0x7feb301789d0>
<indigo.IndigoObject object at 0x7feb30178f90>
<indigo.IndigoObject object at 0x7feb301789d0>
<indigo.IndigoObject object at 0x7feb30178f90>
<indigo.IndigoObject object at 0x7feb301789d0>
<indigo.IndigoObject object at 0x7feb30178f90>
<indigo.IndigoObject object at 0x7feb301789d0>
<indigo.IndigoObject object at 0x7feb30178f90>
<indigo.IndigoObject object at 0x7feb3029a150>
libc++abi.dylib: terminating with uncaught exception of type std::bad_cast: std::bad_cast

khyurri commented 4 years ago

Same exception with files:

twall commented 2 years ago

I see a similar error, although I can't provide a specific input that triggers it. When using Python multiprocessing to create several NoSQL Bingo databases in parallel, I get the following:

terminate called after throwing an instance of 'std::out_of_range'
what():  array::at: __n (which is 120) >= _Nm (which is 119)

As a result, the multiprocessing parent process hangs while it waits for the terminated subprocess.

This happens when using ecfp* similarity, but not with sim or chem.

Fundamentally, the shared library needs to do a better job of catching all exceptions and translating them for the Python layer.

mkviatkovskii commented 2 years ago

Dear @twall, thanks for reporting this. Does it happen on the latest version of Indigo (1.6.1)? We've reworked the parallel processing code, and the original issue no longer reproduces.

As for exception handling, we are working on it. It's not trivial to provide proper C++ stack traces on all operating systems. In any case, the library should not terminate on an uncaught exception, so thanks for mentioning that.

twall commented 2 years ago

The uncaught exception is still present on python indigo 1.6.1.

I'll see if I can narrow down the specific input, but it's a little tricky b/c there are perhaps 2-3 failures out of several hundred million.

twall commented 2 years ago

It doesn't seem to be any particular input, but rather an interaction with python's multiprocessing. Still trying to track it down.

twall commented 2 years ago

This happens with as few as one process in the multiprocessing pool.

terminate called after throwing an instance of 'std::out_of_range'
  what():  array::at: __n (which is 120) >= _Nm (which is 119)

mkviatkovskii commented 2 years ago

Dear @twall, thanks for confirming that the issue still exists. I'll try to reproduce it using a multiprocessing pool.

twall commented 2 years ago

Either of the following encodings will trigger the error (but only with a similarity-type of ecfp2, ecfp4, ecfp6, or ecfp8):

smiles: *CC1CCCC1
inchi: InChI=1S/C6H11.Cn/c1-6-4-2-3-5-6;/h6H,1-5H2;

The molecule is loaded with indigo.loadMolecule, using the following Indigo options:

    "ignore-stereochemistry-errors": True,
    "standardize-charges": True,
    "standardize-keep-largest": True,
    "ignore-closing-bond-direction-mismatch": True,
    "ignore-bad-valence": True,
    "standardize-stereo": True,
    "standardize-neutralize-zwitterions": True,
    "standardize-clear-unusual-valences": True,
    "similarity-type": "ecfp2",
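
For anyone trying to reproduce this, the options and SMILES above could be combined roughly as follows. This is only a sketch: `build_database` is a hypothetical helper name, the Bingo calls (`Bingo.createDatabaseFile`, `insert`, `close`) are written from memory of the Indigo Python bindings, and the assumption that the crash surfaces at insert time is mine, not something confirmed in this thread.

```python
# Options quoted in this thread, applied one by one via Indigo.setOption.
OPTIONS = {
    "ignore-stereochemistry-errors": True,
    "standardize-charges": True,
    "standardize-keep-largest": True,
    "ignore-closing-bond-direction-mismatch": True,
    "ignore-bad-valence": True,
    "standardize-stereo": True,
    "standardize-neutralize-zwitterions": True,
    "standardize-clear-unusual-valences": True,
    "similarity-type": "ecfp2",
}
SMILES = "*CC1CCCC1"


def build_database(db_path):
    """Load SMILES with OPTIONS and insert it into a new Bingo NoSQL database.

    Hypothetical helper: if the bug is present, this may abort the whole
    interpreter instead of raising a Python exception.
    """
    # Imported lazily so this module can be read without Indigo installed.
    from indigo import Indigo
    from indigo.bingo import Bingo

    indigo = Indigo()
    for name, value in OPTIONS.items():
        indigo.setOption(name, value)
    mol = indigo.loadMolecule(SMILES)
    db = Bingo.createDatabaseFile(indigo, db_path, "molecule", "")
    try:
        db.insert(mol)
    finally:
        db.close()
```

Calling `build_database(tempfile.mkdtemp())` in a throwaway process would be the actual reproduction attempt; it is deliberately not run at import time here.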

Multiprocessing isn't the cause. The crash simply makes the Pool wait forever for results from the crashed processes (which is a bug in multiprocessing).
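
Until the library stops terminating the process, one possible workaround for the hang is to run each task in its own child process and check its exit status. A minimal standard-library sketch, where `os.abort()` stands in for the native crash and `run_isolated`, `CRASHING_TASK`, and `HEALTHY_TASK` are hypothetical names, not part of Indigo:

```python
import subprocess
import sys

# Stand-in for the crashing native call: os.abort() ends the process
# abnormally, much like an uncaught C++ exception reaching std::terminate.
CRASHING_TASK = "import os; os.abort()"
HEALTHY_TASK = "print('ok')"


def run_isolated(code, timeout=30.0):
    """Run `code` in a fresh interpreter and return (succeeded, stdout)."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # The child was killed after the timeout; report it as a failure.
        return False, ""
    # A non-zero return code (negative on POSIX when killed by a signal)
    # means the child died; the parent keeps running either way.
    return proc.returncode == 0, proc.stdout.strip()


if __name__ == "__main__":
    for name, code in [("healthy", HEALTHY_TASK), ("crashing", CRASHING_TASK)]:
        ok, out = run_isolated(code)
        print(name, "succeeded" if ok else "failed", out)
```

The standard library's `concurrent.futures.ProcessPoolExecutor` is documented to raise `BrokenProcessPool` when a worker terminates abruptly, so it may be a lighter-weight alternative to `multiprocessing.Pool` for this use case.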