IUPAC-InChI / InChI

Main InChI repository
MIT License
60 stars 7 forks source link

v1.07 no longer seems to work in multiple threads #39

Open greglandrum opened 1 month ago

greglandrum commented 1 month ago

I just tried to get v1.07 working within the RDKit and immediately ran into the problem that our tests which attempt to use the InChI code from multiple threads simultaneously now fail.

The failure mode is that I start getting bad values (empty InChIs) or a seg fault at some point during execution. This does not happen if I run the same test with the number of threads set to 1, and it still happens even if I call the RDKit's "MolBlockToInChI" function, which is a thin wrapper around MakeINCHIFromMolfileText(), so this doesn't seem to be something in the way we're using the InChI library. But, then again, it's multi-threaded stuff, so who knows?

Is there something extra that now needs to be done from client code in order to safely use the code from mulitple threads or is this a regression in the InChI code itself?

In case you're curious, the specific test that's failing is this one: https://github.com/rdkit/rdkit/blob/master/External/INCHI-API/test.cpp#L65

JanCBrammer commented 1 month ago

@greglandrum, thanks for the issue. Which InChI version was used to run those tests before / that didn't fail the multithreading? Looking at https://github.com/rdkit/rdkit/blob/3d972a9e2f9ef097a36c4cf09334e898c69f16c1/External/INCHI-API/download-inchi.sh#L55 it seems that v1.05 is the last known good version?

greglandrum commented 1 month ago

@JanCBrammer Yes, we use v1.05 now.

JanCBrammer commented 1 month ago

Any chance you could try with v1.06? I.e., we could start by bisecting versions before bisecting the commits from v1.06 to v1.07.

greglandrum commented 1 month ago

Any chance you could try with v1.06? I.e., we could start by bisecting versions before bisecting the commits from v1.06 to v1.07.

Sorry I mis-spoke: we are actually using v1.06, not v1.05

JanCBrammer commented 1 month ago

we are actually using v1.06, not v1.05

Nice, then we'll bisect the commits between v1.06 and v1.07 using https://github.com/rdkit/rdkit/blob/3d972a9e2f9ef097a36c4cf09334e898c69f16c1/External/INCHI-API/test.cpp#L65. Keeping you in the loop.

JanCBrammer commented 1 month ago

@greglandrum, I could replicate the issue as follows. When running RDBASE=$PWD/.. PYTHONPATH=$RDBASE LD_LIBRARY_PATH=$RDBASE/lib:$LD_LIBRARY_PATH ctest -R testInchi with v1.07 (git checkout v1.07.0) I get (across multiple runs)

Test project /workspaces/rdkit/build
    Start 1: testInchi
1/1 Test #1: testInchi ........................Subprocess aborted***Exception:   4.94 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) =   4.95 sec

The following tests FAILED:
      1 - testInchi (Subprocess aborted)
Errors while running CTest

or

Test project /workspaces/rdkit/build
    Start 1: testInchi
1/1 Test #1: testInchi ........................***Exception: SegFault  0.07 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) =   0.08 sec

The following tests FAILED:
      1 - testInchi (SEGFAULT)
Errors while running CTest

When running the tests with v1.06 (git checkout v1.06) I get

Test project /workspaces/rdkit/build
    Start 1: testInchi
1/1 Test #1: testInchi ........................   Passed    5.06 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) =   5.07 sec

Therefore, we know that the last known "good" commit G is the one that's tagged with v1.06 and we know that commit B tagged with v1.07.0 is "bad". Now, I need to find the first commit B' between G and B that's bad. B' should contain the change that breaks the tests.

Following this rationale, using git bisect, I identified B' to be https://github.com/IUPAC-InChI/InChI/commit/66f42acbe9e29270a06e1c34307d12527c4ba399. I.e., running the tests after git checkout 66f42acbe9e29270a06e1c34307d12527c4ba399 results in

Test project /workspaces/rdkit/build
    Start 1: testInchi
1/1 Test #1: testInchi ........................Subprocess aborted***Exception:   0.10 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) =   0.11 sec

The following tests FAILED:
          1 - testInchi (Subprocess aborted)
Errors while running CTest

@djb-rwth, could you verify this?

greglandrum commented 1 month ago

@JanCBrammer thanks for tracking the commit down. It looks like that commit introduces some global variables. I'm going to guess those are what are causing the problem with thread safety.

djb-rwth commented 1 month ago

Hi @greglandrum, Sorry for a slightly belated response.

@JanCBrammer's findings might be a good starting point in resolving this issue; however many parts of the code which belong to these past commits have been deleted and/or altered in the meantime.

When I first read this post, I thought that InChI multi-threading support in RDKit has been provided through libinchi API which had been compiled using Intel® oneAPI Threading Building Blocks, as a basic framework for OneTBB has been added to InChI since v.1.06; obviously, this is not the case according files in rdkit/External/INCHI-API/ folder as they use CMake to compile InChI from source.

Your assumption regarding the compromised thread safety related to use global variables makes sense, but I would still need to set an environment which can compile test.cpp. Please let me know if I should compile RDKit from source, and if the procedure is similar to compiling OpenCV from source.

This is essential for rewriting some parts of the InChI code, so they comply to reentrant and/or threadsafe rules.

Implementing parallel programming into InChI would be much better (and a personal aspiration), but it looks as if some other directions in development have bigger priority at the moment.

JanCBrammer commented 1 month ago

set an environment which can compile test.cpp. Please let me know if I should compile RDKit from source, and if the procedure is similar to compiling OpenCV from source.

@djb-rwth, see https://www.rdkit.org/docs/Install.html regarding RDKit compilation.

greglandrum commented 1 month ago

Actually, this version is a bit more compact and, I think, up to date: https://greglandrum.github.io/rdkit-blog/posts/2023-03-17-setting-up-a-cxx-dev-env2.html

greglandrum commented 1 month ago

When I first read this post, I thought that InChI multi-threading support in RDKit has been provided through libinchi API which had been compiled using Intel® oneAPI Threading Building Blocks, as a basic framework for OneTBB has been added to InChI since v.1.06; obviously, this is not the case according files in rdkit/External/INCHI-API/ folder as they use CMake to compile InChI from source.

To clarify my perspective: The RDKit just needs to be able to call into the InChI code multiple times from different threads; this used to be possible. I don't need the InChI library to provide this and definitely have no interest in adding a dependency on Intel's TBB

Your assumption regarding the compromised thread safety related to use global variables makes sense, but I would still need to set an environment which can compile test.cpp. Please let me know if I should compile RDKit from source, and if the procedure is similar to compiling OpenCV from source.

This is essential for rewriting some parts of the InChI code, so they comply to reentrant and/or threadsafe rules.

Implementing parallel programming into InChI would be much better (and a personal aspiration), but it looks as if some other directions in development have bigger priority at the moment.

My two cents: I don't see much value in having a parallel version of the InChI code itself. I think the highest value the InChI team can provide is improvements to the efficiency and robustness of the existing library

djb-rwth commented 1 month ago

Actually, this version is a bit more compact and, I think, up to date: https://greglandrum.github.io/rdkit-blog/posts/2023-03-17-setting-up-a-cxx-dev-env2.html

Hi @greglandrum, Thanks for the link. My current priority is to get test.cpp compiled in Visual Studio 2022 on Windows so I could try to resolve the multi-threaded issue. Please let me know if the above mentioned installation is able to provide this.

Two notes:

greglandrum commented 1 month ago

Hi @djb-rwth here's an extremely minimal set of instructions for getting a build environment working for the InChI test on Windows. I'm assuming that you have some type of conda install (if not, install and configure miniforge here: https://github.com/conda-forge/miniforge)

First setup and activate a conda environment with the pre-requisites you need. I run these commands in a git bash shell:

conda create -n py310_rdkit_build -c conda-forge python=3.10 boost-cpp boost libboost libboost-devel cmake
conda activate py310_rdkit_build

Do the following in that same shell:

I just tried this on my laptop and it worked... hopefully it will also work for you.

djb-rwth commented 1 month ago

Hi @djb-rwth here's an extremely minimal set of instructions for getting a build environment working for the InChI test on Windows.

Thanks so much for this @greglandrum. I will try to do this ASAP.

giallu commented 4 weeks ago
  • cd to the build directory
  • cmake -DRDK_BUILD_PYTHON_WRAPPERS=OFF -DRDK_BUILD_INCHI_SUPPORT=ON -DRDK_BUILD_FREETYPE_SUPPORT=OFF ../
  • That should run without errors (though it will spit out some warnings)
  • Now open the solution file it produced in Visual Studio and build the testInchi target (there's probably some way to do this from the command line, but I don't know what it is for Windows). you probably want to switch the build configuration to "Release" first

I believe that would be cmake --build --target testinchi

greglandrum commented 4 weeks ago

I believe that would be cmake --build --target testinchi

I was not able to make that work.

giallu commented 4 weeks ago

I was not able to make that work.

You are right, I just realized the Visual Studio generator is a multi-config one so --target can't work there.

janholstjensen commented 1 day ago

I can reproduce the issue with test code that I have setup to make the Reaction InChI C API thread-safe. InChI 1.0.6.0 works, while 1.0.7.0 fails.

In some cases the InChI library starts returning inchi_Ret_ERROR and in other cases I see a segmentation fault. Feel free to reach out to me for alternative test code and additional testing of InChI concurrency.