OpenBioSim / sire

Sire Molecular Simulations Framework
https://sire.openbiosim.org
GNU General Public License v3.0
41 stars 11 forks source link

Migrate rdkit-dev to librdkit-dev #217

Closed skearnes closed 3 months ago

skearnes commented 3 months ago

Hi, this is a friendly notification that the rdkit-dev conda-forge package has been replaced by librdkit-dev as of https://github.com/conda-forge/rdkit-feedstock/pull/158; please update your dependencies.

@lohedges thanks for inspiration to do this via github code search :)

lohedges commented 3 months ago

Many thanks. Currently working on this in #216.

lohedges commented 3 months ago

The switch to librdkit-dev seems to have worked for all platforms other than Windows, which ironically was the only one that still worked with rdkit-dev. We're currently getting runtime errors using a matched version of rdkit to the librdkit-dev that was used at build time. Unfortunately I can't see the exact issue from the CI logs, but will try to investigate further.

build:

2024-07-25T13:10:58.5407636Z     librdkit:                      2024.03.5-h4c5bbc8_1           conda-forge
2024-07-25T13:10:58.5408828Z     librdkit-dev:                  2024.03.5-h36c15f3_1           conda-forge
2024-07-25T13:16:33.7974170Z     rdkit:                         2024.03.5-py312h9d9823f_1      conda-forge

test:

2024-07-25T15:07:44.9229795Z     rdkit:                     2024.03.5-py312h9d9823f_1     conda-forge
skearnes commented 3 months ago

Thanks; I'll reopen the issue until we get that resolved.

skearnes commented 3 months ago

@lohedges could you send me a link to the CI log? If there's a specific header I can look for that would be helpful.

lohedges commented 3 months ago

The last working Windows Python 3.12 log is here. At build time it used:

2024-06-29T17:08:52.9966320Z     rdkit:                     2024.03.3-py312h7e6e52d_0    conda-forge
2024-06-29T17:08:52.9966882Z     rdkit-dev:                 2024.03.3-h36c15f3_0         conda-forge

The library and headers were correctly detected:

2024-06-29T17:38:32.1135002Z -- Looking for RDKit in C:/Users/runneradmin/miniconda3/envs/sire_build/conda-bld/sire_1719680378610/_h_env/incl>
2024-06-29T17:38:32.1139376Z -- Found RDKit libraries at C:/Users/runneradmin/miniconda3/envs/sire_build/conda-bld/sire_1719680378610/_h_env/>
2024-06-29T17:38:32.1150448Z -- Found RDKit library files at C:/Users/runneradmin/miniconda3/envs/sire_build/conda-bld/sire_1719680378610/_h_>
2024-06-29T17:38:32.1167780Z -- Found RDKit: C:/Users/runneradmin/miniconda3/envs/sire_build/conda-bld/sire_1719680378610/_build_env/Library/>
2024-06-29T17:38:32.1169742Z -- Compiling SireRDKit converter

For the tests it pulled in...

2024-06-29T18:51:06.3814956Z     rdkit:                     2024.03.3-py312h7e6e52d_0    conda-forge

and all the RDKit related tests passed, e.g.:

2024-06-29T18:53:43.7385564Z tests/convert/test_rdkit.py::test_rdkit[C1CCCCC1] PASSED                 [ 15%]
2024-06-29T18:53:43.7386523Z tests/convert/test_rdkit.py::test_rdkit[C] PASSED                        [ 15%]
2024-06-29T18:53:43.7387413Z tests/convert/test_rdkit.py::test_rdkit[OCC(O)C(O)C(O)C(O)CO] PASSED     [ 16%]
2024-06-29T18:53:43.7388302Z tests/convert/test_rdkit.py::test_rdkit[C[C@H](N)C(=O)O] PASSED          [ 16%]
2024-06-29T18:53:43.7389217Z tests/convert/test_rdkit.py::test_rdkit[C[C@@H](N)C(=O)O] PASSED         [ 16%]
2024-06-29T18:53:43.7390105Z tests/convert/test_rdkit.py::test_chirality PASSED                       [ 16%]
2024-06-29T18:53:43.7391014Z tests/convert/test_rdkit.py::test_hybridization PASSED                   [ 17%]
2024-06-29T18:53:43.7391915Z tests/convert/test_rdkit.py::test_hybridization2 PASSED                  [ 17%]
2024-06-29T18:53:43.7392818Z tests/convert/test_rdkit.py::test_rdkit_returns_null PASSED              [ 17%]
2024-06-29T18:53:43.7393688Z tests/convert/test_rdkit.py::test_rdkit_infer_bonds PASSED               [ 18%]
2024-06-29T18:53:43.7394600Z tests/convert/test_rdkit.py::test_rdkit_preserve_info PASSED             [ 18%]

The log from the most recent CI run, which now uses librdkit-dev for the build, is here. For the build this used:

2024-07-25T14:07:44.3219568Z     librdkit:                      2024.03.5-h4c5bbc8_1           conda-forge
2024-07-25T14:07:44.3220169Z     librdkit-dev:                  2024.03.5-h36c15f3_1           conda-forge
2024-07-25T14:11:09.9719140Z     rdkit:                         2024.03.5-py312h9d9823f_1      conda-forge

Again, the library and headers were detected just fine and there was no issue compiling the Sire-RDKit converter:

2024-07-25T14:44:25.9173370Z -- Looking for RDKit in C:/Users/runneradmin/miniconda3/envs/sire_build/conda-bld/sire_1721916105352/_h_env/include/rdkit, C:/Users/runneradmin/miniconda3/envs/sire_build/conda-bld/sire_1721916105352/_h_env/Library/include/rdkit and C:/Users/runneradmin/>
2024-07-25T14:44:25.9177130Z -- Found RDKit libraries at C:/Users/runneradmin/miniconda3/envs/sire_build/conda-bld/sire_1721916105352/_h_env/Library/lib
2024-07-25T14:44:25.9186799Z -- Found RDKit library files at C:/Users/runneradmin/miniconda3/envs/sire_build/conda-bld/sire_1721916105352/_h_env/Library/lib/RDKitGraphMol.lib;C:/Users/runneradmin/miniconda3/envs/sire_build/conda-bld/sire_1721916105352/_h_env/Library/lib/RDKitRDGener>
2024-07-25T14:44:25.9195680Z -- Found RDKit: C:/Users/runneradmin/miniconda3/envs/sire_build/conda-bld/sire_1721916105352/_h_env/Library/include/rdkit
2024-07-25T14:44:25.9196997Z -- Compiling SireRDKit converter
2024-07-25T14:44:25.9198332Z -- RDKIT_INCLUDE_DIR: C:/Users/runneradmin/miniconda3/envs/sire_build/conda-bld/sire_1721916105352/_h_env/Library/include/rdkit
2024-07-25T14:44:25.9207863Z -- RDKIT_LIBRARIES:   C:/Users/runneradmin/miniconda3/envs/sire_build/conda-bld/sire_1721916105352/_h_env/Library/lib/RDKitGraphMol.lib;C:/Users/runneradmin/miniconda3/envs/sire_build/conda-bld/sire_1721916105352/_h_env/Library/lib/RDKitRDGeneral.lib;C:/>

For the tests it pulled in...

2024-07-25T15:59:15.0885132Z     rdkit:                     2024.03.5-py312h9d9823f_1     conda-forge

and every RDKit related test failed:

2024-07-25T16:01:43.3692288Z tests/convert/test_rdkit.py::test_rdkit[C1CCCCC1] FAILED                 [ 15%]
2024-07-25T16:01:43.3693071Z tests/convert/test_rdkit.py::test_rdkit[C] FAILED                        [ 15%]
2024-07-25T16:01:43.3693857Z tests/convert/test_rdkit.py::test_rdkit[OCC(O)C(O)C(O)C(O)CO] FAILED     [ 16%]
2024-07-25T16:01:43.3694704Z tests/convert/test_rdkit.py::test_rdkit[C[C@H](N)C(=O)O] FAILED          [ 16%]
2024-07-25T16:01:43.3695489Z tests/convert/test_rdkit.py::test_rdkit[C[C@@H](N)C(=O)O] FAILED         [ 16%]
2024-07-25T16:01:43.3696270Z tests/convert/test_rdkit.py::test_chirality FAILED                       [ 16%]
2024-07-25T16:01:43.3697111Z tests/convert/test_rdkit.py::test_hybridization FAILED                   [ 17%]
2024-07-25T16:01:43.3697906Z tests/convert/test_rdkit.py::test_hybridization2 FAILED                  [ 17%]
2024-07-25T16:01:43.3698730Z tests/convert/test_rdkit.py::test_rdkit_returns_null FAILED              [ 17%]
2024-07-25T16:01:43.3699528Z tests/convert/test_rdkit.py::test_rdkit_infer_bonds FAILED               [ 18%]
2024-07-25T16:01:43.3700331Z tests/convert/test_rdkit.py::test_rdkit_preserve_info FAILED             [ 18%]
2024-07-25T16:01:43.3701115Z tests/convert/test_smarts.py::test_smarts FAILED                         [ 18%]

These all failed due to the the following:

2024-07-25T16:01:54.7649225Z E       TypeError: No to_python (by-value) converter found for C++ type: class boost::shared_ptr<class RDKit::ROMol>

I'll check to see if this is an issue with the Python wrappers at our end. These are registered as follows:

        register_list<QList<RDKit::ROMOL_SPTR>>();

        // make sure we have at least one registration of the
        // smart pointer wrapper to ROMOL_SPTR
        const auto info = boost::python::type_id<RDKit::ROMOL_SPTR>();
        const auto *reg = bp::converter::registry::query(info);
        if (reg == NULL)
        {
                bp::register_ptr_to_python<RDKit::ROMOL_SPTR>();
        }

(This hasn't changed for 9 months and seemingly worked okay with the old approach. Could potentially be a change to Boost::Python?)

lohedges commented 3 months ago

For reference, the code for the Sire-RDKit converter can be found here.

skearnes commented 3 months ago

Thanks; created a draft PR so I can test some things.

skearnes commented 3 months ago

@lohedges looks like I need you to approve the actions run?

lohedges commented 3 months ago

Should we not be using librdkit-dev, as I have done in #216 onwards? I understand that the new alias should work, but it would be to understand why using librdkit-dev fails (at runtime) on Windows.

Happy to let the CI run regardless as it will be interesting to see if the alias works.

lohedges commented 3 months ago

Just to comment on the CI failures. For Linux and macOS the following happens:

-- Looking for RDKit in $PREFIX/include/rdkit, $PREFIX/Library/include/rdkit and $BUILD_PREFIX/Library/include/rdkit
-- Found RDKit libraries at $PREFIX/lib
-- Found RDKit library files at $PREFIX/lib/libRDKitGraphMol.so;$PREFIX/lib/libRDKitRDGeneral.so;$PREFIX/lib/libRDKitSmilesParse.so;$PREFIX/lib/libRDKitRDGeometryLib.so;$PREFIX/lib/libRDKitDistGeomHelpers.so;$PREFIX/lib/libRDKitForceField.so;$PREFIX/lib/libRDKitForceFieldHelpers.so;$PREFIX/lib/libRDKitSubstructMatch.so
-- Could NOT find RDKit (missing: RDKIT_INCLUDE_DIR) 

As before, the library files are found, but the headers aren't. Looks like it's still not pulling in librdkit-dev, e.g. from one of the Linux logs:

2024-07-29T21:04:44.7242936Z     librdkit:                         2024.03.5-h79cfef2_2                   conda-forge   
2024-07-29T21:04:44.7312365Z     rdkit:                            2024.03.5-py310h57e35d3_2              conda-forge         

The pass for the Windows build is spurious since RDKit also wasn't found. This time, both the libraries and the headers:

2024-07-29T21:47:37.5436968Z -- Looking for RDKit in C:/Users/runneradmin/miniconda3/envs/sire_build/conda-bld/sire_1722287093358/_h_env/incl>
2024-07-29T21:47:37.5439747Z -- Could NOT find RDKit (missing: RDKIT_LIBRARIES)

In this case it skips building the SireRDKit converter, so the RDKit related tests are also skipped, e.g.:

2024-07-29T23:00:45.9827124Z tests/convert/test_rdkit.py::test_rdkit[C1CCCCC1] SKIPPED (rdkit support
2024-07-29T23:00:45.9827584Z is not available)     

(The tests are still run on Linux and macOS for some reason.)

skearnes commented 3 months ago

That's really bizarre that librdkit-dev isn't being pulled in. It also looks like the rdkit-dev version is older than the alias package (the build is from before the librdkit PR was merged). I don't know why the latest build isn't being used. I updated my PR to pin to an earlier version to try to unblock you while we figure this out.

skearnes commented 3 months ago

This is really weird:

2024-07-29T21:06:19.2103990Z     rdkit:                     2024.03.5-py311h8a4e316_2  conda-forge
2024-07-29T21:06:19.2104310Z     rdkit-dev:                 2024.03.5-h0dfc388_0       conda-forge

It's pulling in packages from two different build numbers (2 and 0)