trevorgokey / besmarts

A toolkit for data-driven force field design based on binary-encoded SMARTS
https://besmarts.readthedocs.io
MIT License
3 stars 2 forks source link

[JOSS REVIEW] Comments #1

Open AntObi opened 6 months ago

AntObi commented 6 months ago

Hi @trevorgokey , I'm using the issue tracker here to point out what I have found from my review openjournals/joss-reviews#6653 .

Installation

Installation instructions in the README of the repository work fine, but the installation instructions in the Readthedocs does not work. There are currently no [PyPI packages] (https://pypi.org/search/?q=besmarts) related to your packages.

Testing

I've ran the tests using pytest and they all 'pass'. However, I recommend that in line 34 of test_branch_split.py, https://github.com/trevorgokey/besmarts/blob/02a8c4c7810d5f00836747975d10939d02460aad/besmarts-core/python/tests/test_branch_split.py#L34 you rename the function test to something that does not have test in its name as this causes pytest to assume that it is a function to be tested. You can see the output of running pytest in the expandable section below.

### Test output from running `pytest -v` ``` ============================= test session starts ============================= platform win32 -- Python 3.12.3, pytest-8.1.1, pluggy-1.5.0 -- C:\Users\AntObi\anaconda3\envs\besmarts\python.exe cachedir: .pytest_cache rootdir: C:\Users\AntObi\OneDrive - Imperial College London\Postdoc\joss_reviews\besmarts-git plugins: anyio-4.3.0 collecting ... collected 24 items besmarts-core/python/tests/test_arrays.py::test_bitvec::test_bitvec_bitwise_and PASSED [ 4%] besmarts-core/python/tests/test_arrays.py::test_bitvec::test_bitvec_bitwise_equal PASSED [ 8%] besmarts-core/python/tests/test_arrays.py::test_bitvec::test_bitvec_bitwise_in PASSED [ 12%] besmarts-core/python/tests/test_arrays.py::test_bitvec::test_bitvec_bitwise_not PASSED [ 16%] besmarts-core/python/tests/test_arrays.py::test_bitvec::test_bitvec_bitwise_not_in PASSED [ 20%] besmarts-core/python/tests/test_arrays.py::test_bitvec::test_bitvec_bitwise_or PASSED [ 25%] besmarts-core/python/tests/test_arrays.py::test_bitvec::test_bitvec_bitwise_xor PASSED [ 29%] besmarts-core/python/tests/test_arrays.py::test_bitvec::test_bitvec_bool_set PASSED [ 33%] besmarts-core/python/tests/test_arrays.py::test_bitvec::test_bitvec_int_set PASSED [ 37%] besmarts-core/python/tests/test_arrays.py::test_bitvec::test_bitvec_on_off PASSED [ 41%] besmarts-core/python/tests/test_arrays.py::test_intvec::test_intvec_indexing PASSED [ 45%] besmarts-core/python/tests/test_branch_split.py::test_branch_split::test_atom_branch_one PASSED [ 50%] besmarts-core/python/tests/test_branch_split.py::test_branch_split::test_atom_branch_two PASSED [ 54%] besmarts-core/python/tests/test_branch_split.py::test ERROR [ 58%] besmarts-core/python/tests/test_extender.py::test_extender::test_extender PASSED [ 62%] besmarts-core/python/tests/test_import.py::test_import::test_import_assign PASSED [ 66%] besmarts-core/python/tests/test_import.py::test_import::test_import_cluster PASSED [ 70%] besmarts-core/python/tests/test_import.py::test_import::test_import_codecs PASSED [ 75%] besmarts-core/python/tests/test_import.py::test_import::test_import_core PASSED [ 79%] besmarts-core/python/tests/test_import.py::test_import::test_import_mechanics PASSED [ 83%] besmarts-core/python/tests/test_intvec.py::test_intvec::test_intvec PASSED [ 87%] besmarts-core/python/tests/test_io.py::test_io::test_io PASSED [ 91%] besmarts-core/python/tests/test_mapping.py::test_mapping::test_mapping PASSED [ 95%] besmarts-core/python/tests/test_match.py::test_match::test_match PASSED [100%] =================================== ERRORS ==================================== ___________________________ ERROR at setup of test ____________________________ file C:\Users\AntObi\OneDrive - Imperial College London\Postdoc\joss_reviews\besmarts-git\besmarts-core\python\tests\test_branch_split.py, line 34 def test(self, smiles, branches, depth): E fixture 'self' not found > available fixtures: anyio_backend, anyio_backend_name, anyio_backend_options, cache, capfd, capfdbinary, caplog, capsys, capsysbinary, doctest_namespace, monkeypatch, pytestconfig, record_property, record_testsuite_property, record_xml_attribute, recwarn, tmp_path, tmp_path_factory, tmpdir, tmpdir_factory > use 'pytest --fixtures [testpath]' for help on them. C:\Users\AntObi\OneDrive - Imperial College London\Postdoc\joss_reviews\besmarts-git\besmarts-core\python\tests\test_branch_split.py:34 =========================== short test summary info =========================== ERROR besmarts-core/python/tests/test_branch_split.py::test ========================= 23 passed, 1 error in 1.10s ========================= ```

Examples

besmarts-core/python/examples/

My configuration for running these examples was as follows:

File Runs ? Notes
bond_union.py Yes propane.bg isn't on the joss-manuscript branch, but it is on the main branch.
forcefield_fit.py No Requires installation of besmarts-scipy. Fails*
matching.py Yes
smarts_cluster_bond_lengths.py Yes
smarts_split.py Yes
smiles_io.py Yes

* forcefield_fit.py and besmarts-scipy/python/examples/optimize_positions.py fail despite besmarts-scipy being installed with the following error:

Traceback (most recent call last):
  File "/home/anthony/besmarts-git/besmarts-core/python/examples/forcefield_fit.py", line 27, in <module>
    from besmarts.mechanics import optimizers_scipy
ImportError: cannot import name 'optimizers_scipy' from 'besmarts.mechanics' (unknown location)

Usage - RTD

In the documentation I have found areas in which the examples do not work.

Graphs

     print(indices)
 for indices in graphs.graph_impropers(g):
    print(indices)```
* `print(geometry.bond(idx[::-1]))` fails: `TypeError: 'int' object is not subscriptable`
* The following lines do not work:

```for struct in graphs.graph_to_structure_dihedrals(g):
    print(gcd.smarts_encode(struct))

for struct in graphs.graph_to_structure_impropers(g):
    print(gcd.smarts_encode(struct))
from besmarts.codecs.codec_rdkit import graph_codec_rdkit
from besmarts.core import graphs
from besmarts.core import mapper
from besmarts.core import configs

g = 'CCC=O'
g = gcd.smiles_decode(g)

atoms = graphs.graph_to_structure_atoms(g):

min_depth = 2
max_depth = 2
hydrogen = True
config = configs.smarts_extender_config(min_depth, max_depth, hydrogen)
modified = mapper.mapper_smarts_extend(config, bonds)

domain = atoms[0]
T = mapper.structure_mapper(domain)
for atom in atoms:
    T.add(atom)
C = mapper.mapper_union(T)

Labelling

Splitting

Clustering

Datasets

Documentation

From a brief scan of the codebase, many of the classes and the functions require documentation to make it easy for an external user to make good use of the modules available in this package.

trevorgokey commented 6 months ago

Hi @AntObi, thanks for taking such a detailed look. I have since given the documentation a nearly complete overhaul, and most what didn't work or was out of date was removed. The current documentation [https://besmarts.readthedocs.io/en/latest/index.html]() should be up to date and all examples and code snippets should be working.

I'll give a broken down response below:

Installation

The PyPI instructions were outdated and based on old software that this package has superseded. I have removed this reference for now to reflect the current state of the package, and will add it again once the package is up on PyPI.

Testing

Thanks for catching this! I built the tests using the unittest module in the standard python library, which didn't find an issue with this. I made a name change though, so I hope this will make the tests pytest friendly!

Examples

Thanks for giving these a spin. I've fixed the bond_union.py example to work properly. As for the installation issue, there was a typo in the setup.py and was pointing at the wrong folder. This means installing was a success but the wrong files were installed. I've fixed this and the examples should be working.

Usage

Graphs

Import errors and other bugs have been fixed. Some code which has been indicated here (in particular mapper.structure_mapper) has not been updated/tested thoroughly and needs to be fixed. It is not essential to any critical code in the examples and therefore I have removed it from the documentation until it is better developed and tested.

Labeling

There is now an example showing how to label a molecule.

Splitting

Should be fixed and working now.

Clustering

Should be fixed and working now.

Dataset

Ah, good catch. I've since moved this functionally to a difference branch and forgot to remove this page. I plan to further work on this later since it is not essential for the main purpose of this package

AntObi commented 2 months ago

Hi @trevorgokey ,

I've taken another look at the repository and have some further issues I've spotted

Installation

Currently, the installation in a fresh environment fails for me, which I have identified the causes.

For the setup.py for besmarts-rdkit, please remove besmarts.perceptron as besmarts-rdkit/python/besmarts/perceptron does not exist. It prevents the user from installing besmarts-rdkit: https://github.com/trevorgokey/besmarts/blob/93c9ac99344ada317050f566b7b2689a455750c5/besmarts-rdkit/python/setup.py#L17-L22

As well for besmarts-scipy, please remove besmarts.core in the setup.py file. https://github.com/trevorgokey/besmarts/blob/93c9ac99344ada317050f566b7b2689a455750c5/besmarts-scipy/python/setup.py#L18-L22

Tests

Currently 22/23 tests pass. The test test_import_core in besmarts-core/python/tests/test_import.py fails with an ImportError ImportError: cannot import name 'tree_merge' from 'besmarts.core' (unknown location). I've searched the codebase and tree_merge does not exist. Please remove tree_merge from the lines below.

https://github.com/trevorgokey/besmarts/blob/93c9ac99344ada317050f566b7b2689a455750c5/besmarts-core/python/tests/test_import.py#L10-L34

Examples

besmarts-core/python/examples/

My configuration for running these examples was as follows:

In besmarts-core/python/besmarts/mechanics/fits.py the following import statement exists https://github.com/trevorgokey/besmarts/blob/93c9ac99344ada317050f566b7b2689a455750c5/besmarts-core/python/besmarts/mechanics/fits.py#L35

array_numpy does not appear to exist in besmarts.core which is causing all the examples that use from besmarts.mechanics import fits to fail for me. These examples would be examples 07, 09, and 11. As this import statement is also in besmarts-scipy/python/besmarts/mechanics/optimizers_scipy.py https://github.com/trevorgokey/besmarts/blob/93c9ac99344ada317050f566b7b2689a455750c5/besmarts-scipy/python/besmarts/mechanics/optimizers_scipy.py#L20 Example 10 also fails because of this.

Example 4 is failing for me because of this line: https://github.com/trevorgokey/besmarts/blob/93c9ac99344ada317050f566b7b2689a455750c5/besmarts-core/python/examples/04_init_bonds_angles.py#L11

perception_rdkit does not appear to exist in this codebase.

I will follow up by testing the examples in the documentation as well as elsewhere in the repository where I find them

trevorgokey commented 2 months ago

It turned out I had a lot of local files that were not commited to the repo so you got a lot of missing file issues. This should all be fixed. Thanks for your patience, persistence, and diligence here.