mcodev31 / libmsym

molecular point group symmetry lib
MIT License
73 stars 33 forks source link

__repr__ not working for SymmetryOperation #7

Open mcocdawc opened 7 years ago

mcocdawc commented 7 years ago

Hej Marcus,

I try to use your library for detecting pointgroups in my python module chemcoord. There is a bug in __repr__ of your SymmetryOperation class which can be reproduced with this code:

import chemcoord as cc

molecule = cc.Cartesian.read_xyz('MIL53_small.xyz', start_index=1)
def give_elements(molecule):
    return [msym.Element(name=molecule.loc[i, 'atom'], coordinates=molecule.loc[i, ['x', 'y', 'z']])
            for i in molecule.index]
with msym.Context(elements=give_elements(molecule)) as ctx:
    ctx.find_symmetry()
    symmetry_operations = ctx.symmetry_operations

symmetry_operations[0].__class__.__name__
OUT: 'SymmetryOperation'
symmetry_operations[0]._names
OUT: ['E', 'C', 'S', 'σ', 'i']
symmetry_operations[0].type
OUT: 414166088

The command symmetry_operations[0]._names[symmetry_operations[0].type] raises now an IndexError, which leads to the bug in __repr__. I gues that some byte offset from you C-code has to be calculated into symmetry_operations[0].type.

mcodev31 commented 7 years ago

Hi

I tried to reproduce this, but failed, using the following:

virtualenv -p python3 chemcoord_libmsym
cd chemcoord_libmsym
source bin/activate
git clone https://github.com/mcocdawc/chemcoord.git
git clone https://github.com/mcodev31/libmsym.git
pip install numpy scipy pandas numba sortedcontainers sympy six
cd $VIRTUAL_ENV/libmsym
mkdir build
cd build
cmake -DMSYM_BUILD_PYTHON:BOOL=ON -DBUILD_SHARED_LIBS:BOOL=ON -DCMAKE_INSTALL_PREFIX=$VIRTUAL_ENV/libmsym ../.
make install
cd bindings/python
python setup.py install
cd $VIRTUAL_ENV/chemcoord
python setup.py install
cd $VIRTUAL_ENV
cat > bug.py<<EOF
import chemcoord as cc
import libmsym as msym
molecule = cc.Cartesian.read_xyz('chemcoord/tests/structures/MIL53_small.xyz', start_index=1)
def give_elements(molecule):
    return [msym.Element(name=molecule.loc[i, 'atom'], coordinates=molecule.loc[i, ['x', 'y', 'z']])
            for i in molecule.index]
with msym.Context(elements=give_elements(molecule)) as ctx:
    ctx.find_symmetry()
    symmetry_operations = ctx.symmetry_operations

    print(symmetry_operations[0].__class__.__name__)
    print(symmetry_operations[0]._names)
    print(symmetry_operations[0].type)
    print(ctx.point_group)
    for sop in symmetry_operations:
        print(sop)
EOF

python bug.py

results in:

SymmetryOperation
['E', 'C', 'S', 'σ', 'i']
0
C2v
libmsym.libmsym.SymmetryOperation( E, conjugacy class: 0 )
libmsym.libmsym.SymmetryOperation( C2^1 around [0.0, 0.0, 1.0], conjugacy class: 1 )
libmsym.libmsym.SymmetryOperation( σv with normal vector [1.0, 0.0, 0.0], conjugacy class: 2 )
libmsym.libmsym.SymmetryOperation( σd with normal vector [6.123233995736766e-17, 1.0, 0.0], conjugacy class: 3 )

The SymmetryOperation.type field comes from the ctype and is an enum:

        enum _msym_symmetry_operation_type {
            MSYM_SYMMETRY_OPERATION_TYPE_IDENTITY = 0,
            MSYM_SYMMETRY_OPERATION_TYPE_PROPER_ROTATION = 1,
            MSYM_SYMMETRY_OPERATION_TYPE_IMPROPER_ROTATION = 2,
            MSYM_SYMMETRY_OPERATION_TYPE_REFLECTION = 3,
            MSYM_SYMMETRY_OPERATION_TYPE_INVERSION = 4
        } type;

which, given your result, indicates uninitialised memory. generateSymmetryOperations uses calloc so that should not cause this issue (although e.g. sorting could mess that up) so more likely the memory has been reused (memory freed?).

Could you provide operating system and c compiler info? And in case you can consistently reproduce this, what are the values of:

symmetry_operations[0].order
symmetry_operations[0].power
symmetry_operations[0].orientation
symmetry_operations[0].conjugacy_class

symmetry_operations[1].type
mcocdawc commented 7 years ago

Is it possible for you to use conda for managing virtual environments? Because pip had problems to install numba in my version.

The bug appears, if one is working outside the context manager.

ENV=chemcoord_libmsym
TEST_DIR=$(pwd)/$ENV
conda create -y --name $ENV
source activate $ENV
mkdir $TEST_DIR

cd $TEST_DIR
git clone https://github.com/mcocdawc/chemcoord.git
git clone https://github.com/mcodev31/libmsym.git
conda install -y numpy scipy pandas numba sortedcontainers sympy six
cd $TEST_DIR/libmsym
mkdir build
cd build
cmake -DMSYM_BUILD_PYTHON:BOOL=ON -DBUILD_SHARED_LIBS:BOOL=ON -DCMAKE_INSTALL_PREFIX=$TEST_DIR/libmsym ../.
make install
cd bindings/python
pip install .
cd $TEST_DIR/chemcoord
pip install .
cd $TEST_DIR

cat > bug1.py<<EOF
import chemcoord as cc
import libmsym as msym
molecule = cc.Cartesian.read_xyz('chemcoord/tests/structures/MIL53_small.xyz', start_index=1)
def give_elements(molecule):
    return [msym.Element(name=molecule.loc[i, 'atom'], coordinates=molecule.loc[i, ['x', 'y', 'z']])
            for i in molecule.index]
with msym.Context(elements=give_elements(molecule)) as ctx:
    ctx.find_symmetry()
    symmetry_operations = ctx.symmetry_operations

    print(symmetry_operations[0].__class__.__name__)
    print(symmetry_operations[0]._names)
    print(symmetry_operations[0].type)
    print(ctx.point_group)
    for sop in symmetry_operations:
        print(sop)
EOF

python bug1.py

cat > bug2.py<<EOF
import chemcoord as cc
import libmsym as msym
molecule = cc.Cartesian.read_xyz('chemcoord/tests/structures/MIL53_small.xyz', start_index=1)
def give_elements(molecule):
    return [msym.Element(name=molecule.loc[i, 'atom'], coordinates=molecule.loc[i, ['x', 'y', 'z']])
            for i in molecule.index]
with msym.Context(elements=give_elements(molecule)) as ctx:
    ctx.find_symmetry()
    symmetry_operations = ctx.symmetry_operations

print(symmetry_operations[0].__class__.__name__)
print(symmetry_operations[0]._names)
print(symmetry_operations[0].type)
print(ctx.point_group)
for sop in symmetry_operations:
    print(sop)
EOF

python bug2.py

cat > bug3.py<<EOF
import chemcoord as cc
import libmsym as msym
molecule = cc.Cartesian.read_xyz('chemcoord/tests/structures/MIL53_small.xyz', start_index=1)
def give_elements(molecule):
    return [msym.Element(name=molecule.loc[i, 'atom'], coordinates=molecule.loc[i, ['x', 'y', 'z']])
            for i in molecule.index]
with msym.Context(elements=give_elements(molecule)) as ctx:
    ctx.find_symmetry()
    symmetry_operations = ctx.symmetry_operations

print(symmetry_operations[0].__class__.__name__)
print(symmetry_operations[0]._names)
print(symmetry_operations[0].type)
print(ctx.point_group)
print(symmetry_operations[0].order)
print(symmetry_operations[0].power)
print(symmetry_operations[0].orientation)
print(symmetry_operations[0].conjugacy_class)
EOF

python bug3.py
source activate
conda remove -y --name $ENV --all

The outpot of bug1.py

SymmetryOperation
['E', 'C', 'S', 'σ', 'i']
0
C2v
libmsym.libmsym.SymmetryOperation(E, conjugacy_class: 0)
libmsym.libmsym.SymmetryOperation(C2^1 around [0.0, 0.0, 1.0], conjugacy_class: 1)
libmsym.libmsym.SymmetryOperation(σv with normal vector [1.0, 0.0, 0.0], conjugacy_class: 2)
libmsym.libmsym.SymmetryOperation(σd with normal vector [6.123233995736766e-17, 1.0, 0.0], conjugacy_class: 3)

The outpot of bug2.py

SymmetryOperation
['E', 'C', 'S', 'σ', 'i']
34657136
C2v
Traceback (most recent call last):
  File "bug2.py", line 16, in <module>
    print(sop)
  File "/home/oweser/.local/lib/python3.6/site-packages/libmsym/libmsym.py", line 89, in __str__
    __name__, self.__class__.__name__, self._names[self.type],
IndexError: list index out of range

The outpot of bug3.py

SymmetryOperation
['E', 'C', 'S', 'σ', 'i']
46807744
C2v
0
54674352
0
0
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609 No LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 16.04.2 LTS Release: 16.04 Codename: xenial Python 3.6.1 :: Anaconda custom (64-bit)
mcodev31 commented 7 years ago

Thanks, that helped.

I can't use conda on my current machine (need my current setup for work), but I'll set up a VM if needed. I reproduced the bug by calling gc.collect() outside the context manager (followed by reads).

Turns out simple fields of ctypes.Structure (c_int etc.) are also by reference. Since Context.__del__ destroys the C context (and frees the symmetry operation memory) the data is no longer valid outside the context manager.

This is annoying, but makes sense I guess. I'd rather avoid circular references, but perhaps I'll try weakref. Just using copy seems to work as well in most cases, but I'm not overly fond of that. You can use symmetry_operations = [copy(x) for x in ctx.symmetry_operations] as a temporary workaround though.

This affects more than just the SymmetryOperations so I can't fix it right away.

Also have some thoughts on reimplementing a version completely in python that can use some more interesting algorithms cumbersome to implement in C.

mcocdawc commented 7 years ago

Glad it helped and thanks for the workaround.

mcocdawc commented 7 years ago

Ok now another possible solution. What is your opininion about making copies, when setting the attributes of self. The following works:

    def _update_symmetry_operations(self, symmetry_operations):
        addresses = [addressof(sop) for sop in symmetry_operations]
        sops = [symmetry_operations[addresses.index(addressof(sop.contents))]
                for sop in self._sops[0:self._d]]
        self.symmetry_operations = copy.deepcopy(sops)
mcodev31 commented 7 years ago

Yeah, that's what I meant with

Just using copy seems to work as well in most cases

But it doesn't just apply to symmetry operations (basis functions, character table etc.)