mcocdawc / chemcoord

A python module for manipulating cartesian and internal coordinates.
GNU Lesser General Public License v3.0
72 stars 19 forks source link

Zmat.sort_index() returns pandas Dataframe instead of Zmat object #63

Closed timloose closed 4 years ago

timloose commented 4 years ago

Code Sample, a copy-pastable example if possible


  import chemcoord as cc

  small = cc.Cartesian.read_xyz('small.xyz').get_zmat()

  print(type(small))
  small_sorted = small.sort_index()
  print(type(small_sorted))    

Problem description

Above code returns: <class 'chemcoord.internal_coordinates.zmat_class_main.Zmat'> <class 'pandas.core.frame.DataFrame'>

This is not ideal, as you cannot use a sorted zmat object without re-making it a zmat.

Expected Output

<class 'chemcoord.internal_coordinates.zmat_class_main.Zmat'> <class 'chemcoord.internal_coordinates.zmat_class_main.Zmat'>

Output of cc.show_versions()

INSTALLED VERSIONS ------------------ python: 3.6.9.final.0 python-bits: 64 OS: Linux OS-release: 3.10.0-957.12.2.el7.x86_64 machine: x86_64 processor: x86_64 LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 chemcoord: 2.0.4 numpy: 1.17.3 scipy: 1.3.1 pandas: 0.25.2 numba: 0.47.0 sortedcontainers: 2.1.0 sympy: 1.5.1 pytest: None pip: 19.3.1 setuptools: 41.6.0.post20191030 IPython: None sphinx: None None
mcocdawc commented 4 years ago

sort_index is really just a wrapper around the pandas functionality and manipulates the underlying DataFrame directly. You could use zmat._frame.sort_index instead to get the same result. Since the index is changed without modifying the ['b', 'a', 'd'] columns you could get up with an undefined Z-matrix, so I definetely don't want to return a Zmat instance.

If you want to sort the index and rename ['b', 'a', 'd'] columns accordingly I would use something along

zmat.change_numbering(sorted(zmat.index))

This will return a valid Zmat.

PS: With the experience I have now, I would never make this method public, because it breaks encapsulation very hard. Honestly I was just too lazy to type zmat._frame.sort_index everytime when I wanted to sort the index and did not think about the impact. Calling it has the same implications as manipulating _frame directly, you have to know, what you are doing and break the abstraction layer.

timloose commented 4 years ago

Okay, thanks for the in depth response!