rs-station / reciprocalspaceship

Tools for exploring reciprocal space
https://rs-station.github.io/reciprocalspaceship/
MIT License

create empty DataSet from another one? #39

Closed by DHekstra 3 years ago

DHekstra commented 3 years ago

Is there a way to do this for rs DataSets? https://stackoverflow.com/questions/18176933/create-an-empty-data-frame-with-index-from-another-data-frame

JBGreisman commented 3 years ago

There are a few ways to accomplish this -- let's assume we have an rs.DataSet that looks like the following:

> ds
            FMODEL   PHIFMODEL
H  K  L                       
45 12 14  9.349418  -48.853035
37 10 13 17.603436 -126.121925
39 2  33 24.622107   -68.23137
14 1  37 214.67409  -46.876408
35 17 23  34.46986   86.572845

A simple solution would be this:

> ds2 = ds[[]].copy()
> ds2
Empty DataSet
Columns: []
Index: [(45, 12, 14), (37, 10, 13), (39, 2, 33), (14, 1, 37), (35, 17, 23)]

It is also possible to directly initialize one rs.DataSet with another:

> ds2 = rs.DataSet(ds[[]])
> ds2
Empty DataSet
Columns: []
Index: [(45, 12, 14), (37, 10, 13), (39, 2, 33), (14, 1, 37), (35, 17, 23)]

rs.DataSet is a subclass of pd.DataFrame, so most pandas solutions should work in reciprocalspaceship. In my mind, this is a significant advantage of that design choice, because it means you can usually just follow the advice in Stack Overflow answers about pandas.
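As a rough illustration of that point, here are three common pandas idioms for getting an empty frame that keeps the original index. This sketch uses a plain pd.DataFrame as a stand-in, with values copied from the example above. It only checks the index and column behavior in pandas; it does not verify how each idiom treats rs.DataSet attributes.

```python
import pandas as pd

# Plain pandas stand-in for the example DataSet (first three reflections only)
index = pd.MultiIndex.from_tuples(
    [(45, 12, 14), (37, 10, 13), (39, 2, 33)], names=["H", "K", "L"]
)
df = pd.DataFrame(
    {
        "FMODEL": [9.349418, 17.603436, 24.622107],
        "PHIFMODEL": [-48.853035, -126.121925, -68.23137],
    },
    index=index,
)

# Three ways to keep the index while dropping every column
empty_a = df[[]].copy()                # select an empty list of columns
empty_b = df.iloc[:, :0].copy()        # slice zero columns by position
empty_c = df.drop(columns=df.columns)  # drop all columns by label

for empty in (empty_a, empty_b, empty_c):
    print(empty.shape, empty.index.equals(df.index))
```

All three derive the new frame from the existing one rather than building it from scratch, which is the property that matters for the caveat below.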

However, I'd like to provide one cautionary example -- if you do it the way the currently accepted answer suggests, you will lose the attributes associated with rs.DataSet (i.e. ds.cell and ds.spacegroup):

> ds2 = rs.DataSet(index=ds.index)
> ds2
Empty DataSet
Columns: []
Index: [(45, 12, 14), (37, 10, 13), (39, 2, 33), (14, 1, 37), (35, 17, 23)]
> ds.spacegroup
<gemmi.SpaceGroup("P 31 2 1")>
> print(ds2.spacegroup)
None

This is because ds2 is initialized only from an index, which does not carry any of the "metadata" of the rs.DataSet object.
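The mechanism can be sketched without reciprocalspaceship installed. The example below uses a hypothetical pandas subclass, MetaFrame, with a single spacegroup attribute stored as a plain string. It is a stand-in and not the actual rs.DataSet implementation; it assumes only the `_metadata` and `_constructor` hooks that pandas documents for subclassing.

```python
import pandas as pd


class MetaFrame(pd.DataFrame):
    """Hypothetical stand-in for rs.DataSet with one extra attribute."""

    # Attributes listed here are copied when pandas derives a new object
    _metadata = ["spacegroup"]
    spacegroup = None  # default for objects built from scratch

    @property
    def _constructor(self):
        # Make pandas operations return MetaFrame instead of DataFrame
        return MetaFrame


index = pd.MultiIndex.from_tuples(
    [(45, 12, 14), (37, 10, 13)], names=["H", "K", "L"]
)
ds = MetaFrame({"FMODEL": [9.349418, 17.603436]}, index=index)
ds.spacegroup = "P 31 2 1"  # plain string here, not a gemmi.SpaceGroup

# Derived from ds: pandas carries the attribute over
kept = ds[[]].copy()

# Built from the index alone: nothing links it back to ds
lost = MetaFrame(index=ds.index)

# One way to recover: copy the attribute over by hand
restored = MetaFrame(index=ds.index)
restored.spacegroup = ds.spacegroup

print(kept.spacegroup, lost.spacegroup, restored.spacegroup)
```

In this sketch, selecting columns and copying both go through pandas' own machinery for deriving a new object, so the listed attribute travels with the result. Constructing from `index=` alone starts a fresh object, so the attribute has to be set explicitly afterwards.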