CederGroupHub / smol

Statistical Mechanics on Lattices
https://cedergrouphub.github.io/smol/

ensemble.processor.occupancy_from_structure not reproducing expected occupancy? #111

Open juliayang opened 3 years ago

juliayang commented 3 years ago

Expected Behavior

struct1 = ensemble.processor.structure_from_occupancy(init_occu)
occu2 = ensemble.processor.occupancy_from_structure(struct1)
print(np.where(init_occu != occu2))  # should return (array([], dtype=int64),)

Current Behavior

Instead, many occupancies are different; init_occu is not reproduced by occu2:

print(np.where(init_occu != occu2))  # returns (array([ 2, 14, 25, ..., 3447, 3450, 3453]),)

I am using a large cell with 3456 occupancies. Of these, 1210 entries in occu2 differ from init_occu.

Possible Solution

Is it related to the "#noqa" in the occupancy_from_structure() line? My supercell is big (864 supercell) so maybe that has something to do with this bug?

Steps to Reproduce

I can provide all my mson and MC data files if helpful. The steps I am taking to get this bug are as follows:

  1. Load the ClusterExpansion mson file, and initialize an expansion.
  2. Initialize the ensemble with the same sc_matrix that was used to generate init_occu: ensemble = CanonicalEnsemble.from_cluster_expansion(expansion, sc_matrix)
  3. Load the occupancy from a MC saved run: init_occu = mc_data[T]['occupancies'][0]
  4. structure1 = ensemble.processor.structure_from_occupancy(init_occu)
  5. occu2 = ensemble.processor.occupancy_from_structure(structure1)
  6. The issue is that init_occu and occu2 are different.

Context

I would like to analyze occupancies for order parameter calculations during an MC simulation. One example is the Mn-16d occupancy which tells how spinel-like a MC structure is.
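For illustration, an order parameter of this kind could be computed as the fraction of a chosen set of sites that hold a given species. This is only a minimal sketch: the function name `sublattice_fraction`, the site indices, and the integer species codes are all hypothetical and are not taken from smol or from the actual data.

```python
def sublattice_fraction(occupancy, site_indices, species_code):
    """Return the fraction of the given sites occupied by species_code."""
    if not site_indices:
        raise ValueError("site_indices must not be empty")
    hits = sum(1 for i in site_indices if occupancy[i] == species_code)
    return hits / len(site_indices)


# Toy data. The codes are hypothetical (0 = Li, 1 = Mn, 2 = vacancy),
# and so are the indices chosen to stand in for the 16d sites.
occu = [1, 0, 1, 1, 2, 0, 1, 0]
sites_16d = [0, 2, 3, 6]
MN = 1

print(sublattice_fraction(occu, sites_16d, MN))
```

A calculation like this only makes sense if the site indices refer to the same sites in every occupancy array, which is why a consistent occupancy ordering matters for this analysis.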

I am using the table-swap method, but I don't think using this algorithm should affect the regeneration of the occupancies from structure, and vice versa. I am mystified because I haven't encountered this bug before...

juliayang commented 3 years ago

I think the source of the bug is in the ordering of the occupancies, not in the change of the occupancy bits themselves (whew!). In other words, the following two structures properly structure-match:

struct1 = ensemble.processor.structure_from_occupancy(init_occu)
struct2 = ensemble.processor.structure_from_occupancy(occu2)

sm.fit(struct1, struct2) # returns True
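A cheaper sanity check that is consistent with a pure reordering is to confirm that the two occupancy arrays contain the same number of each integer code even though they differ element-wise. The sketch below uses plain Python lists and a hypothetical helper, `compare_occupancies`; for NumPy arrays, calling `.tolist()` first should work. Matching counts do not prove the configurations are equivalent (structure matching is the real test), and if the same integer code means different species on different sublattices, the counts would need to be taken per sublattice.

```python
from collections import Counter


def compare_occupancies(occu_a, occu_b):
    """Return (indices that differ, whether the code counts are identical)."""
    if len(occu_a) != len(occu_b):
        raise ValueError("occupancy arrays must have the same length")
    mismatched = [i for i, (a, b) in enumerate(zip(occu_a, occu_b)) if a != b]
    same_counts = Counter(occu_a) == Counter(occu_b)
    return mismatched, same_counts


# Toy arrays: occu2 is init_occu with the first two entries swapped.
init_occu = [0, 1, 1, 0, 2, 1]
occu2 = [1, 0, 1, 0, 2, 1]

mismatched, same_counts = compare_occupancies(init_occu, occu2)
print(len(mismatched), same_counts)
```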

lbluque commented 3 years ago

Thanks for bringing this up @juliayang

I had noticed this behavior a while back, but also double checked that the structures matched correctly as you did just now, and so completely forgot about it.

I guess in theory you could argue the map from structures to/from occupancies is many-to-many so that the behavior we are currently getting is not necessarily wrong. That being said, at least for a fixed supercell (in the case of processors), it would be nice to specify a single map to/from so that the occupancies indeed do match.
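As a toy illustration of what a single fixed map could look like (not smol's implementation), the sketch below keeps one reference ordering of site coordinates per supercell and always assigns occupancy entries by looking up each site's coordinates in that reference. The names `REFERENCE_SITES`, `toy_structure_from_occupancy`, and `toy_occupancy_from_structure` are hypothetical, and the exact coordinate lookup is a simplification: a real implementation would need a tolerance and periodic wrapping.

```python
import random

# Hypothetical reference ordering of fractional coordinates for one supercell.
REFERENCE_SITES = [
    (0.0, 0.0, 0.0),
    (0.5, 0.0, 0.0),
    (0.0, 0.5, 0.0),
    (0.5, 0.5, 0.0),
]
SITE_INDEX = {coords: i for i, coords in enumerate(REFERENCE_SITES)}


def toy_structure_from_occupancy(occupancy):
    """Build a toy 'structure': (coords, code) pairs in arbitrary order."""
    sites = list(zip(REFERENCE_SITES, occupancy))
    random.shuffle(sites)  # the site order in a structure carries no meaning
    return sites


def toy_occupancy_from_structure(structure):
    """Recover the occupancy by mapping each site onto the reference order."""
    occupancy = [None] * len(REFERENCE_SITES)
    for coords, code in structure:
        occupancy[SITE_INDEX[coords]] = code
    return occupancy


init_occu = [2, 0, 1, 0]
struct1 = toy_structure_from_occupancy(init_occu)
occu2 = toy_occupancy_from_structure(struct1)
print(occu2 == init_occu)
```

Because the index of every site is fixed by the reference ordering rather than by the order in which sites appear in the structure, the round trip reproduces the original array even after the sites are shuffled.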

I'll keep this open so I remember to look into this.

juliayang commented 3 years ago

Yes, agreed: the mapping should be consistent within the same supercell matrix, even though many mappings can exist. Thanks for investigating this @lbluque! No rush at all in making this change -- there are still other ways to analyze the MC structures.