CederGroupHub / smol

Statistical Mechanics on Lattices
https://cedergrouphub.github.io/smol/
Other
62 stars 14 forks source link

[Bug]: _gen_composition_ordered_occu in /smol/capp/generate/random.py composition check #471

Open mstapelberg opened 4 months ago

mstapelberg commented 4 months ago

Email (Optional)

myless@mit.edu

Version

0.5.3

Which OS(es) are you using?

What happened?

  1. I am trying to use generate_random_ordered_occupancy to create an initial structure for my GC MCMC runs
  2. I've added some print statements to debug what's going on and it seems like there is a slight issue when my compositions do not exactly match up?
  3. Is there a way to fix this by perhaps passing a balance element to subtract from to solve this occupancy issue when a tolerance of more than 1 would be needed to fix this?

I can explain my workflow a bit more if that will help.

Thanks! Myles

Code snippet

work = load_work(expansion_path)
    expansion = work['ClusterExpansion']
    # Create the ensemble
    # This specifies the size of the MC simulation domain.
    # this gives a 64 site unit cell
    sc_matrix = np.array([
        [4, 0, 0],
        [0, 4, 0],
        [0, 0, 4]
    ])
    # this convenience method will take care of creating the appropriate
    # processor for the given cluster expansion.
    #os.environ['OMP_NUM_THREADS'] = '4'
    ensemble = Ensemble.from_cluster_expansion(expansion, sc_matrix)
    ensemble.processor.num_threads_full = 5

    # In a real scenario you may want a much larger processor.size
    # An MC step is O(1) with the processor.size, meaning it runs at
    # the same speed regardless of the size. However, larger sizes
    # will need many more steps to reach equilibrium in an MC simulation.
    print(f'The supercell size for the processor is {ensemble.processor.size} prims.')
    print(f'The ensemble has a total of {ensemble.num_sites} sites.')
    print(f'The active sublattices are:')
    for sublattice in ensemble.sublattices:
        print(sublattice)

    # here we also set the temperature to our operating temperature, in V-Cr-Ti this should be around 900K 
    T_sample = 973.15
    sampler = Sampler.from_ensemble(ensemble, temperature=T_sample)
    print(f"Sampling information: {sampler.samples.metadata}")
    compositions = [sublattice.composition for sublattice in ensemble.sublattices]

    print(dir(sublattice))
    compositions = [sublattice.composition for sublattice in ensemble.sublattices]
    init_occu = generate_random_ordered_occupancy(processor= ensemble.processor,
                                                composition=compositions,
                                                tol = 0.99,
                                                rng=42)

Log output

The supercell size for the processor is 64 prims.
The ensemble has a total of 64 sites.
The active sublattices are:
Sublattice(site_space=Cr0.0672 Ti0.0428 Zr0.0248 W0.0289 V0.8363 , sites=array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
       51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]), active_sites=array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
       51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]), encoding=array([0, 1, 2, 3, 4]))
Sampling information: Metadata(cls_name='SampleContainer', kernels=[Metadata(seed=199382794672560331959954031836088518048, step=Metadata(sublattices=[(Element Zr, Element Ti, Element V, Element Cr, Element W)], sublattice_probabilities=array([1.]), cls_name='Swap'), cls_name='Metropolis')])
['REDIRECT', '__annotations__', '__class__', '__dataclass_fields__', '__dataclass_params__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__get_pydantic_core_schema__', '__get_pydantic_json_schema__', '__get_validators__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__match_args__', '__modify_schema__', '__module__', '__ne__', '__new__', '__post_init__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_generic_json_schema', '_ipython_display_', '_repr_mimebundle_', '_validate_monty', 'active_sites', 'as_dict', 'composition', 'display_json', 'encoding', 'from_dict', 'is_active', 'reset_restricted_sites', 'restrict_sites', 'restricted_sites', 'site_space', 'sites', 'species', 'split_by_species', 'to_json', 'to_plotly_json', 'unsafe_hash', 'validate_monty_v1', 'validate_monty_v2']

{
    "name": "ValueError",
    "message": "composition = Cr0.0672 Ti0.0428 Zr0.0248 W0.0289 V0.8363 and is not compatible with supercell size. 65 is not greater than 64.99.",
    "stack": "---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[8], line 37
     35 print(dir(sublattice))
     36 compositions = [sublattice.composition for sublattice in ensemble.sublattices]
---> 37 init_occu = generate_random_ordered_occupancy(processor= ensemble.processor,
     38                                             composition=compositions,
     39                                             tol = 0.99,
     40                                             rng=42)
     42 print(f\"The disordered structure has composition: {ensemble.processor.structure.composition}\")
     43 print(f\"The initial occupancy has composition: {ensemble.processor.structure_from_occupancy(init_occu).composition}\")

File /opt/homebrew/Caskroom/miniforge/base/envs/smol/lib/python3.10/site-packages/smol/capp/generate/random.py:55, in generate_random_ordered_occupancy(processor, composition, charge_neutral, tol, encoded, rng, **kwargs)
     53         occu = _gen_unconstrained_ordered_occu(sublattices, rng=rng, **kwargs)
     54 else:
---> 55     occu = _gen_composition_ordered_occu(
     56         sublattices, composition, tol, rng=rng, **kwargs
     57     )
     59 if not encoded:
     60     occu = processor.decode_occupancy(occu)

File /opt/homebrew/Caskroom/miniforge/base/envs/smol/lib/python3.10/site-packages/smol/capp/generate/random.py:164, in _gen_composition_ordered_occu(sublattices, composition, tol, rng)
    146 \"\"\"Generate a random occupancy satisfying a given composition.
    147 
    148 Args:
   (...)
    161     ndarray: encoded occupancy
    162 \"\"\"
    163 rng = np.random.default_rng(rng)
--> 164 compositions = _composition_compatiblity(sublattices, composition, tol, rng=rng)
    165 occu = np.zeros(sum(len(sl.sites) for sl in sublattices), dtype=int)
    167 for composition, sublattice in zip(compositions, sublattices):
    168     # create a dummy site space to account for vacancies

File /opt/homebrew/Caskroom/miniforge/base/envs/smol/lib/python3.10/site-packages/smol/capp/generate/random.py:229, in _composition_compatiblity(sublattices, composition, tol, rng)
    226         total += round(num_sites)
    228     if total > len(sublattice.sites) + tol:
--> 229         raise ValueError(f\"composition = {composition} and is not compatible with supercell size. {total} is not greater than {len(sublattice.sites) + tol}.\")
    231 return compositions

ValueError: composition = Cr0.0672 Ti0.0428 Zr0.0248 W0.0289 V0.8363 and is not compatible with supercell size. 65 is not greater than 64.99."
}

Code of Conduct

kamronald commented 4 months ago

Hi @mstapelberg, the current version of generate_random_ordered_occupancy() should work if you pass in Composition objects that correspond to an integer number of each species in the supercell. So for example, if there are 100 sites in your supercell, you can create a composition with Cr0.07 Ti0.04 Zr0.02 W 0.03 V0.84, which should work. This is perhaps not the most user-friendly at the moment, so thanks for proposing changes to the code.