Closed AstyLavrinenko closed 6 days ago
Hi @AstyLavrinenko I cannot reproduce these errors. Which version of pymatgen are you using? And can you post the full traceback?
The choice to return a structure was made to have richer information available to the user. You can easily extract the average occupancy type per site type by looping over the structure.
You could for example use the composition and divide over the total number of sites:
struct = transitions.occupancy()
struct.composition
My pymatgen version is 2024.6.10
And full traceback:
AttributeError Traceback (most recent call last)
Cell In[31], line 1
----> 1 transitions.occupancy()
File \\tudelft.net\staff-umbrella\Project Data akl\gemdat_dev\GEMDAT\src\gemdat\transitions.py:255, in Transitions.occupancy(self)
252 counts = counts / len(states)
253 occupancies = dict(zip(unq, counts))
--> 255 species = [{site.species.name: occupancies.get(i, 0)} for i, site in enumerate(sites)]
257 return Structure(
258 lattice=sites.lattice,
259 species=species,
(...)
262 labels=sites.labels,
263 )
File \\tudelft.net\staff-umbrella\Project Data akl\gemdat_dev\GEMDAT\src\gemdat\transitions.py:255, in <listcomp>(.0)
252 counts = counts / len(states)
253 occupancies = dict(zip(unq, counts))
--> 255 species = [{site.species.name: occupancies.get(i, 0)} for i, site in enumerate(sites)]
257 return Structure(
258 lattice=sites.lattice,
259 species=species,
(...)
262 labels=sites.labels,
263 )
AttributeError: 'Composition' object has no attribute 'name'
Now I'm thinking that it could be because my structure to create transitions already have partial occupation as I took it from experiment
Regarding the average occupancy, the functions you've proposed work only with one type of sites while usually I introduce at least 2-3 different site types. Maybe it worth introducing additional small function to get average occupations
What would such an output look like? Feel free to open a separate issue for this.
Okay, I can reproduce the error with a sites structure that has partial occupancy. Not sure if it can be 'fixed' though, because a partial site does not make sense to me.
We can either:
After all, we have three main things to fix:
[x] Before creating transitions
, an additional validation step is required to check whether the provided structure has partial occupancy on floating species sites or contains any other atoms besides the floating species. If this is the case, the code should adjust the occupancy of all floating species sites to 1 and delete any other sites that are occupied by non-floating species. Additionally, the code should issue a warning to the user, indicating that, for proper analysis, only floating species sites are being considered under the assumption that all of them are fully occupied. Only after this step is completed should the transition object be created.
For my structures, I used custom_structure.remove_species(['S','Sb','W'])
to delete other species and
for idx,site in enumerate(custom_structure.sites):
custom_structure.replace(idx=idx, species=site.species.elements[0], label=site.label)
to change the occupancy to 1.
[x] There is an error in the current implementation of the atom_locations()
calculation from transitions
. In the original MATLAB code, this metric is defined as the ratio of the number of timesteps that diffusion atoms spend on a specific type of site, divided by the total number of timesteps and the total number of floating species. So it shouldn't be divided by the number of sites of that specific type and multiplied by the total number of sites.
def atom_locations(self):
"""Calculate fraction of time atoms spent at a type of site.
Returns
-------
dict[str, float]
Return dict with the fraction of time atoms spent at a site
"""
multiplier = 1 / self.n_floating
compositions_by_label = defaultdict(list)
for site in self.occupancy():
compositions_by_label[site.label].append(site.species.num_atoms)
ret = {}
for k, v in compositions_by_label.items():
ret[k] = sum(v) * multiplier
return ret
[x] We need to introduce an additional function to compute the average occupancy for each type of site. This is a crucial metric for analysis
def average_occupancy(self):
"""Calculate average occupancy per a type of site.
Returns
-------
dict[str, float]
Return dict with the fraction of time atoms spent at a site
"""
compositions_by_label = defaultdict(list)
for site in self.occupancy():
compositions_by_label[site.label].append(site.species.num_atoms)
ret = {}
for k, v in compositions_by_label.items():
ret[k] = (sum(v) / len(v))
return ret
Could you please do these fixes?
Fair points, I have fixed these issues except for the first one. I think it is up to the user to clean their structure. There is no one way to make a disordered structure ordered, but the pymatgen code to do it yourself is trivial.
We don't need to convert the structure into an ordered one. We only need to account for all possible sites that the floating species might occupy, as this is simply a model of potential sites. In the simulation data, any inherent disorder will naturally appear through the computed occupancies and lithium densities. I think it's better for the code to automatically set the occupancy to 1 for all potential sites of the fluctuating species, rather than expecting the user to manually modify the structure. This approach minimizes user confusion and ensures consistent analysis.
I see your point, I just don't see why gemdat needs to do this. This behaviour is difficult to define and to explain.
The code does not see the disorder, because it only uses the fractional coordinates and site labels. Any specie names are only used for descriptive purposes.
The behaviour you describe would sit behind some sort of flag or toggle, increasing complexity. This would then filter the input sites based on some pre-defined behaviour (floating specie or otherwise), but that is just passing it straight to pymatgen. Therefore I think it is better that the user performs this operation.
I'm trying to get working functions
transitions.occupancy()
andtransitions.atom_locations()
and getting an error:After fixing
specie
tospecies
in the source code, I'm still getting error:Could you please fix that?
Besides that occupancy now seems to appear in the
Structure
object. Is there any additional function to extract average occupancy per each site type?