Open lfoppiano opened 8 months ago
This step primarily occurs in ~text2chem.regex_parser.separate_oxygen_deficiency. Its purpose is to separate the deficient or excess oxygen atoms (commonly denoted as ±δ in chemistry) from the formula. The final result reflects this separation in the oxygen_deficiency field:
{'material_string': 'Bi2Sr2CaCu2O8-δ',
'material_name': '',
'material_formula': 'Bi2Sr2CaCu2O8',
'additives': [], 'phase': '',
'oxygen_deficiency': '-',
'amounts_x': {},
'elements_x': {},
'composition': [{'formula': 'Bi2Sr2CaCu2O8', 'amount': '1', 'elements': OrderedDict([('Bi', '2'), ('Sr', '2'), ('Ca', '1'), ('Cu', '2'), ('O', '8')]), 'species': OrderedDict([('Bi', '2'), ('Sr', '2'), ('Ca', '1'), ('Cu', '2'), ('O', '8')])}]}
{'material_string': 'Bi2Sr2CaCu2O8+δ',
'material_name': '',
'material_formula': 'Bi2Sr2CaCu2O8',
'additives': [], 'phase': '',
'oxygen_deficiency': '+',
'amounts_x': {},
'elements_x': {},
'composition': [{'formula': 'Bi2Sr2CaCu2O8', 'amount': '1', 'elements': OrderedDict([('Bi', '2'), ('Sr', '2'), ('Ca', '1'), ('Cu', '2'), ('O', '8')]), 'species': OrderedDict([('Bi', '2'), ('Sr', '2'), ('Ca', '1'), ('Cu', '2'), ('O', '8')])}]}
Specifically, if oxygen_deficiency='-', oxygen should be represented as 8-δ, and similarly for other cases.
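The relationship between oxygen_deficiency and the displayed oxygen amount can be sketched as follows. This is only an illustration, not part of text2chem: the helper name oxygen_amount_with_delta is hypothetical, and it assumes a result dict shaped like the outputs above.

```python
from collections import OrderedDict


def oxygen_amount_with_delta(result):
    """Rebuild the displayed oxygen amount, e.g. '8-δ', from a parsed result.

    Hypothetical helper, not part of text2chem. Assumes `result` has the
    same keys as the parser outputs shown above.
    """
    sign = result.get("oxygen_deficiency") or ""
    oxygen = result["composition"][0]["elements"].get("O")
    if oxygen is None:
        return None  # no oxygen in the formula
    # Only '+' or '-' carries a δ; anything else leaves the amount untouched.
    return f"{oxygen}{sign}δ" if sign in ("+", "-") else oxygen


parsed = {
    "material_formula": "Bi2Sr2CaCu2O8",
    "oxygen_deficiency": "-",
    "composition": [
        {
            "formula": "Bi2Sr2CaCu2O8",
            "amount": "1",
            "elements": OrderedDict(
                [("Bi", "2"), ("Sr", "2"), ("Ca", "1"), ("Cu", "2"), ("O", "8")]
            ),
        }
    ],
}

print(oxygen_amount_with_delta(parsed))  # 8-δ
```

With oxygen_deficiency set to '+', the same helper returns 8+δ. With an empty value it returns the plain amount.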
Am I right that it is correct for δ not to be included in the composition?
Based on the code owner's design and my understanding, yes, that appears to be the intended behavior.
The formula:
Bi2Sr2CaCu2O 8+δ
is incorrectly parsed by material_parser.parse(): oxygen should be 8+δ.
It seems to be a problem only with the last element + amount.
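For reference, separating a trailing ±δ from the last element and its amount can be sketched with a standalone regular expression. This is not text2chem's implementation and is not claimed to reproduce or fix the behavior reported here. The function name split_trailing_delta is hypothetical, and the pattern assumes the δ term always follows the final oxygen amount.

```python
import re

# Trailing oxygen term: 'O', an optional amount, optional whitespace,
# a sign (ASCII '+'/'-', Unicode minus, or en dash), then δ at the end.
_TRAILING_DELTA = re.compile(r"O\s*(\d*\.?\d*)\s*([+\-\u2212\u2013])\s*δ\s*$")


def split_trailing_delta(formula):
    """Return (formula_without_delta, sign); sign is '' if no ±δ is present.

    Hypothetical helper for illustration only.
    """
    match = _TRAILING_DELTA.search(formula)
    if match is None:
        return formula.strip(), ""
    amount, sign = match.group(1), match.group(2)
    # Normalize every minus-like character to ASCII '-'.
    sign = "+" if sign == "+" else "-"
    return formula[: match.start()] + "O" + amount, sign


for text in ("Bi2Sr2CaCu2O8+δ", "Bi2Sr2CaCu2O 8+δ", "Bi2Sr2CaCu2O8-δ"):
    print(split_trailing_delta(text))
```

The pattern tolerates whitespace between O and its amount, so the formula string as written above and the compact form both yield Bi2Sr2CaCu2O8 with the sign '+'.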