materialsproject / pymatgen

Python Materials Genomics (pymatgen) is a robust materials analysis code that defines classes for structures and molecules with support for many electronic structure codes. It powers the Materials Project.
https://pymatgen.org

How do I get the spacegroup symbol from a CIF file? #192

Closed wangnumber14 closed 9 years ago

wangnumber14 commented 9 years ago

I parse some CIF files with "CifParser", but every spacegroup name that starts with "I" comes out starting with "P". Why?

For example, icsd_044367.cif (Li) has the spacegroup_name "Im-3m". I use

SpacegroupAnalyzer(IStructure.from_file("icsd_044367.cif")).get_spacegroup_symbol()

and get the result Pm-3m.

How can I get "Im-3m"?

Thank you!

shyuep commented 9 years ago

Can you please supply the actual CIF file so that we can better advise on what might be the problem?

Shyue Ping


wangnumber14 commented 9 years ago

Thanks. For example, take the following CIF file:
#####################################

------------------------------------------------------------------------------

$Date: 2014-07-11 14:35:18 +0000 (Fri, 11 Jul 2014) $

$Revision: 120071 $

$URL: file:///home/coder/svn-repositories/cod/cif/1/00/11/1001196.cif $

------------------------------------------------------------------------------

#

This file is available in the Crystallography Open Database (COD),

http://www.crystallography.net/

#

All data on this site have been placed in the public domain by the

contributors.

#
data_1001196
_chemical_name_systematic 'Trizirconium germanium oxide'
_chemical_formula_structural 'Zr3 Ge O8'
_chemical_formula_sum 'Ge O8 Zr3'
_publ_sectiontitle
;
Neutron Diffraction Determination of the Structure of an Ordered Scheelite - Type: Zr~3~ Ge O~8~
;
loop
_publ_author_name
'Ennaciri, A'
'Michel, D'
'Perez y Jorba, M'
'Pannetier, J'
_journal_name_full 'Materials Research Bulletin'
_journal_coden_ASTM MRBUAC
_journal_volume 19
_journal_year 1984
_journal_page_first 793
_journal_page_last 799
_cell_length_a 5.005(1)
_cell_length_b 5.005(1)
_cell_length_c 10.452(2)
_cell_angle_alpha 90
_cell_angle_beta 90
_cell_angle_gamma 90
_cell_volume 261.8
_cell_formula_units_Z 2
_symmetry_space_group_name_H-M 'I -4 2 m'
_symmetry_Int_Tables_number 121
_symmetry_cellsetting tetragonal
loop
_symmetry_equiv_pos_asxyz
'x,y,z'
'-x,-y,z'
'-x,y,-z'
'x,-y,-z'
'-y,x,-z'
'y,-x,-z'
'y,x,z'
'-y,-x,z'
'1/2+x,1/2+y,1/2+z'
'1/2-x,1/2-y,1/2+z'
'1/2-x,1/2+y,1/2-z'
'1/2+x,1/2-y,1/2-z'
'1/2-y,1/2+x,1/2-z'
'1/2+y,1/2-x,1/2-z'
'1/2+y,1/2+x,1/2+z'
'1/2-y,1/2-x,1/2+z'
loop
_atom_type_symbol
_atom_type_oxidationnumber
Zr4+ 4.000
Ge4+ 4.000
O2- -2.000
loop
_atom_site_label
_atom_site_type_symbol
_atom_site_symmetry_multiplicity
_atom_site_Wyckoff_symbol
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
_atom_site_occupancy
_atom_site_attached_hydrogens
_atom_site_calc_flag
Zr1 Zr4+ 2 b 0. 0. 0.5 1. 0 d
Zr2 Zr4+ 4 d 0. 0.5 0.25 1. 0 d
Ge1 Ge4+ 2 a 0. 0. 0. 1. 0 d
O1 O2- 8 i 0.2004(5) 0.2004(5) 0.3410(6) 1. 0 d
O2 O2- 8 i 0.2170(5) 0.2170(5) 0.0904(6) 1. 0 d
_refine_ls_R_factor_all 0.027
_cod_database_code 1001196
_journal_paper_doi 10.1016/0025-5408(84)90037-0
##############################################################

I use this code to get the spacegroup information:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
s = IStructure.from_file("name.cif")
syms = SpacegroupAnalyzer(s)
spg_symbol = syms.get_spacegroup_symbol()
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

The result I get is "P-42m", not the real spacegroup "I -4 2 m".

shyuep commented 9 years ago

I can’t reproduce the error. The following gives me the right space group:

s = Structure.from_file("1001196.cif")
from pymatgen.symmetry.analyzer import SpacegroupAnalyzer
a = SpacegroupAnalyzer(s)
print a.get_spacegroup_symbol()

Shyue Ping


wangnumber14 commented 9 years ago

Sorry, I cannot see your result. Can you send it again?

I still get the error.

%%%%%%%%%%%%%%%%%%%%
s1 = IStructure.from_file("1001196.cif")
print SpacegroupAnalyzer(s1).get_spacegroup_symbol()
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
The result:
%%%%%%%%%%%%%%%%%%%%
[wangzg@localhost parsecif]$ python2.7 Cif2CmlPoscar.py
P-42m
%%%%%%%%%%%%%%%%%%%%%%%%%%
The real result should be I-42m.

thanks

shyuep commented 9 years ago

I get the correct result of I-42m. I am not sure why you are getting P-42m. Do you have the latest pymatgen and spglib installed? Are you on Windows or a Linux / Mac system?

jespertoftkristensen commented 9 years ago

Out of curiosity: Dr. Ong, what happens when you use IStructure?


wangnumber14 commented 9 years ago

I installed the latest version (3.0.11) on a Linux system yesterday, but I can still reproduce this error.

I think the error comes from the symmetry operations, the "_symmetry_equiv_pos_as_xyz" entries. When I use CASTEP to parse the CIF file and save it as a new CIF file, and then parse that file with pymatgen, I get the correct result. I have compared the symmetry operations, and there are no differences except for their order.

When I install the new pymatgen version, spglib should be installed at the same time, right?
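A check like "no differences except the order" is easy to get wrong by eye. The sketch below shows one way to compare two lists of symmetry operation strings while ignoring order, spacing and letter case. The function names are hypothetical and are not part of pymatgen. The comparison is purely textual, so it also reports operations that are written differently, such as a decimal translation versus a fractional one.

```python
def normalize_ops(ops):
    """Strip spaces and lower-case each op so only real textual differences remain."""
    return {op.replace(" ", "").lower() for op in ops}


def diff_ops(ops_a, ops_b):
    """Return (ops only in a, ops only in b) as sorted lists, ignoring order."""
    a, b = normalize_ops(ops_a), normalize_ops(ops_b)
    return sorted(a - b), sorted(b - a)


if __name__ == "__main__":
    # Hypothetical example lists; in practice they would be copied from the two CIF files.
    original = ["x,y,z", "x+0.5,y+0.5,z+0.5"]
    rewritten = ["x+1/2, y+1/2, z+1/2", "X,Y,Z"]
    only_original, only_rewritten = diff_ops(original, rewritten)
    print("only in original: ", only_original)
    print("only in rewritten:", only_rewritten)
```

An empty pair of lists means the two files list the same operation strings. Any entry that shows up is worth a closer look, even if it describes the same operation mathematically.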

shyuep commented 9 years ago

IStructure and Structure are just the immutable and mutable versions of a structure. The symmetry detection does not rely on changing the structure, so either one is fine.

I downloaded the CIF directly from the website and it parses completely fine, which is why I am confused that there is a problem. The CIF text pasted above is missing the _ in front of "symmetry_equiv_pos_asxyz", which could be the reason. That underscore cannot be missing, or the code will not parse the file correctly. I have never seen a CIF without it, though.

wangnumber14 commented 9 years ago

I do not think the missing _ is the reason. I have sent an attachment to your email, including the Python shell and the CIF file, so you can test it again.

Below are the results:
%%%%%%%%%%%%%%%%
[wang@localhost 1]$ python2.7 test.py
Pm-3m
Pm-3m (221) spacegroup
%%%%%%%%%%%%%%%%%%

Thank you very much.

shyuep commented 9 years ago

I tried it again. It really does not make a difference. The space group of the original CIF file that you supplied was determined correctly. The following is an output from my testing. test.cif contains the CIF you supplied in this chain. I also tested with a Fe Im-3m structure that I downloaded from ICSD (ICSD ID 53802). The spacegroup analyzer still gives Im-3m as it should, not Pm-3m. I also tried this on two different OSes, a Linux OS and a Mac OS. As expected, using IStructure or Structure makes no difference to the results.

Can you email me the CIF file that you are testing with directly to shyuep@gmail.com? If you use Github, the file does not come through.

In [3]: s = IStructure.from_file("test.cif")
In [4]: print s
Structure Summary (Zr3 Ge1 O8)
Reduced Formula: Zr3GeO8
abc   :   5.005000   5.005000   6.311584
angles: 113.359134 113.359134  90.000000
Sites (12)
1 Zr4+     0.500000     0.500000     0.000000
2 Zr4+     0.250000     0.750000     0.500000
3 Zr4+     0.750000     0.250000     0.500000
4 Ge4+     0.000000     0.000000     0.000000
5 O2-     0.541400     0.541400     0.682000
6 O2-     0.140600     0.140600     0.682000
7 O2-     0.458600     0.859400     0.318000
8 O2-     0.859400     0.458600     0.318000
9 O2-     0.307400     0.307400     0.180800
10 O2-     0.873400     0.873400     0.180800
11 O2-     0.692600     0.126600     0.819200
12 O2-     0.126600     0.692600     0.819200
In [6]: from pymatgen.symmetry.analyzer import SpacegroupAnalyzer

In [7]: a = SpacegroupAnalyzer(s, 0.1)

In [8]: print a.get_spacegroup_symbol()
I-42m

shyuep commented 9 years ago

Also, can you make sure of your pymatgen version by typing

import pymatgen
print pymatgen.__version__

It would also be helpful to know your numpy version. I have never encountered issues there before, but just in case. Also, when you directly print the structure, do you get the same structure as what I show above? If so, then something is wrong with the symmetry determination and not with the parsing of the file itself.
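For anyone following along on a current Python 3 install, the version checks requested above can be collected in one go. This is a minimal sketch that uses only the standard library. The helper name report_versions is hypothetical, and it assumes the packages are installed under the distribution names "pymatgen", "numpy" and "spglib", which may not match how spglib was bundled at the time of this thread.

```python
from importlib import metadata


def report_versions(packages=("pymatgen", "numpy", "spglib")):
    """Map each distribution name to its installed version, or a note if it is absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version}")
```

Pasting this output into a bug report shows which versions are actually in use.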

wangnumber14 commented 9 years ago

I have printed my pymatgen version:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Python 2.7.7 (default, Jun 12 2014, 08:23:58)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pymatgen
>>> print pymatgen.__version__
3.0.11
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

I also loaded the Fe CIF file (data_53802) and still reproduced the error.

I have sent the attachments to shyuep@gmail.com; please check your email.

Thanks

shyuep commented 9 years ago

Thanks for the files. I determined that the problem with your CIF is that it writes the symmetry operations as "x+0.5, y+0.5, z+0.5".

These were parsed incorrectly because the CIF files we had encountered always used the form "x+1/2, y+1/2, z+1/2".

I just pushed a patch to the dev version of pymatgen that fixes that.
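To illustrate the difference between the two notations, here is a minimal sketch of a symmetry operation parser that treats "0.5" and "1/2" the same way. It is not the actual pymatgen patch. The function name parse_symmetry_op is hypothetical, and the sketch only handles terms that are a bare x, y or z or a plain number, so a coefficient such as "2x" is not supported.

```python
import re
from fractions import Fraction


def parse_symmetry_op(op):
    """Return (rotation rows, translation) for an op string such as 'x+1/2,-y,z'."""
    rotation, translation = [], []
    for component in op.lower().replace(" ", "").split(","):
        row = [0, 0, 0]
        shift = Fraction(0)
        # Split the component into signed terms, e.g. '1/2-x' -> ['1/2', '-x'].
        for term in re.findall(r"[+-]?[^+-]+", component):
            sign = -1 if term.startswith("-") else 1
            body = term.lstrip("+-")
            if body in ("x", "y", "z"):
                row["xyz".index(body)] += sign
            else:
                # Fraction accepts both the '1/2' and the '0.5' spelling.
                shift += sign * Fraction(body)
        rotation.append(tuple(row))
        translation.append(shift % 1)  # translations are taken modulo one cell
    return tuple(rotation), tuple(translation)


if __name__ == "__main__":
    decimal_form = parse_symmetry_op("x+0.5, y+0.5, z+0.5")
    fraction_form = parse_symmetry_op("1/2+x,1/2+y,1/2+z")
    print(decimal_form == fraction_form)
```

Both spellings give the body-centring translation (1/2, 1/2, 1/2). If that translation is lost during parsing, the centred operations collapse onto the primitive ones, which is consistent with an "I" space group being reported as "P".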

shyuep commented 9 years ago

Please let me know if there are any other CIFs that are not parsing correctly.

wangnumber14 commented 9 years ago

I also think the ".5" form is the main reason. Can I download the latest version to fix the problem, or do I need to correct the pymatgen code myself?

wangnumber14 commented 9 years ago

I have updated pymatgen and checked the files, and I now get the correct results. Thanks a lot.

I had already tested changing the ".5" form to "1/2" and got the correct results, but I did not think you would make this mistake, so I still asked you for the real reason. Sorry for taking up so much of your time with this question, and thanks a lot for your patience.