Open alex-l-m opened 2 years ago
That would be nice. Can you point to the CIF specification? Last time I looked into it, it was unclear to me how CIF defined bonds that crossed periodic boundaries.
If we do add this, I think StructureGraph -> CIF export would also be useful.
The bond information in CIF files is defined by the GEOM_BOND category. It's kind of annoying to set up due to the non-intuitive periodicity flags, but it can be done. Bonding across boundaries is specified using a flag of the format 1_XYZ. If I recall correctly (it's been a long time...), any digit after the underscore other than 5 indicates a crossing of a periodic boundary. For example, I think 1_545 specifies that the bond crosses the boundary in the -y direction, whereas 1_556 specifies that it crosses the boundary in the +z direction. Or something very similar to this.
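To make the flag format concrete, here is a minimal sketch of how such a code could be decoded. It assumes the n_klm layout from the CIF core dictionary (symmetry operation number, then three digits that are each the cell translation plus 5) and treats "." as "no symmetry operation, no translation". The function name parse_site_symmetry is made up for illustration and is not part of pymatgen.

```python
from typing import Tuple


def parse_site_symmetry(code: str) -> Tuple[int, Tuple[int, int, int]]:
    """Split a code such as '1_545' into (operation number, cell translation).

    Assumption: '.' (not applicable) is treated as operation 1 with no translation.
    """
    if code == ".":
        return 1, (0, 0, 0)
    op, _, klm = code.partition("_")
    if not klm:
        # Only the operation number was given, e.g. '2': no cell translation.
        return int(op), (0, 0, 0)
    if len(klm) != 3 or not klm.isdigit():
        raise ValueError(f"unexpected translation part in {code!r}")
    # Each digit is offset by 5, so '5' means "stay in the same cell".
    k, l, m = (int(digit) - 5 for digit in klm)
    return int(op), (k, l, m)


print(parse_site_symmetry("1_545"))  # one cell back along b (-y)
print(parse_site_symmetry("1_556"))  # one cell forward along c (+z)
```

A single digit per axis limits the translation to between -5 and +4 cells, which is enough for ordinary bonds.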
Thanks @arosen93, that info on the periodicity labels _geom_bond_site_symmetry_1 etc. was exactly what I was missing. More info on that tag here: https://www.iucr.org/__data/iucr/cifdic_html/1/cif_core.dic/Igeom_bond_site_symmetry_.html
@alex-l-m do you have a sample CIF file with bonding information that we could incorporate into pymatgen as a test case?
Hi, this would be a great feature to have! I am attaching an example CIF with the bond info :)
# CIF file generated by openbabel 2.4.90, see http://openbabel.sf.net
data_I
_chemical_name_common 'str_m12_o9_o13_f0_bcu.sym.134'
_cell_length_a 15.0575
_cell_length_b 15.1814
_cell_length_c 16.8456
_cell_angle_alpha 108.912
_cell_angle_beta 105.512
_cell_angle_gamma 109.889
_space_group_name_H-M_alt 'P 1'
_space_group_name_Hall 'P 1'
loop_
_symmetry_equiv_pos_as_xyz
x,y,z
loop_
_atom_site_label
_atom_site_type_symbol
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
_atom_site_occupancy
C0 C 0.08427 0.29217 0.08263 1.000
C1 C 0.91134 0.72437 0.96799 1.000
C2 C 0.02437 0.58531 0.74592 1.000
C3 C 0.96994 0.53523 0.27556 1.000
C4 C 0.76020 0.19775 0.72483 1.000
C5 C 0.33697 0.78219 0.25411 1.000
C6 C 0.30651 0.47935 0.98438 1.000
C7 C 0.63846 0.42759 0.93764 1.000
C8 C 0.91112 0.93227 0.04701 1.000
C9 C 0.09230 0.11281 0.10333 1.000
C10 C 0.00420 0.92694 0.05744 1.000
C11 C 0.99663 0.11391 0.07721 1.000
C12 C 0.09743 0.02056 0.08989 1.000
C13 C 0.90545 0.02428 0.05526 1.000
C14 C 0.71802 0.95340 0.95291 1.000
C15 C 0.28234 0.07865 0.18993 1.000
C16 C 0.60326 0.99837 0.01510 1.000
C17 C 0.40056 0.05270 0.12294 1.000
C18 C 0.48628 0.56359 0.27606 1.000
C19 C 0.50672 0.32366 0.64285 1.000
C20 C 0.48091 0.54814 0.18811 1.000
C21 C 0.49355 0.33012 0.72323 1.000
C22 C 0.93468 0.52380 0.74922 1.000
C23 C 0.06291 0.55049 0.26576 1.000
C24 C 0.45126 0.44758 0.12025 1.000
C25 C 0.47645 0.41134 0.77444 1.000
C26 C 0.42986 0.36334 0.14243 1.000
C27 C 0.47005 0.48446 0.74216 1.000
C28 C 0.43482 0.37876 0.23018 1.000
C29 C 0.48288 0.47782 0.66158 1.000
C30 C 0.46059 0.47795 0.29728 1.000
C31 C 0.50376 0.39938 0.61206 1.000
C32 C 0.51643 0.40195 0.00422 1.000
C33 C 0.39120 0.44291 0.88187 1.000
C34 C 0.53471 0.40743 0.92849 1.000
C35 C 0.38375 0.45256 0.96522 1.000
C36 C 0.46754 0.42052 0.86238 1.000
C37 C 0.44639 0.43058 0.02682 1.000
C38 C 0.61880 0.94325 0.94027 1.000
C39 C 0.38174 0.09070 0.20079 1.000
C40 C 0.95912 0.58124 0.36227 1.000
C41 C 0.02800 0.63229 0.68164 1.000
C42 C 0.12933 0.71851 0.61392 1.000
C43 C 0.85183 0.56103 0.44638 1.000
C44 C 0.77698 0.20321 0.81194 1.000
C45 C 0.28065 0.78425 0.17571 1.000
C46 C 0.12238 0.70603 0.69081 1.000
C47 C 0.86035 0.54880 0.36404 1.000
C48 C 0.93616 0.59493 0.60393 1.000
C49 C 0.04850 0.64882 0.44657 1.000
C50 C 0.94195 0.60944 0.52789 1.000
C51 C 0.04031 0.66253 0.52986 1.000
C52 C 0.71958 0.10268 0.63794 1.000
C53 C 0.44598 0.85643 0.32010 1.000
C54 C 0.59148 0.03144 0.39124 1.000
C55 C 0.61668 0.91305 0.55199 1.000
C56 C 0.49966 0.94972 0.31608 1.000
C57 C 0.68440 0.00306 0.63561 1.000
C58 C 0.49278 0.84030 0.39432 1.000
C59 C 0.70089 0.10975 0.55462 1.000
C60 C 0.58186 0.92296 0.47121 1.000
C61 C 0.62833 0.02099 0.47142 1.000
C62 C 0.68705 0.06412 0.10257 1.000
C63 C 0.32013 0.00309 0.03400 1.000
C64 C 0.78709 0.07671 0.11504 1.000
C65 C 0.22064 0.99138 0.02275 1.000
C66 C 0.27673 0.55014 0.96545 1.000
C67 C 0.69672 0.39596 0.98754 1.000
C68 C 0.80302 0.02105 0.04027 1.000
C69 C 0.20082 0.02771 0.10082 1.000
C70 C 0.78225 0.29834 0.73432 1.000
C71 C 0.27441 0.68998 0.25265 1.000
C72 C 0.25528 0.44334 0.03211 1.000
C73 C 0.70404 0.49802 0.91860 1.000
C74 C 0.10642 0.60045 0.82024 1.000
C75 C 0.89016 0.46115 0.18837 1.000
C76 C 0.91242 0.20510 0.00246 1.000
C77 C 0.08457 0.80538 0.05117 1.000
C78 C 0.99613 0.20051 0.05639 1.000
C79 C 0.99949 0.82293 0.02891 1.000
H80 H 0.16125 0.32392 0.13610 1.000
H81 H 0.83135 0.70787 0.93520 1.000
H82 H 0.72878 0.90860 0.89487 1.000
H83 H 0.26880 0.10912 0.25072 1.000
H84 H 0.51190 0.64245 0.32739 1.000
H85 H 0.51837 0.25909 0.60503 1.000
H86 H 0.50034 0.61456 0.17296 1.000
H87 H 0.49684 0.27154 0.74565 1.000
H88 H 0.40844 0.28548 0.09140 1.000
H89 H 0.45719 0.54803 0.78025 1.000
H90 H 0.41773 0.31292 0.24605 1.000
H91 H 0.47808 0.53524 0.63809 1.000
H92 H 0.56743 0.39101 0.05493 1.000
H93 H 0.33798 0.45340 0.83264 1.000
H94 H 0.12510 0.68214 0.44778 1.000
H95 H 0.86084 0.54460 0.59721 1.000
H96 H 0.78846 0.50779 0.30271 1.000
H97 H 0.19280 0.74752 0.75410 1.000
H98 H 0.20491 0.76536 0.61846 1.000
H99 H 0.77522 0.52547 0.44495 1.000
H100 H 0.85622 0.50749 0.71060 1.000
H101 H 0.14133 0.60185 0.31920 1.000
H102 H 0.55371 0.89165 0.87274 1.000
H103 H 0.44430 0.12993 0.26958 1.000
H104 H 0.73209 0.18526 0.55455 1.000
H105 H 0.45465 0.76821 0.39794 1.000
H106 H 0.70169 0.99438 0.69833 1.000
H107 H 0.46778 0.96307 0.25822 1.000
H108 H 0.62762 0.10574 0.38984 1.000
H109 H 0.58352 0.83769 0.55225 1.000
H110 H 0.52616 0.98946 0.00531 1.000
H111 H 0.47762 0.06221 0.13173 1.000
H112 H 0.67448 0.10546 0.16074 1.000
H113 H 0.33501 0.97423 0.97378 1.000
H114 H 0.85133 0.12743 0.18321 1.000
H115 H 0.15908 0.95383 0.95362 1.000
H116 H 0.76785 0.14020 0.83056 1.000
H117 H 0.31080 0.83562 0.14707 1.000
H118 H 0.31233 0.60136 0.93937 1.000
H119 H 0.66599 0.32839 0.99935 1.000
H120 H 0.77781 0.32416 0.68147 1.000
H121 H 0.29757 0.65422 0.29473 1.000
H122 H 0.25723 0.38609 0.05621 1.000
H123 H 0.68325 0.53284 0.87468 1.000
H124 H 0.83075 0.15379 0.97779 1.000
H125 H 0.16359 0.85952 0.10062 1.000
H126 H 0.84092 0.86321 0.02877 1.000
H127 H 0.16411 0.18090 0.12359 1.000
N128 N 0.05566 0.34855 0.04562 1.000
N129 N 0.94434 0.65145 0.95438 1.000
N130 N 0.96083 0.50785 0.82370 1.000
N131 N 0.03917 0.49215 0.17630 1.000
N132 N 0.81129 0.30341 0.87165 1.000
N133 N 0.18871 0.69659 0.12835 1.000
N134 N 0.20520 0.55299 0.99767 1.000
N135 N 0.79480 0.44701 0.00233 1.000
N136 N 0.81480 0.36183 0.82370 1.000
N137 N 0.18520 0.63817 0.17630 1.000
N138 N 0.20168 0.49457 0.04562 1.000
N139 N 0.79832 0.50543 0.95438 1.000
N140 N 0.06842 0.56053 0.87165 1.000
N141 N 0.93158 0.43947 0.12835 1.000
N142 N 0.94807 0.29586 0.99767 1.000
N143 N 0.05193 0.70414 0.00233 1.000
Ni144 Ni 0.11934 0.49342 0.10996 1.000
Ni145 Ni 0.88066 0.50658 0.89004 1.000
Ni146 Ni 0.87259 0.37259 0.00000 1.000
Ni147 Ni 0.12741 0.62741 0.00000 1.000
O148 O 0.46722 0.48837 0.38403 1.000
C149 C 0.42695 0.55494 0.42131 1.000
H150 H 0.35373 0.54062 0.36841 1.000
H151 H 0.48676 0.63880 0.45517 1.000
H152 H 0.40769 0.53839 0.47575 1.000
O153 O 0.51338 0.39669 0.53148 1.000
C154 C 0.58545 0.36138 0.51562 1.000
H155 H 0.54825 0.27392 0.48248 1.000
H156 H 0.65753 0.39671 0.58014 1.000
H157 H 0.60946 0.38620 0.46637 1.000
O158 O 0.21125 0.65560 0.84420 1.000
C159 C 0.23641 0.59272 0.78035 1.000
H160 H 0.21386 0.51360 0.77769 1.000
H161 H 0.19888 0.58148 0.70908 1.000
H162 H 0.32203 0.63262 0.80288 1.000
O163 O 0.78525 0.41314 0.16592 1.000
C164 C 0.75499 0.30844 0.15067 1.000
H165 H 0.79856 0.30542 0.21376 1.000
H166 H 0.76534 0.26297 0.09023 1.000
H167 H 0.67087 0.26987 0.13475 1.000
loop_
_geom_bond_atom_site_label_1
_geom_bond_atom_site_label_2
_geom_bond_distance
_geom_bond_site_symmetry_2
_ccdc_geom_bond_type
Ni146 N135 1.88315 . S
Ni146 N141 1.85224 . S
Ni146 N132 1.85224 1_554 S
Ni146 N142 1.88287 1_554 S
Ni147 N143 1.88287 . S
Ni147 N133 1.85224 . S
Ni147 N140 1.85224 1_554 S
Ni147 N134 1.88315 1_554 S
N135 N139 1.37737 1_554 S
N135 C67 1.32029 1_554 S
N143 C77 1.31556 . S
N143 N129 1.38433 1_454 S
C76 C78 1.38112 . S
C76 H124 1.07554 1_554 S
C76 N142 1.33612 1_554 S
C32 C37 1.36569 . S
C32 H92 1.08388 . S
C32 C34 1.39695 1_554 S
H110 C16 1.08156 . S
C16 C62 1.39809 1_565 S
C16 C38 1.39680 1_554 S
C65 C63 1.39891 1_565 S
C65 C69 1.40520 1_565 S
C65 H115 1.08314 1_554 S
C37 C24 1.48695 . S
C37 C35 1.41173 1_554 S
H126 C8 1.08367 . S
C79 C77 1.37575 1_655 S
C79 C10 1.46394 1_655 S
C79 C1 1.40665 1_554 S
C72 N138 1.31109 . S
C72 H122 1.07732 . S
C72 C6 1.36747 1_554 S
C63 C17 1.39865 . S
C63 H113 1.08230 1_544 S
C68 C13 1.47560 . S
C68 C64 1.40279 . S
C68 C14 1.40612 1_544 S
N128 C0 1.32672 . S
N128 Ni144 1.84262 . S
N128 N142 1.38433 1_454 S
N138 Ni144 1.85163 . S
N138 N134 1.37737 1_554 S
C8 C13 1.39272 1_565 S
C8 C10 1.39866 1_655 S
C77 H125 1.07050 . S
C13 C11 1.41583 . S
C78 C11 1.46819 . S
C78 C0 1.38697 1_655 S
C10 C12 1.42044 1_565 S
C11 C9 1.39683 1_655 S
C0 H80 1.08042 . S
C12 C69 1.47923 . S
C12 C9 1.37691 . S
H166 C164 1.11344 . S
H88 C26 1.08348 . S
C69 C15 1.40501 . S
C62 C64 1.39941 . S
C62 H112 1.08225 . S
C9 H127 1.08056 . S
Ni144 N131 1.85100 . S
Ni144 N137 1.84227 . S
C64 H114 1.08315 . S
C24 C26 1.40458 . S
C24 C20 1.40304 . S
C17 H111 1.08142 . S
C17 C39 1.39726 . S
N133 C45 1.33846 . S
N133 N137 1.37738 . S
N141 N131 1.38433 1_655 S
N141 C75 1.33446 . S
H167 C164 1.10953 . S
C26 C28 1.39661 . S
H117 C45 1.08049 . S
C164 O163 1.40714 . S
C164 H165 1.11406 . S
O163 C75 1.37402 . S
H86 C20 1.08328 . S
C45 C5 1.38205 . S
N131 C23 1.34710 . S
N137 C71 1.33357 . S
C20 C18 1.39805 . S
C75 C3 1.38984 . S
C15 C39 1.39730 . S
C15 H83 1.08311 . S
C39 H103 1.08257 . S
C28 H90 1.08237 . S
C28 C30 1.40002 . S
C71 C5 1.38413 . S
C71 H121 1.07931 . S
C5 C53 1.46735 . S
H107 C56 1.07955 . S
C23 C3 1.40297 1_455 S
C23 H101 1.07827 . S
C3 C40 1.48026 . S
C18 C30 1.41448 . S
C18 H84 1.07988 . S
C30 O148 1.38993 . S
H96 C47 1.07378 . S
C56 C53 1.40577 . S
C56 C54 1.39002 1_565 S
C53 C58 1.40167 . S
C40 C47 1.41083 . S
C40 C49 1.40510 1_655 S
C47 C43 1.38538 . S
H150 C149 1.11478 . S
O148 C149 1.41160 . S
H108 C54 1.08409 . S
C54 C61 1.39843 . S
C58 H105 1.07960 . S
C58 C60 1.38690 . S
C149 H151 1.10986 . S
C149 H152 1.10922 . S
H99 C43 1.08440 . S
C43 C50 1.39193 . S
C49 H94 1.08184 . S
C49 C51 1.39556 . S
H157 C154 1.10951 . S
C60 C61 1.41109 1_565 S
C60 C55 1.39773 . S
C61 C59 1.38883 . S
H155 C154 1.11119 . S
C154 O153 1.41224 . S
C154 H156 1.11448 . S
C50 C51 1.41012 1_655 S
C50 C48 1.38802 . S
C51 C42 1.39770 . S
O153 C31 1.39235 . S
C55 H109 1.08356 . S
C55 C57 1.38824 1_565 S
H104 C59 1.08136 . S
C59 C52 1.40280 . S
H95 C48 1.08093 . S
C48 C41 1.40159 1_655 S
H85 C19 1.08143 . S
C31 C19 1.41477 . S
C31 C29 1.40139 . S
C42 H98 1.08366 . S
C42 C46 1.39383 . S
C57 C52 1.40538 . S
C57 H106 1.08058 1_545 S
C52 C4 1.46648 . S
H91 C29 1.08282 . S
C19 C21 1.39758 . S
C29 C27 1.39834 . S
H120 C70 1.07828 . S
C41 C46 1.40871 . S
C41 C2 1.47988 . S
C46 H97 1.07860 . S
H161 C159 1.10839 . S
H100 C22 1.07965 . S
C21 H87 1.08326 . S
C21 C25 1.40573 . S
C4 C70 1.39149 . S
C4 C44 1.39025 . S
C70 N136 1.33400 . S
C27 C25 1.40464 . S
C27 H89 1.08299 . S
C2 C22 1.38375 1_455 S
C2 C74 1.39267 . S
C22 N130 1.33623 . S
C25 C36 1.48915 . S
H160 C159 1.11339 . S
C159 H162 1.10950 . S
C159 O158 1.41770 . S
C44 H116 1.07733 . S
C44 N132 1.33718 . S
C74 O158 1.37721 . S
C74 N140 1.34683 . S
N136 N132 1.37738 . S
N136 Ni145 1.84227 . S
N130 N140 1.38433 1_655 S
N130 Ni145 1.85100 . S
H93 C33 1.08274 . S
C36 C33 1.39283 . S
C36 C34 1.40721 . S
H102 C38 1.08247 . S
H123 C73 1.08029 . S
C33 C35 1.40259 . S
Ni145 N139 1.85163 . S
Ni145 N129 1.84262 . S
H82 C14 1.08332 . S
C73 C7 1.37650 . S
C73 N139 1.33457 . S
C34 C7 1.43985 . S
H81 C1 1.08051 . S
C7 C67 1.36396 . S
H118 C66 1.07185 . S
C38 C14 1.39748 . S
N129 C1 1.34007 . S
C35 C6 1.43524 . S
C66 C6 1.38970 . S
C66 N134 1.33637 . S
C67 H119 1.07920 . S
There is a StructureGraph class, and some .cif files include bonding information. Is there any interest in creating StructureGraphs from the bonding information in a .cif file?