jensengroup / xyz2mol

Converts an xyz file to an RDKit mol object
MIT License
250 stars 70 forks

Problem with assignment of charges #18

Closed: finrodfelagund13 closed this issue 4 years ago

finrodfelagund13 commented 4 years ago

Many thanks for writing this program!

I encountered some problems while generating SMILES strings from xyz files for BODIPYs. The expected structure, i.e. the resonance form with the greatest weight, would have a negative charge on the boron and a positive charge on the neighboring nitrogen. The script correctly recognises the zwitterionic state, but in some cases it places the cationic charge on other heteroatoms:

example output [xyz2mol]: F[B-]1(F)n2c(cc3c2=c2sccc2=[S+]3)=C(C(F)(F)F)c2cc3sc4ccsc4c3n21

expected output [non-canonical]: FC(F)(F)C=5c3cc2sc1ccsc1c2n3[B-](F)(F)[N+]4=C6C(=CC4=5)Sc7ccsc67

This happens for a number of different heteroatom placements/variations:

F[B-]1(F)n2c(ccc2-c2cccs2)C(C(F)(F)F)=c2cc/c(=C3/C=CC=[S+]3)n21

CN(C)c1ccc(-c2ccc3n2[B-](F)(F)n2c(=C4C=CC(=[N+](C)C)C=C4)ccc2=C3C(F)(F)F)cc1

Frank-LIU-520 commented 4 years ago

Same problem here!

Frank-LIU-520 commented 4 years ago

When I try to convert the following xyz file to an sdf file, the result cannot be opened in GaussView because of a charge problem.

24
-442.7031923602849
C -1.8463 0.1481 0.1120
C -0.9268 -0.1176 1.1204
C 0.6521 -0.1823 0.6140
C 1.1945 1.1409 0.2736
C 2.3716 1.2099 -0.3161
C 3.0615 0.0860 -0.4443
C 2.5688 -1.1303 -0.0487
C 1.2918 -1.2179 0.3001
C -1.9498 -0.8297 -1.0644
C -3.0932 -0.8799 -0.0602
H -4.1580 -0.1733 -0.5462
H -1.9276 -0.2662 -2.3117
H -1.1353 -1.6728 -1.2641
H 1.0471 -2.3573 0.4784
H 3.3170 -2.2603 -0.4711
H 4.0244 0.0812 -1.1774
H -2.6442 1.8950 0.3722
H 0.5121 1.9718 0.2996
H -0.8199 0.8693 2.2716
H -1.0372 -1.0610 1.5292
H -3.1043 -1.4822 0.7742
H -2.5033 1.6696 -1.1309
H 2.5978 2.6680 -0.3033
N -2.0161 1.5365 -0.2055

Warning: SMILES charge doesn't match input charge

jhjensen2 commented 4 years ago

Sorry for the late reply. There is no easy fix for this, since both structures are chemically valid, although one is more chemically intuitive. In this particular case the SMILES can be fixed by converting it to InChI and back to SMILES again, but that is probably not a general fix, so I will close this issue.
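The InChI round trip mentioned above could look roughly like the sketch below. It assumes an RDKit build with InChI support; the function name `roundtrip_via_inchi` is made up for illustration and is not part of xyz2mol. Whether the resulting SMILES places the charges where you expect depends on the molecule, so treat this as a workaround to try, not a guaranteed fix.

```python
def roundtrip_via_inchi(smiles):
    """Convert SMILES -> InChI -> SMILES with RDKit (hypothetical helper)."""
    # Imported inside the function so that merely defining the helper
    # does not require RDKit to be installed.
    from rdkit import Chem

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"RDKit could not parse SMILES: {smiles}")

    inchi = Chem.MolToInchi(mol)
    if not inchi:
        raise ValueError("InChI generation failed")

    mol_back = Chem.MolFromInchi(inchi)
    if mol_back is None:
        raise ValueError("RDKit could not rebuild a molecule from the InChI")

    # Canonical SMILES of the molecule rebuilt from the InChI
    return Chem.MolToSmiles(mol_back)


if __name__ == "__main__":
    # One of the SMILES strings reported in this issue
    reported = "F[B-]1(F)n2c(cc3c2=c2sccc2=[S+]3)=C(C(F)(F)F)c2cc3sc4ccsc4c3n21"
    print(roundtrip_via_inchi(reported))
```

Note that InChI normalises the structure, so details of the original SMILES may not survive the round trip.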

jhjensen2 commented 4 years ago

@Mario-Liu The problem here is that some of your C-H bonds are very long, so they are not recognised as bonds. Try optimising your structure.
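A quick way to spot such stretched bonds before running the conversion is to measure the distance from each hydrogen to its nearest heavy atom. The sketch below uses only the standard library. The 1.3 Å cutoff and the function names are assumptions for illustration, not the criterion xyz2mol itself applies; a typical C-H bond is about 1.09 Å.

```python
import math

# Illustrative cutoff in angstroms (an assumption, not xyz2mol's own rule)
MAX_XH = 1.3


def parse_xyz(text):
    """Parse xyz-format text into a list of (symbol, (x, y, z)) tuples."""
    lines = text.strip().splitlines()
    natoms = int(lines[0])
    atoms = []
    # Line 0 is the atom count, line 1 the comment; coordinates follow
    for line in lines[2:2 + natoms]:
        sym, x, y, z = line.split()[:4]
        atoms.append((sym, (float(x), float(y), float(z))))
    return atoms


def long_xh_bonds(atoms, cutoff=MAX_XH):
    """Return (h_index, heavy_index, distance) for each H whose nearest
    heavy atom is farther away than `cutoff`."""
    heavy = [(i, pos) for i, (sym, pos) in enumerate(atoms) if sym != "H"]
    flagged = []
    if not heavy:
        return flagged
    for i, (sym, pos) in enumerate(atoms):
        if sym != "H":
            continue
        j, d = min(((k, math.dist(pos, hpos)) for k, hpos in heavy),
                   key=lambda pair: pair[1])
        if d > cutoff:
            flagged.append((i, j, d))
    return flagged


if __name__ == "__main__":
    # Three atoms taken from the structure posted above
    sample = """3
fragment
C -3.0932 -0.8799 -0.0602
H -4.1580 -0.1733 -0.5462
H -3.1043 -1.4822 0.7742
"""
    for h, c, d in long_xh_bonds(parse_xyz(sample)):
        print(f"H atom {h} is {d:.3f} A from its nearest heavy atom {c}")
```

In this three-atom fragment the first hydrogen sits about 1.37 Å from the carbon and is flagged, while the second, at about 1.03 Å, is not.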