aiidalab / aiidalab-widgets-base

Reusable widgets for AiiDAlab applications
MIT License

BUG in Smiles Widget #505

Closed cpignedoli closed 11 months ago

cpignedoli commented 11 months ago

The following SMILES code

C=CC1=C(C2=CC=C(C3=CC=CC=C3)C=C2)C=C(C=C)C(C4=CC=C(C(C=C5)=CC=C5C(C=C6C=C)=C(C=C)C=C6C7=CC=C(C(C=C8)=CC=C8C(C=C9C=C)=C(C=C)C=C9C%10=CC=CC=C%10)C=C7)C=C4)=C1

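is correctly handled by Mathematica:

[image: smiles_mathematica] https://user-images.githubusercontent.com/22955065/263716526-9d585406-0c7b-4a47-bc13-89c65ee5f50f.png

but it crashes the SMILES widget:

[image: image] https://user-images.githubusercontent.com/22955065/263716814-69e45759-65ea-4126-9980-698a530b9d85.png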

https://github.com/aiidalab/aiidalab-widgets-base/blob/413c26c76ec438b9c70afe3437420923f738e50b/aiidalab_widgets_base/structures.py#L679

danielhollas commented 11 months ago

I'll take a look. I think the error comes from code that I introduced. Thanks for the report!

danielhollas commented 11 months ago

I noticed something strange: the input SMILES is not canonical (at least according to RDKit). The canonical version is:

C=Cc1cc(-c2ccc(-c3ccc(-c4cc(C=C)c(-c5ccc(-c6ccc(-c7cc(C=C)c(-c8ccc(-c9ccccc9)cc8)cc7C=C)cc6)cc5)cc4C=C)cc3)cc2)c(C=C)cc1-c1ccccc1

When I input this SMILES, the generation succeeds. @cpignedoli could you please confirm that the following generated geometry is the molecule that you wanted?

CC @yakutovicha

126

C      10.789406146640598     11.093550987863887     10.775495335663638
C      10.300054323514150     10.660649966173487      9.612813917217883
C      11.044087458291084      9.754560238551454      8.694690148677905
C      12.450498975464544      9.771741109891606      8.658420277266600
C      13.171961786514769      8.899468916084674      7.828249211905409
C      14.653209329627982      8.943297720286033      7.879013462112862
C      15.388012861026390      7.798352029613464      8.223198046072158
C      16.783321971800845      7.845316400135173      8.282815349608248
C      17.470109441631713      9.041547639748901      8.008276391728797
C      18.953054753049308      9.091478726443864      8.071545358625475
C      19.725614259466418      8.029388783934873      7.568656068037813
C      21.120804560742720      8.075486213314020      7.629939944722656
C      21.770381820351080      9.179358278706633      8.203500102453024
C      23.250990599345592      9.231714826326275      8.264561432610403
C      23.903860660217664     10.317181478862233      7.659199481550609
C      25.307537098652755     10.382191256698256      7.589556714716446
C      25.973997006949350     11.575649813337492      7.005036031568991
C      25.459546275072285     12.283138130155274      5.999117896367571
C      26.071218969165187      9.328380939153275      8.145310409359372
C      27.551799643398276      9.304486191597727      8.067700837387832
C      28.200629370384807      9.358470201659426      6.824471161285891
C      29.595724240103443      9.322774291273271      6.752861709556376
C      30.368753975797965      9.225240082611922      7.923715482343153
C      31.851530863265936      9.189182848324915      7.848354553242594
C      32.498520133926782      8.444337812420816      6.846073418941604
C      33.893746958784689      8.412585059797328      6.774229885272454
C      34.668704626344180      9.131963426111268      7.696570000994615
C      36.148932965587925      9.092374267559924      7.621011285073506
C      36.784085051404418      7.841668003616832      7.668941392751486
C      38.181197066134025      7.732486698869110      7.686275138226859
C      38.759962686937143      6.368182790026544      7.677028261260971
C      39.778556186855347      6.021601610330768      6.891412668450279
C      38.967336916073641      8.911450468932689      7.720681660879677
C      40.439185456658208      8.879878098114695      7.896972531738566
C      41.010745158645960      8.212304238765498      8.991174281393935
C      42.397735379199887      8.188774928466096      9.159363589654875
C      43.239961819451224      8.839144066258630      8.239793939282444
C      44.714482325102651      8.812122182387412      8.416185778535240
C      45.285363329073931      8.952479315158913      9.695595094175022
C      46.673897610617388      8.926798320059179      9.858320666109636
C      47.506721361088310      8.760265310242879      8.749679440972686
C      46.952795498459238      8.619502021844056      7.475410832137726
C      45.565003229455471      8.645378209519945      7.306561612916231
C      42.662604235466041      9.517457504186828      7.151403427074022
C      41.275330967962212      9.541258091041994      6.984727986718346
C      38.333033396546355     10.161339273964789      7.646921101522342
C      36.938095770508767     10.268449383599863      7.564334633326729
C      36.364089825029517     11.628205771972041      7.430231241960009
C      35.400658467129780     11.922181039433509      6.558317030007844
C      34.027006019076033      9.866285048310409      8.705841117394819
C      32.631964088867754      9.897849333927708      8.779221220681968
C      29.714218482087361      9.160162695738345      9.166890732643022
C      28.318877538178867      9.195129477929447      9.237196179982393
C      25.418828699165509      8.255414266076892      8.770905851670630
C      24.021933361290465      8.202600853733518      8.859842757736988
C      23.427308688423405      7.057198372432805      9.588383186189890
C      22.455747608268631      7.200136211860595     10.488653947758950
C      21.003481997548349     10.246471114924656      8.695394946696492
C      19.608119385680904     10.201962924735863      8.633444238317603
C      16.729102135380188     10.189875669478356      7.675858344629621
C      15.333714719868826     10.141279710999150      7.615168369062982
C      12.477288869607099      7.990502072377453      6.995757061967427
C      13.213093255134526      7.109277147759734      6.051239790606723
C      12.773692378284579      5.906743639264395      5.679942999456125
C      11.071498194782144      7.994151376890070      7.009287419909342
C      10.353257752587758      8.867018227513944      7.837947106293956
C       8.877616222976119      8.814245396839564      7.790055342208218
C       8.151513284552880      9.840639960592810      7.165119474823999
C       6.755946034589012      9.776715715515992      7.104596794731544
C       6.080152942119149      8.685327336845436      7.658349037144509
C       6.799207251019130      7.654588807589116      8.270817026792406
C       8.194918656634179      7.715155678130834      8.333765972042571
H      10.187998832973708     11.737059662754781     11.404534282634245
H      11.764114959993565     10.795858302282882     11.136628800802134
H       9.290221802424384     10.956220827004273      9.362938407287576
H      13.000100177036481     10.469491980069327      9.277506467819432
H      14.879138008042695      6.870398399547538      8.453901947253922
H      17.324268738116558      6.950524501663429      8.564089722193611
H      19.251321174087089      7.171088303919048      7.109608770656157
H      21.695709247808900      7.252406395764151      7.223561170105164
H      23.304975543604328     11.118265163213557      7.244162688321445
H      26.945909550215539     11.868938711571502      7.382574710949359
H      24.521379841799316     12.015453820719166      5.532128624300915
H      26.002497335304323     13.133610001846623      5.607742899599813
H      27.625272617102461      9.426007378088457      5.909231364311240
H      30.070292283564193      9.386596515198457      5.781760355672381
H      31.925701011830238      7.872366526523075      6.127009930370598
H      34.370443100680191      7.833963395851809      5.992430958394171
H      36.181494549767855      6.940095581299777      7.696121104710627
H      38.271067545082929      5.593538580128561      8.255510903344966
H      40.135258752854334      5.000000000000002      6.889922264745703
H      40.247641216017684      6.734426057496647      6.225971309018672
H      40.381536456404611      7.709752009133784      9.715493596043469
H      42.812131037055963      7.647964473711428     10.000825072702174
H      44.659671225455170      9.099825884643387     10.566862830855973
H      47.104509635795864      9.039595224762294     10.844932218742178
H      48.581247668803620      8.740307461686040      8.878015009551341
H      47.599148205025443      8.486711379708337      6.617421846807042
H      45.156964538791854      8.517275073647095      6.311853839577731
H      43.283121160946237     10.044248838236575      6.437423010253032
H      40.852955639548725     10.068183478934014      6.137848332843241
H      38.934372606575458     11.064098885199279      7.656964436633652
H      36.815087887794554     12.438969257811296      7.989693494834360
H      35.046977416355709     12.940877525534470      6.468448378096624
H      34.975317897688242     11.167663516772809      5.909657658138290
H      34.607666893106163     10.414817138953397      9.437305830089247
H      32.163450585903206     10.489904539672279      9.555276847349536
H      30.280878870619237      9.064516032705486     10.084548914127485
H      27.835934549779580      9.143004197692148     10.205389013273969
H      26.008956681787602      7.453675198184930      9.201905509353221
H      23.869574619859996      6.076386161359432      9.461436510612879
H      22.086995002887136      6.336329937408444     11.026024089468677
H      22.038460563645636      8.171095925519843     10.721160239522604
H      21.486298836242295     11.107007270155737      9.142177215073238
H      19.040955735092723     11.028343747466316      9.042979283533709
H      17.229397877763770     11.121887701017874      7.444826371104768
H      14.783486923309535     11.035885711249289      7.350047226468363
H      14.173344855055177      7.434976141781370      5.670591862993399
H      11.851201038834116      5.491722005994449      6.063243322193171
H      13.363189850778491      5.305541540598276      5.000000000000000
H      10.520527090662846      7.341680007154315      6.342743039789817
H       8.666621598019660     10.687233839956454      6.727826414922109
H       6.198584338655685     10.571746109172304      6.626403954367804
H       5.000000000000000      8.636782669090945      7.610051385716330
H       6.275149763661059      6.808116303366591      8.695392002257584
H       8.743162017466393      6.910398516730581      8.808196661739872

cpignedoli commented 11 months ago

Thanks a lot @danielhollas, the geometry is correct, and the widget works with the canonical form you provided. So ChemDraw is not producing canonical SMILES. How did you do the conversion? Should we put it in the widget?

danielhollas commented 11 months ago

Yes, I will put up a PR that canonicalizes the input using RDKit. I have been using this in my app for some time and it works well.
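
For readers wondering what such a conversion might look like, here is a minimal sketch. It assumes RDKit is installed and uses its `Chem.MolFromSmiles` and `Chem.MolToSmiles` functions. The helper name `canonicalize_smiles` is hypothetical and chosen only for illustration; this is not necessarily the code used in the app or in the eventual PR.

```python
def canonicalize_smiles(smiles: str) -> str:
    """Return RDKit's canonical SMILES for the given input string.

    Hypothetical helper for illustration; not the widget's actual code.
    """
    # Imported lazily so that this module can be loaded without RDKit installed.
    from rdkit import Chem

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        # MolFromSmiles returns None when it cannot parse or sanitize the input.
        raise ValueError(f"RDKit could not parse SMILES: {smiles!r}")
    # canonical=True is RDKit's default; it is spelled out here for clarity.
    return Chem.MolToSmiles(mol, canonical=True)
```

With this helper, a Kekulé-style input such as `C1=CC=CC=C1` should come back in aromatic form as `c1ccccc1`. Raising `ValueError` on unparsable input is a design choice of this sketch, so that a caller can show a message to the user instead of crashing.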

cpignedoli commented 11 months ago

Great, @danielhollas. Thanks a lot :)