kjappelbaum / mofchecker

Basic sanity checks for MOFs.
https://mofchecker.readthedocs.io/en/latest/background.html
MIT License

Tracking manual validation #216

Open kjappelbaum opened 1 year ago

kjappelbaum commented 1 year ago

Ran the current master on the RSM MOFs.

kjappelbaum commented 1 year ago

has_undercoordinated_alkali_alkaline

flagged 26 compounds 'RSM1229', 'RSM2019', 'RSM2351', 'RSM2185', 'RSM3014', 'RSM1327', 'RSM1917', 'RSM2843', 'RSM2336', 'RSM2231', 'RSM3969', 'RSM1041', 'RSM1484', 'RSM1596', 'RSM2841', 'RSM1847', 'RSM0535', 'RSM1293', 'RSM1342', 'RSM1712', 'RSM3328', 'RSM0498', 'RSM1227', 'RSM2109', 'RSM2257', 'RSM1085'

I manually opened all of them. I would like a tool that flags all of these cases for me; all seem suspicious. Some are very off, e.g. RSM2843.

kjappelbaum commented 1 year ago

has_lone_molecule

flags (145) many compounds 'RSM3042', 'RSM1706', 'RSM0117', 'RSM3853', 'RSM0910', 'RSM2350', 'RSM3812', 'RSM0879', 'RSM1100', 'RSM1403', 'RSM1116', 'RSM0357', 'RSM2014', 'RSM3605', 'RSM3367', 'RSM3275', 'RSM1272', 'RSM2777', 'RSM4027', 'RSM3127', 'RSM1264', 'RSM0025', 'RSM2624', 'RSM3035', 'RSM1459', 'RSM0423', 'RSM4163', 'RSM1623', 'RSM1273', 'RSM1434', 'RSM1967', 'RSM2473', 'RSM1926', 'RSM2445', 'RSM1117', 'RSM4587', 'RSM0878', 'RSM2994', 'RSM2644', 'RSM0291', 'RSM1300', 'RSM2469', 'RSM0287', 'RSM2570', 'RSM2618', 'RSM2136', 'RSM3232', 'RSM4133', 'RSM2622', 'RSM2710', 'RSM1887', 'RSM0916', 'RSM0691', 'RSM4216', 'RSM0652', 'RSM3438', 'RSM0078', 'RSM1901', 'RSM2443', 'RSM1280', 'RSM2951', 'RSM0038', 'RSM3614', 'RSM3028', 'RSM2639', 'RSM3842', 'RSM2438', 'RSM3415', 'RSM1326', 'RSM0576', 'RSM3161', 'RSM0937', 'RSM2459', 'RSM4237', 'RSM0133', 'RSM2323', 'RSM2636', 'RSM0871', 'RSM0172', 'RSM4424', 'RSM3230', 'RSM0277', 'RSM2749', 'RSM1535', 'RSM2460', 'RSM1427', 'RSM3528', 'RSM4255', 'RSM0312', 'RSM1201', 'RSM0152', 'RSM1305', 'RSM3768', 'RSM2657', 'RSM0390', 'RSM0685', 'RSM3047', 'RSM1996', 'RSM4444', 'RSM1492', 'RSM1191', 'RSM0104', 'RSM3753', 'RSM1781', 'RSM2007', 'RSM3568', 'RSM4254', 'RSM4346', 'RSM3362', 'RSM2932', 'RSM0149', 'RSM1867', 'RSM2532', 'RSM3909', 'RSM0558', 'RSM3860', 'RSM4167', 'RSM2699', 'RSM0990', 'RSM2515', 'RSM3451', 'RSM0338', 'RSM4551', 'RSM0142', 'RSM0545', 'RSM3871', 'RSM2626', 'RSM1724', 'RSM4199', 'RSM4530', 'RSM0123', 'RSM0363', 'RSM1460', 'RSM1071', 'RSM1420', 'RSM2324', 'RSM2631', 'RSM1363', 'RSM0456', 'RSM0250', 'RSM3305', 'RSM0211', 'RSM0092', 'RSM0753', 'RSM0929'

Manual inspection reveals many Ag-N compounds, which look "strange" in the ball-and-stick view in VESTA. This is probably also often due to charge issues, e.g. RSM0929 or RSM0092; in the latter, it would also have been easy to find out from the original file in the CSD, as there is perchlorate. DUWRIY also misses a charge. RSM0456 is also wrong: the CSD entry has a charged carboxy in the linker. Others certainly have floating molecules, e.g. RSM0149. RSM1706 is also funny, and RSM3853 also looks weird after optimization (probably due to charge issues).

Overall, there might be some false positives, but none of the structures I opened looked trivially correct to me at first glance.

ElMouba commented 1 year ago

The RSM folder contains maybe 3400 structures, and flagging around 10% of them is still good, right? That would mean about 90% of the structures are still in "good shape". I need to check which of those structures we have used so far, because we haven't generated the isotherms for all RSM structures yet.

kjappelbaum commented 1 year ago

Well, I have only manually inspected two types of checks so far. If you look at all flags, there are 447 flagged structures among the 3152 I ran this on. And "good" and "bad" are in the eye of the beholder :) I am using this issue to go over the remaining checks and to see whether we can expect many false positives there.
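
For reference, a minimal sketch of how per-check counts and the overall flagged fraction could be tallied. It assumes the check results are already available as a plain dictionary mapping structure name to `{check_name: bool}`; that layout, the helper `tally_flags`, and the example entries are hypothetical and are not the mofchecker output format or real RSM results. Only the check names and the 447 / 3152 figures come from this thread.

```python
from collections import Counter


def tally_flags(results, flagging_checks):
    """Count flags per check and collect the names of flagged structures.

    results: {structure_name: {check_name: bool}} (hypothetical layout)
    flagging_checks: checks for which True means "suspicious"
    """
    per_check = Counter()
    flagged = set()
    for name, checks in results.items():
        for check, value in checks.items():
            if check in flagging_checks and value:
                per_check[check] += 1
                flagged.add(name)
    return per_check, flagged


# Made-up example data, only to show the shape of the input.
results = {
    "example_A": {"has_lone_molecule": True, "has_undercoordinated_alkali_alkaline": False},
    "example_B": {"has_lone_molecule": False, "has_undercoordinated_alkali_alkaline": False},
    "example_C": {"has_lone_molecule": True, "has_undercoordinated_alkali_alkaline": True},
}
flagging_checks = {"has_lone_molecule", "has_undercoordinated_alkali_alkaline"}

per_check, flagged = tally_flags(results, flagging_checks)
print(dict(per_check), sorted(flagged))

# Overall fraction using the numbers quoted in this thread.
fraction_flagged = 447 / 3152
print(f"{fraction_flagged:.1%} of the structures carry at least one flag")
```

A structure that trips several checks is counted once in `flagged` but once per check in `per_check`, so the per-check counts can add up to more than the number of flagged structures.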

ElMouba commented 1 year ago

Ohhh I see. Let me then help with the rest and see what we get

kjappelbaum commented 1 year ago

Manually identified charged MOFs