Materials-Consortia / OPTIMADE

Specification of a common REST API for access to materials databases
Creative Commons Attribution 4.0 International

Reaching out to other databases #124

Open rartino opened 5 years ago

rartino commented 5 years ago

from @ml-evs: Please add a comment with any new suggestions, but also feel free to edit this table

| Database | Contacted | Positive response? |
|---|---|---|
| The Electronic Structure Project | :negative_squared_cross_mark: | :question: |
| NREL MatDb | :heavy_check_mark: | :question: |
| SUNCAT Catalysis Hub | :negative_squared_cross_mark: | :question: |
| Computational Materials Repository (CMR) | :heavy_check_mark: | :heavy_check_mark: |
| High-throughput Experimental Materials Database (HTEM) | :heavy_check_mark: | :heavy_check_mark: |
| MatNavi | :negative_squared_cross_mark: | :question: |
| Clean Energy Project (CEPDB) (now offline) | :negative_squared_cross_mark: | :question: |
| Topological Quantum Chemistry Database | :heavy_check_mark: | :question: |
| Organic Materials Database (OMDB) | :heavy_check_mark: | :question: |
| JARVIS | :heavy_check_mark: | |
| Hybrid3 | :heavy_check_mark: | :heavy_check_mark: |
| 2DMatpedia | :heavy_check_mark: | |
| QMOF | :heavy_check_mark: | :heavy_check_mark: |
| Open Catalyst Project | :heavy_check_mark: | :question: |
| Carolina MatDB & | :heavy_check_mark: | :question: |
| AMCSD | :heavy_check_mark: | :question: |
| CMU Alloy Database | :heavy_check_mark: | :question: |
| f-electron structure database | :negative_squared_cross_mark: | :question: |
| Matgen | :heavy_check_mark: | :question: |
| MIP-3d | :negative_squared_cross_mark: | :question: |
| OCELOT | :heavy_check_mark: | :question: |
| Hypothetical Zeolite Database | :heavy_check_mark: | |
| Matterverse | :heavy_check_mark: | :heavy_check_mark: |
| ~Google Brain's top secret DFT database~ Deepmind's Gnome | :heavy_check_mark: | (currently hosted at |
| ditto Microsoft Research | :heavy_check_mark: | |
| Toyota's CAMD dataset | :heavy_check_mark: | (hosted at |
| Alexandria from Miguel Marques' group | :heavy_check_mark: | (hosted at |
| GW-BSE database | :heavy_check_mark: | :negative_squared_cross_mark: |

From discussions with @ctoher: we thought it would be good to collect a list of other databases to reach out to, to check whether they are interested in implementing OPTiMaDe.

Here is a list (some of these were pointed out to me by Lauri Himanen at Aalto University).

More data-set oriented

More experimentally oriented

Unknown content or status

dwinston commented 5 years ago

Topological Materials Database, 2017-2019.

blokhin commented 5 years ago

ml-evs commented 4 years ago

Organic Materials Database

ml-evs commented 4 years ago


ml-evs commented 4 years ago

I've just come across Hybrid3 on Twitter, a database for organic/inorganic perovskites out of Duke; will mention OPTIMADE to them.

ml-evs commented 3 years ago

Carolina Materials Database, a new (2020?) database of ternary & quaternary crystal structures predicted with a generative neural net.

Zhao et al. "High-throughput discovery of novel cubic crystal materials using deep generative neural networks" arXiv:2102.01880

ml-evs commented 3 years ago

Open Catalyst Project from CMU & Facebook AI. Big datasets of molecule+surface guided relaxations & MD. Could be a useful case study when considering adding trajectories...

Chanussot et al., "The Open Catalyst 2020 (OC20) Dataset and Community Challenges" arXiv:2010.09990

ml-evs commented 3 years ago

Quantum MOF database containing ~15,000 electronic structure calculations on MOFs, currently provided as an archive on figshare. This might be an example of the kind of dataset we discussed in the last meeting, where hosting a public API is prohibited by technical or resource constraints. Dare we go down the OPTIMADE-as-a-service route? :grin:

Rosen et al., "Machine learning the quantum-chemical properties of metal–organic frameworks for accelerated materials discovery", Matter (2021) 10.1016/j.matt.2021.02.015

blokhin commented 3 years ago

DFT+U for transition metal rutile dioxides: a substantial amount of raw calculations, simply hosted on GitHub.

(I'm putting it here to raise a discussion, what are the "other databases", also in connection to the Optimade-as-a-service mentioned by @ml-evs)

ml-evs commented 3 years ago

> (I'm putting it here to raise a discussion, what are the "other databases", also in connection to the Optimade-as-a-service mentioned by @ml-evs)

An interesting point that @shyamd made in the past was "what happens if all these databases suddenly get the same load as the Materials Project"? This could be a useful way of framing the discussion at the next workshop. I think we really have to consider the science case we imagine for querying all OPTIMADE instances at once.
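To make the load question concrete, here is a minimal sketch of how one user query fans out into one request per provider. The base URLs are hypothetical placeholders and `fan_out` is an illustrative helper, not part of any existing client; only the `/v1/structures` endpoint and the `filter` and `page_limit` query parameters come from the OPTIMADE specification. Nothing is sent over the network here; the code only builds the URLs.

```python
from urllib.parse import quote, urlencode

# Hypothetical provider base URLs, for illustration only (not real deployments).
PROVIDER_BASE_URLS = [
    "https://optimade.example.org",
    "https://db-a.example.org/optimade",
    "https://db-b.example.org/optimade",
]


def fan_out(filter_string, base_urls, page_limit=10):
    """Build one /v1/structures query URL per provider for the same filter."""
    query = urlencode(
        {"filter": filter_string, "page_limit": page_limit},
        quote_via=quote,  # percent-encode spaces as %20 rather than '+'
    )
    return [f"{base.rstrip('/')}/v1/structures?{query}" for base in base_urls]


urls = fan_out('elements HAS ALL "Ti","O"', PROVIDER_BASE_URLS)
for url in urls:
    print(url)
```

Every additional registered provider adds one more request per user query (plus any pagination), which is the scaling behaviour the question above is getting at.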

ml-evs commented 3 years ago

As I mentioned in the last meeting, I like the idea of a datasette-like tool for OPTIMADE that allows for local exploration and filtering of materials datasets without needing the data owner to provide an API (they would of course need to provide the data in a supported format). Would be fun to hack on something like this at the workshop. You can then bring your favourite local OPTIMADE client to the show for rich filtering. This could then naturally become the tool for data providers to do one-click OPTIMADE provider registration (or even deployment?), though not sure that would be worthwhile yet...
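As a rough illustration of the "local filtering without an API" idea, here is a toy sketch that filters in-memory records shaped like the attributes of OPTIMADE structure entries. The records and the `filter_structures` helper are made up for this example, and the function is a hand-written stand-in for a single filter of the form `elements HAS ALL "Ti","O" AND nelements<=2`; it does not parse the OPTIMADE filter grammar.

```python
# Hypothetical local records; the keys mirror OPTIMADE structure properties
# (elements, nelements, chemical_formula_reduced), the values are made up.
LOCAL_RECORDS = [
    {"id": "a-1", "chemical_formula_reduced": "O2Ti", "elements": ["O", "Ti"], "nelements": 2},
    {"id": "a-2", "chemical_formula_reduced": "BaO3Ti", "elements": ["Ba", "O", "Ti"], "nelements": 3},
    {"id": "a-3", "chemical_formula_reduced": "ClNa", "elements": ["Cl", "Na"], "nelements": 2},
]


def filter_structures(records, required_elements, max_nelements=None):
    """Keep records that contain all required elements, optionally capped by nelements."""
    required = set(required_elements)
    selected = []
    for record in records:
        if not required <= set(record["elements"]):
            continue  # at least one required element is missing
        if max_nelements is not None and record["nelements"] > max_nelements:
            continue  # too many distinct elements
        selected.append(record)
    return selected


titanium_oxides = filter_structures(LOCAL_RECORDS, ["Ti", "O"])
binary_only = filter_structures(LOCAL_RECORDS, ["Ti", "O"], max_nelements=2)
print([r["id"] for r in titanium_oxides])
print([r["id"] for r in binary_only])
```

A real tool would presumably reuse an existing filter parser and a proper data store rather than a Python list, but the data owner's side stays the same: provide records in a supported format and nothing else.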

ml-evs commented 3 years ago

I just came across Matgen, a set of databases from a large Chinese academic consortium, via this paper:

He, B., Chi, S., Ye, A. et al. High-throughput screening platform for solid electrolytes combining hierarchical ion-transport prediction algorithms. Sci Data 7, 151 (2020). 10.1038/s41597-020-0474-y

ml-evs commented 3 years ago

CMU Alloy Database, out of the group of Prof. Michael Widom. The last update to the website/data was in 2011, but it is still alive. This could be a great candidate for quickly spinning up an OPTIMADE API.

ml-evs commented 2 years ago

American Mineralogist Crystal Structure Database (down at the moment, but available via Wayback Machine). Crystal structures of all minerals published across various mineralogy journals, grouped by mineral name.

ml-evs commented 2 years ago

f-electron structure database (FESD). Contains LAPW DFT calcs on known lanthanide/actinide-containing crystal structures, possibly also novel/predicted structures (preprint from 2017 says available soon), and possibly also DFT+DMFT calculations.

blokhin commented 2 years ago

> American Mineralogist Crystal Structure Database

Presumably related to the RRUFF project somehow.

ml-evs commented 2 years ago

MIP-3d: database focused on thermoelectric properties of known structures

Yao, M., Wang, Y., Li, X. et al. Materials informatics platform with three dimensional structures, workflow and thermoelectric applications. Sci Data 8, 236 (2021).

merkys commented 2 years ago

> American Mineralogist Crystal Structure Database (down at the moment, but available via Wayback Machine). Crystal structures of all minerals published across various mineralogy journals, grouped by mineral name.

COD ingests AMCSD from time to time.

ml-evs commented 2 years ago

Organic Crystals in Electronic and Light-Oriented Technologies (OCELOT)

ml-evs commented 2 years ago

(website currently down but there is a preprint: "A Materials Informatics Web App Platform for Materials Discovery and Survey of State-of-the-Art" arXiv:2109.04007) (overlapping devs with Carolina MatDB above)

ml-evs commented 2 years ago

As this list grows, I think it makes sense to collect a table of who we have actually contacted in the top comment... here is a draft below, feel free to suggest changes if you know that a database knows about OPTIMADE or is interested. I have ticked the ones that have attended workshops or that I have contacted personally.

(moved to top comment)

ml-evs commented 2 years ago

ACCDB - think this has been mentioned in the past but I couldn't find it. A large number of static databases of crystal/molecular geometries used for benchmarking new methods. Could be a nice test bed for both necroptimade and the new properties format.

ml-evs commented 1 year ago

Hypothetical Zeolite Database: 4,450,542 zeolite structures with energies and topologies, going back 30+ years - not sure how much longer it will last so we should try to help out!

ml-evs commented 1 year ago

Have we been in touch with the Open ForceField consortium as part of the trajectories work? Looking in particular at project 2, which could be a good collaboration (though maybe only the COD data is relevant to them!)

ml-evs commented 1 year ago

merkys commented 1 year ago

PubChem, database of chemical molecules.

ml-evs commented 1 year ago

I've just updated the table with a few more DBs I've contacted this year. If anyone has contacts at any of the remaining :question: marks, please feel free to reach out to them...

JPBergsma commented 11 months ago

Perhaps this is still an interesting database

ml-evs commented 10 months ago

FFMDFPA: A FAIRification Framework for Materials Data with No-Code Flexible Semi-Structured Parser and Application Programming Interfaces: 10.1021/acs.jcim.3c00836 -- has developed a similar grammar and overlapping functionality with optimade-python-tools, but focused on VASP/Gromacs outputs specifically (I think) -- discusses future integration with OPTIMADE so we should make sure they get invited!

JPBergsma commented 7 months ago

Perhaps the Database of Zeolite Structures could also be added to this list. This database provides structural information on all the Zeolite Framework Types that have been approved by the Structure Commission of the International Zeolite Association (IZA-SC).

It is searchable and includes:

- descriptions and drawings of each framework type
- crystallographic data and simulated powder diffraction patterns for representative materials
- relevant references
- detailed instructions for building models
- measured powder patterns from "Verified Syntheses" (2nd and 3rd edition)
- 29Si MAS NMR spectra for pure silica and aluminosilicate zeolites
- 31P MAS NMR spectra for pure aluminophosphate zeolites
- framework chemical composition for all materials in the database

ml-evs commented 3 months ago

Small database of materials properties computed (at DFT level?) with quantum computing algorithms (I guess the most interesting "properties" here are the QC architectures rather than the materials properties).