Open eduardszoecs opened 9 years ago
Feasibility. There is no API. Scraping is not explicitly disallowed, although some features (e.g. mass spectra) are, I believe, available in paid software, so we might not want to scrape everything. Most data is presented in tables, which makes scraping fairly easy.
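To illustrate why tabular pages are fairly easy to scrape, here is a minimal sketch in Python (not the package's R code) that turns an HTML table into a list of records using only the standard library. The sample markup, column names, and values are placeholders made up for the example and are not taken from the WebBook; `TableParser` and `parse_table` are hypothetical names.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of every <tr> in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row being built, or None outside <tr>
        self._cell = None  # text pieces of the current cell, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any cell left unclosed in this row


def parse_table(html):
    """Return one dict per data row, keyed by the header row's cell text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Placeholder markup: the structure and values are invented for illustration.
SAMPLE_HTML = """
<table>
  <tr><th>Quantity</th><th>Value</th><th>Units</th><th>Reference</th></tr>
  <tr><td>property_a</td><td>1.0</td><td>unit_a</td><td>Author A, 1990</td></tr>
  <tr><td>property_b</td><td>2.5</td><td>unit_b</td><td>Author B, 2001</td></tr>
</table>
"""

records = parse_table(SAMPLE_HTML)
for record in records:
    print(record)
```

Keeping the reference column next to each value is what preserves the link back to the original publication.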
Scope. There is a ton of information, but most is experimental chemical properties with citations. Not all datasets exist for all compounds. Examples of properties:
nist_ri() is already implemented. The reaction search is probably the most distinctive feature, so that might be something to work on next?
From the traceability perspective it's great that they have references to original publications. Ideally every single number in every database should be traceable to the original publication.
I agree to treat each property separately; we can always create an integrator function later to reduce the number of exported functions. Reactions (similarly to the QSAR models discussed earlier) open up the scope of the package quite a bit. I am OK with that; @andschar, what do you think?
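One way the "separate functions now, integrator later" idea could look, sketched in Python rather than the package's R: each property gets its own fetcher, and a single dispatcher calls whichever ones are requested. All names here (`fetch_retention_index`, `fetch_reactions`, `fetch_properties`) are hypothetical, and the fetchers are stubs that return placeholder rows instead of scraping anything.

```python
def fetch_retention_index(cas):
    """Hypothetical per-property fetcher; a real one would scrape the page."""
    return [{"cas": cas, "property": "retention_index", "value": None, "reference": None}]


def fetch_reactions(cas):
    """Hypothetical per-property fetcher for reaction data (stub)."""
    return [{"cas": cas, "property": "reactions", "value": None, "reference": None}]


# Registry mapping a property name to the function that retrieves it.
PROPERTY_FETCHERS = {
    "retention_index": fetch_retention_index,
    "reactions": fetch_reactions,
}


def fetch_properties(cas, properties=None):
    """Integrator: call the requested per-property fetchers for one compound.

    With properties=None, every registered fetcher is called.
    """
    names = list(PROPERTY_FETCHERS) if properties is None else list(properties)
    unknown = [name for name in names if name not in PROPERTY_FETCHERS]
    if unknown:
        raise ValueError(f"unknown properties: {unknown}")
    return {name: PROPERTY_FETCHERS[name](cas) for name in names}


result = fetch_properties("50-00-0", ["reactions"])
print(result)
```

With this layout only the integrator would need to be exported, and adding a property means adding one function and one registry entry.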
Given that there is no API, I think it would be best to ask for explicit approval to be safe. I will contact NIST about this.
http://webbook.nist.gov/chemistry/cas-ser.html