ropensci / webchem

Chemical Information from the Web
https://docs.ropensci.org/webchem

nist #29

Open eduardszoecs opened 9 years ago

eduardszoecs commented 9 years ago

http://webbook.nist.gov/chemistry/cas-ser.html

eduardszoecs commented 6 years ago

Related https://github.com/ropensci/webchem/pull/154

Aariq commented 4 years ago

  1. Feasibility. There is no API. Scraping is not explicitly disallowed, although some features (e.g. mass spectra) are, I believe, available in paid software, so we might not want to scrape everything. Most data are presented in tables, which makes scraping fairly easy.

  2. Scope. There is a ton of information, but most is experimental chemical properties with citations. Not all datasets exist for all compounds. Examples of properties:

  3. Overlap. It certainly provides at least some properties not found in the other databases in webchem. My suggestion would be to treat each property type as an individual database and not try to integrate more than one feature at a time. nist_ri() is already implemented. The reaction search is probably the most distinctive feature, so that might be something to work on next?
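
To illustrate why tabular pages make scraping fairly easy, here is a minimal sketch in Python (webchem itself is an R package, so this is only an illustration of the idea, not a proposed implementation). It uses only the standard library. `SAMPLE_HTML` is a made-up placeholder table, not actual NIST WebBook markup, and `TableParser` and `parse_table` are hypothetical names.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of every <tr> in an HTML document as a list of rows."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None outside <tr>
        self._cell = None   # text fragments of the current cell, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(html):
    """Return one dict per body row, keyed by the text of the header row."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Placeholder markup with placeholder values (an assumption, not real data).
SAMPLE_HTML = """
<table>
  <tr><th>Quantity</th><th>Value</th><th>Units</th><th>Reference</th></tr>
  <tr><td>property_a</td><td>1.0</td><td>unit_a</td><td>ref 1</td></tr>
  <tr><td>property_b</td><td>2.0</td><td>unit_b</td><td>ref 2</td></tr>
</table>
"""

records = parse_table(SAMPLE_HTML)
for record in records:
    print(record)
```

Keeping the reference column next to each value preserves the citation that comes with every number. Real pages would need per-property handling, since not all datasets exist for all compounds.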

stitam commented 4 years ago

From the traceability perspective it's great that they have references to original publications. Ideally every single number in every database should be traceable to the original publication.

I agree with treating each property separately; we can always create an integrator function later to reduce the number of exported functions. Reactions (similarly to the QSAR models discussed earlier) open up the scope of the package quite a bit. I am OK with it; @andschar, what do you think?

Given that there is no API, I think it would be best to ask for explicit approval to be safe. I will contact NIST about this.
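
The "one function per property, integrator later" idea discussed above could look roughly like the following sketch. It is written in Python purely for illustration (webchem is an R package), and every name in it (`fetch_property_a`, `fetch_property_b`, `PROPERTY_FETCHERS`, `fetch_properties`) is hypothetical. The per-property functions return canned placeholder rows instead of contacting any website.

```python
def fetch_property_a(query):
    """Hypothetical per-property function; returns a canned placeholder row."""
    return [{"query": query, "property": "property_a", "value": None}]


def fetch_property_b(query):
    """Second hypothetical per-property function, independent of the first."""
    return [{"query": query, "property": "property_b", "value": None}]


# Registry mapping a property name to the function that retrieves it.
PROPERTY_FETCHERS = {
    "property_a": fetch_property_a,
    "property_b": fetch_property_b,
}


def fetch_properties(query, properties):
    """Integrator: call each requested per-property function and pool the rows."""
    unknown = [name for name in properties if name not in PROPERTY_FETCHERS]
    if unknown:
        raise ValueError(f"unsupported properties: {unknown}")
    rows = []
    for name in properties:
        rows.extend(PROPERTY_FETCHERS[name](query))
    return rows


rows = fetch_properties("example-id", ["property_a", "property_b"])
print(rows)
```

With this layout each property can be developed and tested on its own, and only the integrator needs to be exported once several properties exist.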