open-reaction-database / ord-data

Official data repository for the Open Reaction Database
https://open-reaction-database.org
Creative Commons Attribution Share Alike 4.0 International

Add hein-dataset.pbtxt #86

Closed skearnes closed 3 years ago

skearnes commented 3 years ago

Multi-step reactions from https://pubs.rsc.org/en/content/articlelanding/2019/re/c9re00086k. Measurements of the reaction composition were taken over time and are each encoded as a separate outcome. I used two identical analyses to distinguish between online and offline measurements.
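
The encoding described above could be sketched roughly as follows in plain Python. This is an illustration only: the `Outcome` and `Measurement` classes, the `build_outcomes` helper, the `online`/`offline` analysis keys, the placeholder analysis type, and the sample values are all hypothetical stand-ins, not the ord-schema API and not data from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple

# Placeholder: the actual analysis type used in the dataset is not stated here.
ANALYSIS_TYPE = "UNSPECIFIED"


@dataclass
class Measurement:
    analysis_key: str  # which analysis ("online" or "offline") produced the value
    species: str
    value: float


@dataclass
class Outcome:
    reaction_time_min: float
    analyses: Dict[str, str] = field(default_factory=dict)
    measurements: List[Measurement] = field(default_factory=list)


# (time in minutes, source, species, measured value); all values are made up.
Sample = Tuple[float, str, str, float]


def build_outcomes(samples: Iterable[Sample]) -> List[Outcome]:
    """Group time-course samples into one outcome per time point."""
    by_time: Dict[float, Outcome] = {}
    for time_min, source, species, value in samples:
        if time_min not in by_time:
            # Two identical analyses; only the key tells online from offline.
            by_time[time_min] = Outcome(
                reaction_time_min=time_min,
                analyses={"online": ANALYSIS_TYPE, "offline": ANALYSIS_TYPE},
            )
        by_time[time_min].measurements.append(Measurement(source, species, value))
    return [by_time[t] for t in sorted(by_time)]


samples = [
    (0.0, "online", "A", 0.10),
    (0.0, "offline", "A", 0.11),
    (5.0, "online", "A", 0.07),
]
outcomes = build_outcomes(samples)
```

Because the two analyses are otherwise identical, each measurement's `analysis_key` is the only thing that records whether it came from the online or the offline channel.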

Supporting files: hein-submission.zip (updated)

skearnes commented 3 years ago

Thanks Connor for the careful review! I made many of the requested changes and added comments below for the others.

  • [ ] - Synthesis of intermediate 2 looks great; it’s always fun to see when the full level of detail can be captured by the schema. I infer that the yield was measured by isolated weight (although the weight isn’t reported) — perhaps it is worth adding a WEIGHT analysis?

To what extent do we want to add inferred steps that are not explicit in the methods? I'm happy to do this, but I want to make sure there's value in having an empty WEIGHT section.

  • [ ] - For the saturated NaCl solution, I assume you just looked up the solubility? I’ve done that before, too. This is one area where perhaps the requirement to define amounts works against us; this isn’t an uncommon scenario. I don’t have a recommendation for this PR, but it is something for us to consider in general.

Yes, I just looked it up; I agree it's not ideal and it might be better to do something like describe it in the details, but then we run into the validations again with missing amounts...

  • [ ] - The SI does not seem to say how they isolated 3-(E) from 3-(Z) and/or how the yields were individually quantified. This would be a nice detail to add if we knew.

Agreed; I don't see anything there. There might be some info in https://pubmed.ncbi.nlm.nih.gov/26743694/ (cited in footnote 1 of the SI) but that's behind another paywall for me and not in their SI.

  • [ ] - I would have preferred reporting amounts as masses rather than moles, but I won’t make you change all of them. For the future, masses are much more natural, e.g., for Procedure 1, 13 µmol versus 2.17 mg (presumably the mass was how the experiment was performed).

Thanks; will keep in mind for the future.
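
For reference, converting between the two representations is simple arithmetic. The sketch below reads the quoted amount for Procedure 1 as 13 µmol (the unit prefix is an assumption) and uses hypothetical helper names. It recovers only the molar mass implied by the quoted pair and does not identify the compound.

```python
def mass_mg(amount_umol: float, molar_mass_g_per_mol: float) -> float:
    """Mass in mg from an amount in µmol (µmol * g/mol = µg; /1000 gives mg)."""
    return amount_umol * molar_mass_g_per_mol / 1000.0


def amount_umol(mass_in_mg: float, molar_mass_g_per_mol: float) -> float:
    """Amount in µmol from a mass in mg (mg * 1000 = µg; µg / (g/mol) = µmol)."""
    return mass_in_mg * 1000.0 / molar_mass_g_per_mol


def implied_molar_mass(mass_in_mg: float, amount_in_umol: float) -> float:
    """Molar mass in g/mol implied by a (mass, amount) pair."""
    return mass_in_mg * 1000.0 / amount_in_umol


# Pair quoted in the review comment, assuming the amount is in µmol.
molar_mass = implied_molar_mass(2.17, 13.0)
round_trip_mg = mass_mg(13.0, molar_mass)
```

Storing the mass as weighed and deriving moles from it would keep the record closer to how the experiment was actually performed.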

  • [ ] - I was contemplating whether we should make “Concentration” a structured measurement, but I think the answer is probably not. It’s not like concentration is what is measured directly; it is just a derived property. It’s also very rarely what is measured in these analyses, except for kinetic profiling studies. We might want to discuss this further.

Agreed.

skearnes commented 3 years ago

Here's the updated notebook: hein-submission.zip