open-reaction-database / ord-data

Official data repository for the Open Reaction Database
https://open-reaction-database.org
Creative Commons Attribution Share Alike 4.0 International

Uploading dataset pbtxt for 24Golden paper Fig 7 #194

Closed spencerheins closed 3 months ago

spencerheins commented 3 months ago

This is an enumerated dataset of 288 reactions from the paper https://doi.org/10.1016/j.chempr.2024.04.001 (Figure 7) created using the interactive web editor. Template reaction pbtxt and css are attached.

Archive.zip

bdeadman commented 3 months ago

Thanks @spencerheins. I'll review this and get back to you tonight or tomorrow.

bdeadman commented 3 months ago

Hi @spencerheins. This is a well-put-together dataset. I've made a few minor changes (see attached), which I will summarise below. Please move the replacement dataset into the data folder and I can then approve the merge.

spencerheins.zip

Changes made:

Note that I couldn't access the paper or the SI since they are behind a paywall. The dataset is, however, consistent with the abstract and my general synthesis knowledge. If you want me to provide a more thorough check, please send me (by email) a copy of the paper and/or SI for reference.

spencerheins commented 3 months ago

Hi @bdeadman,

Thanks for reviewing the data. Your changes sound good to me. Can you clarify what you mean by moving the replacement dataset to inside the data folder? I am unsure how to do this. Is this another pull request, or do I place the zip file into the data folder under this new branch?

Thank you! Spencer

bdeadman commented 3 months ago

Hi @spencerheins. You need to move the .pbtxt dataset file into the ord-data/data folder in the spencerheins:24Golden_Fig7_submission branch. The rest of the contents in the zipped folder do not get included in the ord-data repo, but we include them in the pull request commentary here.

I'm a bit unsure what happens if you move this file while you have this pull request open, but I'm keen to find out. Once the new dataset is in and the old dataset is out, that commit will show up in your GitHub account, and I think you should be able to pull it through as an update to this pull request. If it helps, we can have a quick call to solve it together.

Alternatively, you or I can cancel this pull request, and make a new branch and pull request. If you are busy I can get it done from my fork of ord-data, or if you give me some edit rights on your ord-data fork then I could help pull it through from there.

Also, can you please edit your comment above to remove the folder with the paper and SI? I've got it now, but we probably shouldn't store a copy of copyrighted material here in a public forum.

bdeadman commented 3 months ago

@spencerheins don't worry about the paper and SI. I was able to remove it from my account.

skearnes commented 3 months ago

Moving files around is fine, and it doesn't matter which folder you put a new dataset in: everything is diffed against the main branch after each commit.


spencerheins commented 3 months ago

@bdeadman - the new .pbtxt file has been deposited into the ord-data/data folder in the spencerheins:24Golden_Fig7_submission branch and a pull request created. Please let me know if this is correct (or not).

Thank you.

bdeadman commented 3 months ago

@spencerheins I can confirm that the new .pbtxt file is in this pull request. We'll need you to delete your old dataset file as well.

spencerheins commented 3 months ago

Thanks @bdeadman. Old dataset is deleted (sorry I missed that). Thanks for all the assistance.

bdeadman commented 3 months ago

No problem @spencerheins, we want your data, so assistance is limitless!

However, I can still see the dataset in your branch. I think it may not have been committed after you deleted the file. I expect the commit list to show a 3rd entry when this change is committed.
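For readers following along from a local clone, the swap discussed in this thread (remove the old dataset file, add the replacement under data/, then commit and push so the pull request updates) can be sketched as below. This is only an illustration: the file names are hypothetical, the remote is assumed to be called `origin`, and the same result can be reached through the GitHub web interface. The function only builds and prints the commands; it does not run them.

```python
import shlex
from typing import List


def replacement_commands(
    branch: str, old_file: str, new_file: str, message: str
) -> List[List[str]]:
    """Build the git commands that swap one dataset file for another.

    The remote name "origin" is an assumption; adjust it to match your clone.
    """
    return [
        ["git", "checkout", branch],       # switch to the pull request branch
        ["git", "rm", old_file],           # stage the deletion of the old dataset
        ["git", "add", new_file],          # stage the replacement dataset
        ["git", "commit", "-m", message],  # the deletion only counts once committed
        ["git", "push", "origin", branch], # pushing updates the open pull request
    ]


if __name__ == "__main__":
    # File names below are hypothetical placeholders, not the real dataset files.
    commands = replacement_commands(
        branch="24Golden_Fig7_submission",
        old_file="old_dataset.pbtxt",
        new_file="data/new_dataset.pbtxt",
        message="Replace dataset with reviewed version",
    )
    for command in commands:
        print(shlex.join(command))
```

Deleting a file in the working tree without committing leaves the branch unchanged, which matches the symptom described above: the deletion appears as a new commit only after the commit and push steps.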

bdeadman commented 3 months ago

Thanks @spencerheins. The dataset has been merged into a branch on ORD, and is now going through the automated workflows to prepare it for merge into main. You can follow the progress on #197.

Once the dataset is in the main branch, we'll still wait until it appears in the online browser. At that point we'll announce it on LinkedIn.

spencerheins commented 3 months ago

Great to hear, @bdeadman! Will "#197" be the official ORD submission number?

bdeadman commented 3 months ago

Depends how you want to use 'submission number'. A more appropriate label might be the dataset ID and location, which will be data/a1/ord_dataset-a12fa15d036d489c971b0b514caeae52.pb.gz.

When it goes live on the ORD online browser (not immediate), you will be able to search for it with ord_dataset-a12fa15d036d489c971b0b514caeae52.
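The relationship between the dataset ID and its location can be sketched as follows. This assumes, based only on the single example in this thread, that the subfolder under data/ is the first two hexadecimal characters of the ID and that the file extension is `.pb.gz`; the helper name `dataset_path` is hypothetical and not part of any ORD tooling.

```python
import re

# Assumed ID format: "ord_dataset-" followed by 32 lowercase hex characters,
# as in the example ID quoted in this thread.
_DATASET_ID = re.compile(r"ord_dataset-[0-9a-f]{32}")


def dataset_path(dataset_id: str) -> str:
    """Return the assumed repository path for a dataset ID."""
    if not _DATASET_ID.fullmatch(dataset_id):
        raise ValueError(f"unexpected dataset ID format: {dataset_id!r}")
    hex_part = dataset_id[len("ord_dataset-"):]
    # Assumption: the subfolder is the first two hex characters of the ID.
    return f"data/{hex_part[:2]}/{dataset_id}.pb.gz"


if __name__ == "__main__":
    print(dataset_path("ord_dataset-a12fa15d036d489c971b0b514caeae52"))
    # prints data/a1/ord_dataset-a12fa15d036d489c971b0b514caeae52.pb.gz
```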