Closed ml-evs closed 10 months ago
@ardunn are you still interested in setting up these automated Zenodo archives? Having this file would be a start (the metadata for the first release would be edited versus this file, then we can update it with all benchmark authors going forward). Unfortunately Zenodo archives can currently only have 1 editor (for now: https://github.com/zenodo/zenodo/issues/810).
Hey @ml-evs ! Thanks for the work on this!
Could you explain to me what the advantages of this are compared to just telling people to cite the original paper?
Would it be like, when we release updated or new benchmarks (which we are looking into right now for generative learning and adaptive design) the citations for each would be better organized by version?
I'm just trying to briefly weigh the maintenance effort of adding another account vs the reward of having the citations more neatly accounted for. I'm not against merging this in at all, just wanted a bit more to clarify the advantages
> Could you explain to me what the advantages of this are compared to just telling people to cite the original paper?
A couple of advantages off the top of my head, as discussed in #156. People can cite specific versions of the leaderboard, so when they compare algorithms you can immediately see what was on the leaderboard at that time (and how out of date the citation is). I've seen a few papers that use matbench by just comparing to the initial published "leaderboard" (before we even discussed setting up this web version!). It should also incentivize people to add benchmarks pre-publication, as they can then cite the benchmark version that includes their own data.
> Would it be like, when we release updated or new benchmarks (which we are looking into right now for generative learning and adaptive design) the citations for each would be better organized by version?
Yep, as the benchmarks themselves evolve this would also capture that.
> I'm just trying to briefly weigh the maintenance effort of adding another account vs the reward of having the citations more neatly accounted for. I'm not against merging this in at all, just wanted a bit more to clarify the advantages
Understood! Once it is set up, it should be pretty painless. We could easily forgo adjusting the citation file as new models are submitted, at which point this just becomes an automated archival of the repo to Zenodo every time you hit release. Happy to chip in with maintenance and future dev if there are other volunteers, but I understand if this project isn't a priority anymore!
@ml-evs Actually, I am tentatively handing this project off to @hrushikesh-s , a new graduate student in our group. He's looking to work on some new methods of benchmarking alternative paradigms of materials discovery such as adaptive design and generative models, but he's also getting up to speed with maintaining the repo... so this PR might be a useful way for him to dip his toes in. @hrushikesh-s , do you have any thoughts on this?
re: #156.
Just added @ardunn as the author for this repo now. Zenodo has a separate "contributors" field, but I'm not sure how this maps to the CFF schema (probably just as additional authors - not sure how you would feel about that!). The main goal from #156 is to automatically archive releases to Zenodo. This file will automatically be used by Zenodo to populate the metadata; otherwise, for each release, Zenodo will scrape the repo metadata, include all contributors as authors, miss the related DOIs of the paper, and drop any additional info you want to provide (ORCID, custom titles, etc.).
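For concreteness, a minimal CITATION.cff along these lines might look like the sketch below. All field values here are illustrative placeholders (names, DOIs, version), not the actual metadata from this PR:

```yaml
# Illustrative CITATION.cff sketch -- every value below is a placeholder.
cff-version: 1.2.0
message: "If you use this benchmark, please cite it as below."
title: "Matbench"
authors:
  - family-names: "Dunn"
    given-names: "Alexander"
    # orcid: "https://orcid.org/0000-0000-0000-0000"  # placeholder
version: 0.0.0          # placeholder
doi: 10.5281/zenodo.0000000   # placeholder concept DOI covering all versions
identifiers:
  - type: doi
    value: 10.0000/placeholder-paper-doi   # the original paper (placeholder)
    description: "The paper introducing the benchmark"
```

Zenodo reads this file on each archived release instead of scraping the repo metadata, which is how the ORCID, related-paper DOI, and custom author list survive.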
Outstanding issue: there's no satisfactory way to automatically update CITATION.cff with the latest versioned Zenodo DOI. We can either use the "concept DOI" (which covers all versions) in the citation file and describe in the docs how to cite a specific versioned DOI, or set up some additional procedure that uses the Zenodo API to push one commit after each release that bumps the versioned DOI.
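The second option could be sketched roughly as below. This is only a sketch: the record ID is a placeholder, and it assumes (worth verifying against the Zenodo API docs) that requesting a concept record from the public records endpoint resolves to its newest version:

```python
import json
import re
from urllib.request import urlopen


def latest_versioned_doi(concept_recid: str) -> str:
    """Fetch the versioned DOI of the newest release for a concept record.

    Assumption (not verified here): GET /api/records/<conceptrecid> on
    Zenodo resolves to the latest version of that record.
    """
    with urlopen(f"https://zenodo.org/api/records/{concept_recid}") as resp:
        return json.load(resp)["doi"]


def bump_citation_doi(cff_path: str, new_doi: str) -> None:
    """Rewrite the top-level `doi:` field of a CITATION.cff file in place."""
    with open(cff_path) as fh:
        text = fh.read()
    # Replace the whole `doi:` line; CFF keeps it at the top level.
    text = re.sub(r"(?m)^doi: .*$", f"doi: {new_doi}", text)
    with open(cff_path, "w") as fh:
        fh.write(text)


# In a post-release CI step (hypothetical wiring):
#   doi = latest_versioned_doi("1234567")   # placeholder record ID
#   bump_citation_doi("CITATION.cff", doi)
#   ...then commit and push the updated file.
```

The trade-off is the extra commit after every release, which is why just pointing the file at the concept DOI may be the simpler long-term choice.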