Closed editorialbot closed 4 months ago
Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.
For a list of things I can do to help you, just type:
@editorialbot commands
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
@editorialbot generate pdf
Software report:
github.com/AlDanial/cloc v 1.88 T=0.08 s (755.2 files/s, 365581.4 lines/s)
-------------------------------------------------------------------------------
Language            files    blank    comment     code
-------------------------------------------------------------------------------
Python                 26      704       1254     1847
Jupyter Notebook        5        0      21769     1194
Markdown               12      208          0      679
TeX                     6       28          0      378
YAML                    5       21         72      209
INI                     1       11          0       83
Dockerfile              1       15         23       26
make                    1        6          8       15
TOML                    1        1          3        5
JSON                    1        0          0        1
-------------------------------------------------------------------------------
SUM:                   59      994      23129     4437
-------------------------------------------------------------------------------
gitinspector failed to run statistical information for the repository
Wordcount for paper.md is 1122
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.26434/chemrxiv-2022-6l4pm is OK
- 10.1038/s41467-019-10030-5 is OK
- 10.21105/joss.04528 is OK
- 10.1021/acs.jcim.8b00839 is OK
- 10.1038/s43588-022-00349-3 is OK
- 10.1038/s41524-020-00406-3 is OK
- 10.1063/1.4812323 is OK
- 10.1016/j.commatsci.2012.10.028 is OK
- 10.1038/s41598-022-08413-8 is OK
- 10.3389/fphar.2020.565644 is OK
- 10.1016/j.matt.2021.11.032 is OK
- 10.1107/S2056989019016244 is OK
- 10.1038/s41586-019-1335-8 is OK
- 10.1002/advs.202100566 is OK
MISSING DOIs
- None
INVALID DOIs
- None
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@jamesrhester, @ml-evs, @mkhorton, thanks again for reviewing! Just checking in on your review. Please let me know if you have any questions about the process. Feel free to create issues in the project repository directly or write them down as comments here, but please do link the issues in this review so it's easy to follow for everyone. @mkhorton, please go ahead and create your checklist first using the command @editorialbot generate my checklist.
:wave: @jamesrhester , @mkhorton, please update us on how it's going with your reviews when you find the time
Appreciate the reminder @phibeck, thank you -- on my radar, can't believe it's almost been a month already since agreeing to review!
Likewise!
Thank you all for getting the review started! As you work through your checklists, please feel free to comment and ask questions in this thread. You are encouraged to create issues in the repository directly. When you do, please mention openjournals/joss-reviews#5618 so that it creates a link in this thread and we can keep track of it.
Please let me know if you have any questions or if either of you requires some more time.
Hi @phibeck, running through the checklist. Unfortunately I may have a conflict of interest: I have previously been on publications with the second author, Joseph Montoya. We worked together in the same research group until 2018 (i.e. outside the four year window in the COI policy), but the most recent paper we were on together actually only came out in 2021.
If the work was done more than 4 years ago, but the paper appeared later, this is not a conflict for JOSS. The four years refers to the collaborative relationship itself.
Thank you, @danielskatz, for clarifying this question. Sounds like you don't have a COI here, @mkhorton.
Just a heads-up (mostly for @phibeck) that I will restart my review on this, and will continue collecting small things in my old issue at https://github.com/sparks-baird/matbench-genmetrics/issues/80 which wasn't previously linked here.
@jamesrhester, @mkhorton & @ml-evs - could you provide an update on the progress of your review? Thank you!
Getting onto this now. We've just had a big crystallography meeting that took up a lot of my cycles...
As a crystallographer but non-ML specialist, I found the "Statement of Need" lacked context. Line 22 in the paper made no sense to me, which is not good for the first sentence. I would therefore like one or two further sentences added at the beginning explaining how ML uses benchmarks, e.g. "in ML, the result of a prediction is evaluated using benchmarks, which are then used to adjust the ML weights. Typically, a crystal structure ML model has used benchmarks from...." (which might show I have no idea what I'm talking about). I think this will better help readers to quickly determine whether or not the paper is relevant to them.
Once this is done I'm ready to sign off on my review.
Great, thanks for the update, @jamesrhester. @sgbaird, feel free to get started working on the comments and issues linked here by @jamesrhester and @ml-evs. Please update us here in this issue about the progress so we can keep track of the changes.
👋 @sgbaird could you let us know where you stand with responding to the comments of the reviewers?
@mkhorton and @ml-evs, it would be great to get an update from your side as well regarding the remaining points on your checklists. Thank you!
Hi @phibeck, I have been continuing to list my comments at https://github.com/sparks-baird/matbench-genmetrics/issues/80. My main concern, as raised by these comments, is that while "complete", the library does not appear to have been "polished": there are lots of minor errors in the docs and defunct code paths/duplicated template code that seem to be left over from project initialization (I can only find around 700 lines in total of active code in the main package). Coupled with the fact that there are no examples of "serious"/comparative benchmarks, i.e., benchmarking a real generative algorithm, I am struggling to extrapolate the package's performance/functionality from the simple example provided. If my comments in https://github.com/sparks-baird/matbench-genmetrics/issues/80 could be addressed I would have an easier time recommending this for publication, but until then I find it hard to judge.
Just to attempt to give a constructive suggestion, since actually performing generative ML is out of scope for JOSS, perhaps a nice self-referential example would be to use the MP time split subpackage to "benchmark" our last n years of progress in materials discovery as a field, i.e., treat the slow march of time as a generative model. With a bit of fudging (depending on how the time split treats duplicates in the ICSD) I think this can test novelty, uniqueness and validity with more interpretable results than the given random model. Now to just resist trying this out myself!
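The "slow march of time" idea above can be sketched in miniature. Everything in this snippet is invented for illustration (the formulas, years, and cutoff are not real data, and formula equality stands in for the structure matching a real metric would use): structures dated after a cutoff year are treated as the "generated" set and scored against the earlier "training" set.

```python
# Hypothetical sketch: treat structures discovered after a cutoff year as the
# "generated" set and score novelty/uniqueness against the earlier "training" set.
# All (formula, year) records below are invented for illustration.
records = [
    ("NaCl", 1990), ("MgO", 1995), ("TiO2", 2001),
    ("NaCl", 2010),                         # re-discovery: not novel
    ("LiFePO4", 2012), ("LiFePO4", 2015),   # duplicate within the generated set
]

cutoff = 2005
train = {formula for formula, year in records if year <= cutoff}
generated = [formula for formula, year in records if year > cutoff]

# Novelty: fraction of generated entries not already in the training set.
novelty = sum(formula not in train for formula in generated) / len(generated)
# Uniqueness: fraction of distinct entries within the generated set.
uniqueness = len(set(generated)) / len(generated)
print(novelty, uniqueness)
```

As the comment notes, how duplicates in the source database are treated (here, the repeated "NaCl" and "LiFePO4" entries) directly determines the novelty and uniqueness scores.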
All, thank you for working on this!
@jamesrhester thank you for the comment - I will update to clarify.
@ml-evs you raise some great points. I will work on addressing your comments before you dive deeper. I return from an international trip tomorrow, and I'll try to come back to this next week. I really like the idea of the "slow march of time" test 😁
:wave: @sgbaird did you have time to look into the issues?
:wave: @mkhorton how is your review going? Please let me know if you need more time.
@sgbaird could you please provide an update when you plan to address the reviewers comments?
Hi @phibeck, I transitioned from a research role to an administrative one, so I've been very delayed in getting to this, but it's still been on my mind a lot. I will try to find some time soon.
Hi @sgbaird, okay sounds good, thanks. Keep us updated.
@phibeck responding to above, I'd like to request more time if ok
@mkhorton okay sure, I will set a reminder for 10 days from now
@editorialbot remind @mkhorton in ten days
Reminder set for @mkhorton in ten days
:wave: @mkhorton, please update us on how your review is going (this is an automated reminder).
:wave: Hi @sgbaird could we please get an update regarding the issues?
Hi @sgbaird, can you please let me know if you are able to address the issues and comments within the next two weeks?
If you are not able to within that time, we will need to place the submission on hold, so that we can address the substantial number of submissions that need attention. Should you not be able to work on this in the foreseeable future, we will need to withdraw it.
Hi @phibeck, sorry for such a delay.
@jamesrhester
As a crystallographer but non-ML specialist, I found the "Statement of Need" lacked context. Line 22 in the paper made no sense to me, which is not good for the first sentence. I would therefore like one or two further sentences added at the beginning explaining how ML uses benchmarks, e.g. "in ML, the result of a prediction is evaluated using benchmarks, which are then used to adjust the ML weights. Typically, a crystal structure ML model has used benchmarks from...." (which might show I have no idea what I'm talking about). I think this will better help readers to quickly determine whether or not the paper is relevant to them.
Once this is done I'm ready to sign off on my review.
I have updated this section in the text. Please let me know if this looks OK to you now!
@ml-evs I have addressed all but one of the points in https://github.com/sparks-baird/matbench-genmetrics/issues/80. @hasan-sayeed is going to help me with testing out your "slow march of time" suggestion in https://github.com/openjournals/joss-reviews/issues/5618#issuecomment-1728516279, which I've made some preparations for.
@editorialbot generate pdf
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
I'm happy with the updated text, I will sign off on the review now.
I'm happy to sign off on the text too; I'm just checking my remaining review items. This package is motivated by an important need in the community; I am not wholly satisfied that this solution is going to be sufficient as it stands today, but it's a good start and can be improved over time. The package also seems well thought through and structured (my compliments to the authors).
My apologies also for the delay with the review; I will be removing myself from the active reviewer list for a while since I don't want to normalize this sort of wait!
To give additional context for my critical comment, it is not clear to me (a) that space group alone is sufficient to assess validity, or (b) what a good baseline distribution for space group should be. As far as I can tell, assessing the validity of a generated structure is an unsolved challenge and an active research need. Using the Materials Project as a baseline is a good start, but we might need to be more creative to find more comprehensive answers, and other people might reasonably prefer to use different baselines.
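To make the baseline question concrete, here is a toy comparison of space-group frequencies between a "generated" set and a reference set. The space-group numbers below are invented (they are not real Materials Project statistics), and total variation distance is just one of several reasonable choices of distance:

```python
from collections import Counter

# Invented space-group numbers for illustration only (not real MP data).
reference = [225, 225, 194, 62, 62, 62, 139, 1]
generated = [225, 1, 1, 1, 62, 194, 194, 194]

def freq(space_groups):
    """Empirical frequency of each space-group number in a list."""
    n = len(space_groups)
    return {sg: count / n for sg, count in Counter(space_groups).items()}

ref_f, gen_f = freq(reference), freq(generated)

# Total variation distance between the two space-group distributions:
# 0 means identical distributions, 1 means completely disjoint support.
tvd = 0.5 * sum(
    abs(ref_f.get(sg, 0) - gen_f.get(sg, 0)) for sg in set(ref_f) | set(gen_f)
)
print(tvd)
```

The choice of reference set changes `tvd` directly, which is exactly the point about baselines: the same generated distribution can look fine against one baseline and skewed against another.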
For this package to take off, I would also perhaps start actively soliciting suggestions for additional metrics to include. The desire to prevent metric hacking is also very important, and I hope to see more robust solutions here. The idea of including ML-based pre-filters is useful, but is itself dependent on the quality of the ML filter: it could steer an otherwise good model awry, depending on implementation.
Some other points:
- Consider the sphinx-prompt plugin, to make your examples easier to copy-and-paste.
- With MPTSMetrics10 available to import from the __init__.py, I believe the example should look like:
from tqdm import tqdm
from matbench_genmetrics.mp_time_split.utils.gen import DummyGenerator
from matbench_genmetrics.core.metrics import MPTSMetrics10, MPTSMetrics100, MPTSMetrics1000, MPTSMetrics10000

mptm = MPTSMetrics10(dummy=True)
for fold in mptm.folds:
    train_val_inputs = mptm.get_train_and_val_data(fold)
    dg = DummyGenerator()
    dg.fit(train_val_inputs)
    gen_structures = dg.gen(n=mptm.num_gen)
    mptm.evaluate_and_record(fold, gen_structures)
Unfortunately, mptm.evaluate_and_record(fold, gen_structures) seems to return None when I ran this, after printing information on ElementProperty and SiteStatsFingerprint.
- Installing with pip in a clean conda environment: I managed to install it in the end, but a pure pip install does not seem possible on the machine I tested it on (macOS 13.1, Python 3.9).

If we can resolve the installation issue and double-check and verify the code examples in the docs, I'd be happy to sign off.
Regarding the example not working, I raised an issue a while back here.
Thank you @sgbaird for the update, and thanks @jamesrhester and @mkhorton for keeping the momentum going. It sounds like there's a fairly well-defined list of items for @sgbaird to work through before the reviewers recommend acceptance. It would be great if the issues could be addressed within the next three weeks, otherwise please provide an estimate of how much time is needed.
I'm now happy with the state of the repository itself and the issues raised at https://github.com/sparks-baird/matbench-genmetrics/issues/80, with the one exception that I think the package still needs a good motivating example (as per my previous suggestion) -- I think this is already being worked on (and also ties in with what Matt has said above). Another (non-blocking) suggestion (perhaps an easier one than using mp_time_split itself) would be to take the recently published GNoME dataset and compare that to MP via these metrics.
Pending this example case, the text is also good with me, modulo two minor comments
- The matbench-discovery package has also been released (with preprint). I think these two packages nicely complement each other, but I can see people getting confused (similar names, similar aims). Perhaps a sentence or two highlighting the differences would be helpful here?

Thanks @ml-evs for the update! @sgbaird whenever you have a chance to take another look, please do, and let us know if there's anything else stopping you from wrapping things up.
@sgbaird I tried to contact you via the email address listed in your profile. Could you let us know what the status is? In particular, whether you think you will be able to finish working through the issues raised by the reviewers within the next 4 weeks. Thank you.
@phibeck sorry for the radio silence, @hasan-sayeed and I are actively working on the motivating example mentioned. Yes, that timeline is doable for us.
Thank you for the update @sgbaird. Sounds good, please keep us posted.
@ml-evs @sgbaird I've just added the motivating example mentioned above as a notebook file. This script treats materials discovered in the future as generated data and benchmarks the progress of materials discovery over time. It uses five folds; each fold partitions the dataset into distinct training, validation, and generated sets using the mp_time_split subpackage, simulating the incremental advancement of materials science. The result we got is as follows:
{0: {'coverage': 0.03280207561156412,
'novelty': 0.9571905114899926,
'uniqueness': 0.9998874668429091,
'validity': 0.9657295385947532},
1: {'coverage': 0.032987398072646404,
'novelty': 0.948295033358043,
'uniqueness': 0.9999187947547732,
'validity': 0.9689630506461338},
2: {'coverage': 0.019644180874722018,
'novelty': 0.9529280948851001,
'uniqueness': 0.999984404745629,
'validity': 0.942820170335924},
3: {'coverage': 0.02575982209043736,
'novelty': 0.939770200148258,
'uniqueness': 0.9999801452488405,
'validity': 0.8635418248589397}}
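For readers skimming the per-fold numbers, a small snippet can average each metric across folds. It assumes only that the results are a plain dict shaped like the one above (the values here are rounded copies of the reported folds, used purely for illustration):

```python
# Rounded copies of the per-fold results reported above, for illustration.
results = {
    0: {"coverage": 0.0328, "novelty": 0.9572, "uniqueness": 0.9999, "validity": 0.9657},
    1: {"coverage": 0.0330, "novelty": 0.9483, "uniqueness": 0.9999, "validity": 0.9690},
    2: {"coverage": 0.0196, "novelty": 0.9529, "uniqueness": 1.0000, "validity": 0.9428},
    3: {"coverage": 0.0258, "novelty": 0.9398, "uniqueness": 1.0000, "validity": 0.8635},
}

# Mean of each metric across all folds.
metrics = results[0].keys()
means = {m: sum(fold[m] for fold in results.values()) / len(results) for m in metrics}
print(means)
```

On these numbers, novelty and uniqueness sit near 0.95-1.0 across folds while coverage stays low, consistent with the "slow march of time" generator rediscovering little of the held-out future.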
Submitting author: @sgbaird (Sterling Baird)
Repository: https://github.com/sparks-baird/matbench-genmetrics
Branch with paper.md (empty if default branch):
Version: v0.6.5
Editor: @phibeck
Reviewers: @ml-evs, @mkhorton, @jamesrhester
Archive: 10.5281/zenodo.10840604
Status
Status badge code:
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@ml-evs & @mkhorton & @jamesrhester, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @phibeck know.
✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨
Checklists
📝 Checklist for @ml-evs
📝 Checklist for @jamesrhester
📝 Checklist for @mkhorton