Closed editorialbot closed 1 year ago
Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.
For a list of things I can do to help you, just type:
@editorialbot commands
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
@editorialbot generate pdf
Software report:
github.com/AlDanial/cloc v 1.88 T=0.13 s (595.8 files/s, 82150.4 lines/s)
-------------------------------------------------------------------------------
Language        files     blank    comment      code
-------------------------------------------------------------------------------
Python             71      1791       3289      4485
Markdown            4       247          0       986
TeX                 1        11          0       131
YAML                2         7          3        41
TOML                1         5          0        31
INI                 1         0          0         4
-------------------------------------------------------------------------------
SUM:               80      2061       3292      5678
-------------------------------------------------------------------------------
gitinspector failed to run statistical information for the repository
Wordcount for paper.md is 1094
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.1088/0953-8984/26/36/363202 is OK
- 10.1103/physrevb.97.161105 is OK
- 10.1103/PhysRevB.94.035118 is OK
- 10.1126/science.aad3000 is OK
- 10.1088/1361-648X/aa680e is OK
- 10.1016/j.commatsci.2017.07.030 is OK
- 10.1016/j.commatsci.2021.110731 is OK
- 10.48550/arXiv.2209.12747 is OK
- 10.1063/1.4812323 is OK
- 10.1016/j.commatsci.2012.02.005 is OK
- 10.1088/2515-7639/ab13bb is OK
MISSING DOIs
- None
INVALID DOIs
- None
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@fjmartinmartinez @hmacdope Hey, thanks for reviewing. There's a command above which will generate a checklist that the review must cover. Anything that needs addressing is probably best raised as an issue on the code repository. Give me a ping if you get stuck or have questions at any time.
In readme.md line 105 there seems to be a to-do task. Is this addressed? Should I raise this as an issue in the code?
In the paper it is stated that "excitingtools (plus dependencies) is significantly more lightweight than the NOMAD package, approximately two orders of magnitude smaller". Maybe some comment on what you mean by "lightweight" would be good. I am assuming it means disk space or RAM?
Suggestion for improvement (just as ideas):
@fjmartinmartinez thanks for the review. I'll address the points here and open a PR from this issue to make the proposed changes:
In readme.md line 105 there seems to be a to-do task. Is this addressed? Should I raise this as an issue in the code?
excitingtools has been carved off from a bigger repository, and the associated issue was being tracked in that repo's issue tracker. As it's not meaningful here, and pdoc does a perfectly good job of generating documentation, I'll remove it.
In the paper it is stated that "excitingtools (plus dependencies) is significantly more lightweight than the NOMAD package, approximately two orders of magnitude smaller". Maybe some comment on what you mean by "lightweight" would be good. I am assuming it means disk space or RAM?
I was referring to package size, but after making some effort to double-check this, I believe I am just factually incorrect. I can't straightforwardly verify the larger packages, as pip install nomad-lab[parsing] and pip install nomad-lab[all] fail to install on both my Mac M1 (Python 3.9.16) and Debian (Python 3.7.3) systems, for various reasons.
However, the minimal nomad-lab package, which provides the exciting parsers, is ~2.4 times larger than excitingtools (quantified by installing the respective packages in venvs and comparing their sizes on disk). There's a noticeable difference in the number of package dependencies (see below), but in general the statement is ambiguous and not meaningful, so I think we can remove it.
excitingtools dependencies:
contourpy-1.0.7 cycler-0.11.0 fonttools-4.38.0 importlib-resources-5.12.0 kiwisolver-1.4.4 matplotlib-3.7.0 numpy-1.24.2 packaging-23.0 pillow-9.4.0 pyparsing-3.0.9 python-dateutil-2.8.2 six-1.16.0 wheel-0.38.4 zipp-3.14.0
NOMAD dependencies:
Pint-0.17 aniso8601-7.0.0 anyio-3.6.2 ase-3.19.0 cachetools-4.2.4 certifi-2022.12.7 charset-normalizer-2.0.12 click-7.1.2 contourpy-1.0.7 cycler-0.11.0 cython-0.29.33 docstring-parser-0.12 ecdsa-0.18.0 elasticsearch-7.17.9 elasticsearch-dsl-7.4.0 excitingparser-1.0 fastentrypoints-0.12 fonttools-4.38.0 future-0.18.2 h11-0.12.0 httpcore-0.14.7 httpx-0.22.0 idna-3.4 importlib-resources-5.12.0 jmespath-0.10.0 kiwisolver-1.4.4 matplotlib-3.7.0 memoization-0.4.0 nomad-lab-1.0.8 nptyping-1.4.4 numpy-1.21.2 orjson-3.6.0 packaging-23.0 pillow-9.4.0 pyasn1-0.4.8 pydantic-1.8.2 pyparsing-3.0.9 python-dateutil-2.8.2 python-jose-3.3.0 python-keycloak-0.26.1 pytz-2021.1 pyyaml-6.0 requests-2.26.0 rfc3986-1.5.0 rsa-4.9 scipy-1.10.1 six-1.16.0 sniffio-1.3.0 typing-extensions-4.5.0 typish-1.9.3 urllib3-1.26.14 wheel-0.38.4 zipp-3.14.0
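To make the comparison reproducible, here is a hedged sketch of the measurement described above: create a fresh venv per package, pip-install it, and sum the on-disk size. The helper names (`dir_size_mb`, `venv_size_mb`) are illustrative, not part of excitingtools.

```python
# Rough sketch: measure the on-disk size of a package installed
# into a fresh virtual environment, as used to compare
# excitingtools with nomad-lab above.
import os
import subprocess
import venv
from pathlib import Path


def dir_size_mb(root: Path) -> float:
    """Total size of all files under root, in MiB."""
    total = sum(f.stat().st_size for f in root.rglob("*") if f.is_file())
    return total / 1024**2


def venv_size_mb(package: str, env_dir: Path) -> float:
    """Create a fresh venv, pip-install `package` into it, return its size."""
    venv.create(env_dir, with_pip=True)
    pip = env_dir / ("Scripts" if os.name == "nt" else "bin") / "pip"
    subprocess.run([str(pip), "install", "--quiet", package], check=True)
    return dir_size_mb(env_dir)

# Usage (requires network access):
# print(venv_size_mb("excitingtools", Path("/tmp/venv-excitingtools")))
# print(venv_size_mb("nomad-lab", Path("/tmp/venv-nomad")))
```

Exact numbers will vary with platform and dependency versions, so only the relative sizes are meaningful.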
Use of conda/mamba as an alternative for installation, if possible.
I agree with this, but would like to delay this until the summer as I expect the minor version to bump several times in the next 6 months (and would like to reduce the overhead of deploying releases).
In addition, the workability of the tool could be improved by defining a workspace for VS Code.
I interpret this to mean providing settings.json, tasks.json, and launch.json files. I can add some basic settings that specify formatting, linting and tests. @fjmartinmartinez did you have something more in mind? Additionally, I'd be interested to see any snippets that install package requirements into a venv, then ensure all new VS Code terminals use this venv. I must admit I use PyCharm for all of my Python development and find it far more pleasurable to work in (but default to VS Code for Fortran).
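For concreteness, a minimal `.vscode/settings.json` along these lines could cover formatting, linting and tests. The interpreter path, venv location and pytest layout here are illustrative assumptions, not the project's actual configuration:

```json
{
    // Assumed venv location; point this at wherever requirements are installed
    "python.defaultInterpreterPath": "${workspaceFolder}/venv/bin/python",
    // New integrated terminals activate the selected interpreter's environment
    "python.terminal.activateEnvironment": true,
    // Run the test suite with pytest (assumed test directory: tests/)
    "python.testing.pytestEnabled": true,
    "python.testing.pytestArgs": ["tests"],
    "editor.formatOnSave": true
}
```

VS Code accepts comments in settings.json (JSONC), so the annotations above can be kept in the committed file.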
The package is neat, functional and well maintained. Well done @AlexBuccheri!
I would also like to suggest a conda release; this will help you get more users, but is not an absolute requirement.
I would also suggest showing an example integration with JobFlow, atomate or ASR, just so users can see the tool in action and how to perform the proposed integration with higher-level workflow managers.
For the paper:
I would love a few more citations for some of the more general statements, e.g.
Other than that looks great!
@AlexBuccheri sorry, I did not follow up on your response. All looks great to me. As @hmacdope said, the package is neat. What I meant by the workspace is just this: https://code.visualstudio.com/docs/editor/workspaces. Nothing else in mind from my side, and it was just a suggestion. Other than that, well done!
@fjmartinmartinez and @hmacdope thank you both very much for the very positive, constructive reviews.
To address @hmacdope's queries, I'll reply here and put any required changes in the linked MR.
Example Integration with Jobflow or ASR
To add a few points for context:
a) Workflow development is very much a work-in-progress (a PhD project)
b) The workflow package is a separate repository, such that responsibilities are clearly separated. Until it reaches a stable version, it will stay on the group's private GitLab instance. For now, it will provide single-responsibility functions to build workflows and contain the workflows themselves, although this may change in the future.
c) ASR version 2 is undergoing a rewrite, and will perhaps be released under a different name. Its API is not stable and several features are still missing (being able to treat convergence, for example).
Jobflow Example
A workflow is defined as a series of single-responsibility function calls:
Single-responsibility functions take a calculation (or a result)
Result-handling with Jobflow is included with function decorators, which allow lazy evaluation, meta-data storage, etc.
A calculation is a specialisation of the CalculationIO base class
# Simple convergence workflow, composed of steps defined by us,
# then composed and executed by Jobflow.
from pathlib import Path

from jobflow import Flow, JobStore, run_locally
from maggma.stores import MemoryStore

# read_input, get_convergence_criteria, setup_exciting_calculation and
# converge_ngridk are provided by the (private) excitingworkflow package.

# read input from input file:
specified_input = read_input(Path("input.yml").absolute())
convergence_criteria = get_convergence_criteria(specified_input)

# setup jobs:
setup_job = setup_exciting_calculation(specified_input)
convergence_job = converge_ngridk(setup_job.output, convergence_criteria, [])

# run workflow:
responses = run_locally(Flow([setup_job, convergence_job]),
                        store=JobStore(MemoryStore()))
print(responses)
Functions responsible for performing workflow "steps".
In this example, we're performing and evaluating convergence of k-sampling. This could probably be more generic (and I'll catch that in code review).
from typing import Callable

from jobflow import Response, job

from excitingworkflow.src.convergence_criteria.exciting_spectra_convergence_criteria import set_ngridk_in_input
from excitingworkflow.src.workflows.workflow_results import ConvergenceResult


def converge_recursively_step(calculation, input_value, set_value_in_input: Callable) -> dict:
    """ Perform a single convergence step.

    :param calculation: define all calculation parameters, how to run and parse results
    :param input_value: the current input_value
    :param set_value_in_input: how to put the input value in the calculation
    :returns: a dictionary with the results from the current run
    """
    set_value_in_input(input_value, calculation)
    subprocess_result = calculation.run()
    if not subprocess_result.success:
        raise RuntimeError("exciting run was not successful!")
    return calculation.parse_output()


@job
def converge_ngridk(calculation, convergence_criteria, results: list):
    """ Converge ngridk.

    :param calculation: calculation with all the other parameters, how to run and get results
    :param convergence_criteria: defines the convergence, how to check, which steps to take and when to end
    :param results: capture the convergence results
    :returns: the results.
    """
    input_value, is_first, is_last = convergence_criteria.get_current_input_value()
    output = converge_recursively_step(calculation, input_value, set_ngridk_in_input)
    if is_first:
        converged = False
        result = ConvergenceResult(input_value, calculation.directory, output, converged, early_exit=False)
        results = [result]
    # check for convergence using the convergence criteria evaluate function
    converged, early_exit = convergence_criteria.evaluate(output, results[-1].result)
    result = ConvergenceResult(input_value, calculation.directory, output, converged, early_exit)
    results.append(result)
    if converged or is_last:
        return results
    # else append the workflow DYNAMICALLY with a new workflow step
    return Response(output=results, replace=converge_ngridk(calculation, convergence_criteria, results))
# An exciting calculation is defined with an IO calculator.
# Below is the abstract class for an IO calculator.
# A specific implementation utilises the methods provided by excitingtools.
import abc
from pathlib import Path
from typing import Union

# SubprocessRunResults is provided by excitingtools.


class CalculationIO(abc.ABC):
    """Abstract base class for a calculation that is performed
    by writing input file/s and parsing the result from a file.

    An IO calculation is expected to have:
     * A name,
     * A working (run) directory,
     * A method to write all input files required to run the calculation,
     * A method to run the calculation,
     * A parser for the outputs of interest.
    """
    path_type = Union[str, Path]

    def __init__(self, name: str, directory: path_type):
        self.name = name
        self.directory = Path(directory)

    def set_name(self, name: str):
        """ Set name to given argument. """
        self.name = name

    @abc.abstractmethod
    def write_inputs(self) -> None:
        """ Make run dir and write all input files required for calculation. """
        self.directory.mkdir(parents=True, exist_ok=True)
        ...

    @abc.abstractmethod
    def run(self) -> SubprocessRunResults:
        """ Run the calculation.

        :return: Subprocess result instance.
        """
        ...

    @abc.abstractmethod
    def parse_output(self, *args) -> Union[dict, FileNotFoundError]:
        """ Parse one or more output files for calculation.

        :return: Dictionary of results.
        """
        ...
With regards to using ASR
The ASE team has approved the integration of excitingtools into ASE. This is part of ASE's programme to standardise its calculator class and migrate implementations to client codes. This means that the ASE exciting calculator wraps excitingtools functions and uses them to create inputs, read outputs and run calculations. Atomic Simulation Recipes (ASR) integrates with ASE calculators, so excitingtools is never explicitly called in ASR workflows using exciting.
Collaboration with the ASR team to create exciting workflows is currently under active development. The ASR framework is undergoing massive changes, which need to stabilise before we can start utilising it in production. We can, however, run simple workflows, such as this one adapted from ASR's tutorials:
import asr
from ase.build import bulk
from ase.calculators.calculator import get_calculator_class


@asr.instruction('asr.tutorial')
@asr.argument('crystal_structure', type=str)
@asr.argument('element', type=str)
def energy_per_atom(element: str, crystal_structure: str) -> float:
    """Calculate the total energy per atom of a material."""
    atoms = bulk(element, crystalstructure=crystal_structure)
    cls = get_calculator_class('exciting')
    atoms.calc = cls()
    energy = atoms.get_potential_energy()
    num_atoms = len(atoms)
    return round(energy / num_atoms, 6)


@asr.instruction('asr.tutorial')
@asr.argument('element', type=str)
def main(element: str) -> str:
    """Calculate the lowest-energy crystal structure for an element."""
    crystal_structures = [
        'sc', 'fcc', 'bcc', 'diamond',
    ]
    energies = []
    for crystal_structure in crystal_structures:
        en = energy_per_atom(element, crystal_structure)
        energies.append(en)
    lowest_energy_crystal_structure = min(
        zip(crystal_structures, energies),
        key=lambda item: item[1]
    )[0]
    return lowest_energy_crystal_structure
Conda
As both @fjmartinmartinez and @hmacdope recommend a Conda release, we can facilitate this.
ML Training Data References
I'll introduce these citations regarding the need for training data (this broadly applies to all fields):
LAPW Basis Complexity
With regards to the details of the LAPW basis set, one can refer to papers that provide overviews of exciting and WIEN2k, respectively. Additionally, Andris Gulans' paper discusses the LAPW basis set in the context of obtaining micro-Hartree precision. I'll add all references to our publication:
As a direct response, I can give an overview of the complexities of LAPW. The key point is that, whilst there is a systematic route to convergence, there are a lot of variables to consider.
APWs are augmented plane waves, where one partitions the system into atom-centred spheres (muffin tins) and interstitial space. Basis functions are chosen according to the variation in the charge density, in the absence of pseudo-potentials. Plane waves are used in the interstitial region; close to the cores, where the charge density oscillates rapidly (the muffin-tin region), atomic-like orbitals are used to augment the plane waves (subject to matching conditions at the MT boundaries).
Because the MT radius (R_MT) is a variable, the plane-wave cut-off (G_max) alone is not a useful basis parameter: a smaller MT radius would require a larger G_max to obtain the same precision. The APW cut-off is therefore defined with a dimensionless parameter, RGKMAX = R_MT * G_max.
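As a toy numerical illustration of the trade-off (the function and variable names are mine, not an excitingtools API):

```python
# Toy illustration: for a fixed dimensionless rgkmax, a smaller
# muffin-tin radius implies a larger plane-wave cut-off G_max,
# since rgkmax = R_MT * G_max.
def gmax_from_rgkmax(rgkmax: float, r_mt: float) -> float:
    """Plane-wave cut-off implied by rgkmax and a MT radius (atomic units)."""
    return rgkmax / r_mt

for r_mt in (1.5, 2.0, 2.5):
    print(f"R_MT = {r_mt}: G_max = {gmax_from_rgkmax(7.0, r_mt):.2f}")
```

So quoting rgkmax, rather than G_max, gives a convergence parameter that is comparable across systems with different muffin-tin radii.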
The radial basis functions of the atomic-like orbitals are found by solving the radial Schrödinger equation for each species in the system. This means that the APW basis depends on the eigenvalues of the radial Schrödinger equation, so the Hamiltonian and overlap matrices (H and S) also become energy-dependent. The secular equation describing the system is therefore non-linear.
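The resulting non-linear problem can be sketched as follows (standard notation, not taken verbatim from the cited references):

```latex
% Energy-dependent Hamiltonian and overlap matrices turn the
% generalised eigenvalue problem into one that is non-linear in E:
\left[ H(E) - E\, S(E) \right] \mathbf{c} = 0
```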
To avoid treating the construction of the basis functions and the solution of the secular equation self-consistently, one can treat the energies of the radial functions as parameters, and sufficiently reduce the error in the basis by expanding the radial functions about this parameter. The matching order (the maximum order of the derivatives used in the expansion) is also specified by the user.
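This is the standard LAPW linearisation (see the Andersen reference); as a sketch, with $E_l$ the chosen linearisation energy for angular momentum $l$:

```latex
% Expand the radial function about the linearisation energy E_l;
% the matching order determines how many derivative terms are kept.
u_l(r, E) \approx u_l(r, E_l)
            + (E - E_l)\,\dot{u}_l(r, E_l)
            + \mathcal{O}\!\left((E - E_l)^2\right)
```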
Further flexibility in the basis can be introduced with the inclusion of local orbitals (LOs). These are non-zero only within the muffin tins. Their inclusion further reduces the linearisation error.
All of these aspects of LAPW are described in more detail in the references provided.
Additional degrees of freedom in the basis include:
Whilst the casual user does not need to concern themselves with most of these details, it is clearly non-trivial for someone unfamiliar with the basis to systematically improve the precision. Simplifying the determination of many of these quantities is under active development in the SOL group. Additionally, we would like to formalise these steps in an automated workflow, using excitingtools as a vehicle.
@editorialbot commands
Hello @hmacdope, here are the things you can ask me to do:
# List all available commands
@editorialbot commands
# Get a list of all editors's GitHub handles
@editorialbot list editors
# Check the references of the paper for missing DOIs
@editorialbot check references
# Perform checks on the repository
@editorialbot check repository
# Adds a checklist for the reviewer using this command
@editorialbot generate my checklist
# Set a value for branch
@editorialbot set joss-paper as branch
# Generates the pdf paper
@editorialbot generate pdf
# Generates a LaTeX preprint file
@editorialbot generate preprint
# Get a link to the complete list of reviewers
@editorialbot list reviewers
@editorialbot generate pdf
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@editorialbot check references
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.1088/0953-8984/26/36/363202 is OK
- 10.1103/physrevb.97.161105 is OK
- 10.1103/PhysRevB.94.035118 is OK
- 10.1126/science.aad3000 is OK
- 10.1088/1361-648X/aa680e is OK
- 10.1016/j.commatsci.2017.07.030 is OK
- 10.1016/j.commatsci.2021.110731 is OK
- 10.48550/arXiv.2209.12747 is OK
- 10.1063/1.4812323 is OK
- 10.1016/j.commatsci.2012.02.005 is OK
- 10.1088/2515-7639/ab13bb is OK
MISSING DOIs
- None
INVALID DOIs
- None
@AlexBuccheri thanks for your thoughtful and detailed replies.
Other than that I am totally satisfied; the wait on full conda-package deployment can be a little while, but I will be happy to accept when you link a PR adding excitingtools against conda-forge/staged-recipes. Let me know if you think this is unreasonable. :)
@hmacdope W.R.T. staged recipes, no problem: https://github.com/conda-forge/staged-recipes/pull/22386
MR 6 has been merged, implementing the changes requested in this review. That should resolve all remaining threads.
I think we are ready to go then @richardjgowers :)
@AlexBuccheri
At this point could you:
I can then move forward with recommending acceptance of the submission.
@editorialbot check references
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.1088/0953-8984/26/36/363202 is OK
- 10.1103/physrevb.97.161105 is OK
- 10.1103/PhysRevB.94.035118 is OK
- 10.1126/science.aad3000 is OK
- 10.1088/1361-648X/aa680e is OK
- 10.1016/j.commatsci.2017.07.030 is OK
- 10.1016/j.commatsci.2021.110731 is OK
- 10.48550/arXiv.2209.12747 is OK
- 10.1063/1.4812323 is OK
- 10.1016/j.commatsci.2012.02.005 is OK
- 10.1088/2515-7639/ab13bb is OK
MISSING DOIs
- 10.1007/978-3-319-44677-6_104 may be a valid DOI for title: Big data-driven materials science and its FAIR data infrastructure
- 10.1038/s42256-021-00319-w may be a valid DOI for title: Unassisted noise reduction of chemical reaction datasets
- 10.1093/bib/bbab391 may be a valid DOI for title: Deep learning in retrosynthesis planning: datasets, models and tools
- 10.1063/1.5143061 may be a valid DOI for title: WIEN2k: An APW+ lo program for calculating the properties of solids
- 10.1103/physrevb.12.3060 may be a valid DOI for title: Linear methods in band theory
INVALID DOIs
- None
@editorialbot set v1.3.0 as version
Done! version is now v1.3.0
@editorialbot set 10.5281/zenodo.7799887 as doi
I'm sorry human, I don't understand that. You can see what commands I support by typing:
@editorialbot commands
@editorialbot set 10.5281/zenodo.7799887 as archive
Done! Archive is now 10.5281/zenodo.7799887
@editorialbot generate pdf
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@AlexBuccheri the bot suggested some DOIs for a few of the references without DOIs. Can you go through and check if they're correct (some seem to be) and add them, then that will be the final thing to do
@editorialbot check references
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.1088/0953-8984/26/36/363202 is OK
- 10.1103/physrevb.97.161105 is OK
- 10.1103/PhysRevB.94.035118 is OK
- 10.1126/science.aad3000 is OK
- 10.1088/1361-648X/aa680e is OK
- 10.1016/j.commatsci.2017.07.030 is OK
- 10.1016/j.commatsci.2021.110731 is OK
- 10.48550/arXiv.2209.12747 is OK
- 10.1063/1.4812323 is OK
- 10.1016/j.commatsci.2012.02.005 is OK
- 10.1088/2515-7639/ab13bb is OK
- 10.1007/978-3-319-44677-6_104 is OK
- 10.1038/s42256-021-00319-w is OK
- 10.1093/bib/bbab391 is OK
- 10.1063/1.5143061 is OK
- 10.1103/physrevb.12.3060 is OK
MISSING DOIs
- None
INVALID DOIs
- None
@richardjgowers Sorry, I was only looking at the INVALID DOIs. Should be resolved now.
@richardjgowers I believe all of the issues have been resolved and the project is ready for acceptance.
@editorialbot recommend-accept
Attempting dry run of processing paper acceptance...
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.1088/0953-8984/26/36/363202 is OK
- 10.1103/physrevb.97.161105 is OK
- 10.1103/PhysRevB.94.035118 is OK
- 10.1126/science.aad3000 is OK
- 10.1088/1361-648X/aa680e is OK
- 10.1016/j.commatsci.2017.07.030 is OK
- 10.1016/j.commatsci.2021.110731 is OK
- 10.48550/arXiv.2209.12747 is OK
- 10.1063/1.4812323 is OK
- 10.1016/j.commatsci.2012.02.005 is OK
- 10.1088/2515-7639/ab13bb is OK
- 10.1007/978-3-319-44677-6_104 is OK
- 10.1038/s42256-021-00319-w is OK
- 10.1093/bib/bbab391 is OK
- 10.1063/1.5143061 is OK
- 10.1103/physrevb.12.3060 is OK
MISSING DOIs
- None
INVALID DOIs
- None
@editorialbot commands
Hello @AlexBuccheri, here are the things you can ask me to do:
# List all available commands
@editorialbot commands
# Get a list of all editors's GitHub handles
@editorialbot list editors
# Check the references of the paper for missing DOIs
@editorialbot check references
# Perform checks on the repository
@editorialbot check repository
# Adds a checklist for the reviewer using this command
@editorialbot generate my checklist
# Set a value for branch
@editorialbot set joss-paper as branch
# Generates the pdf paper
@editorialbot generate pdf
# Generates a LaTeX preprint file
@editorialbot generate preprint
# Get a link to the complete list of reviewers
@editorialbot list reviewers
@AlexBuccheri sorry, I thought this got passed on to the editors before. @editorialbot recommend acceptance
@editorialbot ping track-eic
:bellhop_bell::exclamation:Hey @openjournals/pe-eics, this submission requires your attention.
Submitting author: @AlexBuccheri (Alexander Buccheri)
Repository: https://github.com/exciting/excitingtools
Branch with paper.md (empty if default branch):
Version: v1.3.0
Editor: @richardjgowers
Reviewers: @fjmartinmartinez, @hmacdope
Archive: 10.5281/zenodo.7799887
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@fjmartinmartinez & @hmacdope, your review will be checklist-based. Each of you will have a separate checklist that you should update when carrying out your review. First of all, you need to run this command in a separate comment to create the checklist:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @richardjgowers know.
✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨
Checklists
📝 Checklist for @fjmartinmartinez
📝 Checklist for @hmacdope