ESSsans has seen the most development and already provides a number of different workflows. It has gone through several iterations of improvements. While this is by no means the final state, we should sit down together and:
See what we still want to improve.
Collect the "best practices" and write them down.
We should then try to apply those practices to the other ESS* projects, to ensure solutions are not duplicated and to make it easier for developers to switch between projects.
Split handling (loading) of monitors (events), detectors (events), and remainder
Structure for masking any dim or transformed dim, in various steps
Naming conventions
Package and module structure (where to place types? what goes where?)
How to extract metadata (avoiding keeping large data alive)
Write unit tests for providers, not (exclusively) for entire workflows
Require passing mypy for workflows (would include making it part of CI)?
How to handle optional steps
How to handle optional inputs
Should we have default params set in workflows?
How to save output files
How to handle provenance
Loading from SciCat vs. local files
Performance guidelines (how to avoid pitfalls around event data, or large temporary dims)
How to define parameters, such that we can, e.g., auto generate widgets for user input (names, description, limits, default values, ...)
Docstrings: Include math, references, ...
Status
Can be defined now
Docstrings: Include math, references, ...
Must be able to return event data (required for polarization analysis)
Write unit tests for providers, not (exclusively) for entire workflows
Naming conventions (and type conventions (example: filenames)?)
Package and module structure (where to place types? what goes where?)
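A minimal sketch of what "unit tests for providers" could look like, assuming providers are plain functions annotated with domain types. All names below (RawCounts, MonitorCounts, NormalizedCounts, normalize) are hypothetical and not taken from ESSsans. The point is only that a provider can be tested by calling it directly, without building a workflow.

```python
from typing import NewType

# Hypothetical domain types, for illustration only.
RawCounts = NewType("RawCounts", list[float])
MonitorCounts = NewType("MonitorCounts", float)
NormalizedCounts = NewType("NormalizedCounts", list[float])


def normalize(counts: RawCounts, monitor: MonitorCounts) -> NormalizedCounts:
    """Hypothetical provider: divide each detector count by the monitor count."""
    if monitor <= 0:
        raise ValueError("monitor counts must be positive")
    return NormalizedCounts([c / monitor for c in counts])


def test_normalize_divides_by_monitor() -> None:
    # The provider is called directly; no pipeline is needed.
    result = normalize(RawCounts([2.0, 4.0]), MonitorCounts(2.0))
    assert result == [1.0, 2.0]


def test_normalize_rejects_nonpositive_monitor() -> None:
    try:
        normalize(RawCounts([1.0]), MonitorCounts(0.0))
    except ValueError:
        return
    raise AssertionError("expected ValueError")
```

Tests written this way stay small and fast. Tests of entire workflows can then focus on how the providers are wired together.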
Requires minor discussion
Loading from SciCat vs. local files (e.g., define run ID, choose provider that either converts to local path, or uses service to get file and return path)
Split handling (loading) of monitors (events), detectors (events), and remainder
How to extract metadata (avoiding keeping large data alive)
Concern about large data resolved by loading event data (monitors and detectors) separately from the rest
Do not write files (or to services) in providers.
Performance guidelines (how to avoid/detect pitfalls around event data, or large temporary dims)
Every workflow should be tested with large data and checked for memory consumption and performance bottlenecks
Add logical dims when loading NeXus files.
Should we have default params set in workflows?
Avoid unless good reason.
Can have widgets that generate dict of params and values, widgets can have defaults
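One possible shape for the "run ID to file path" idea above, as a sketch only. Two interchangeable providers return the same FilePath type, and a workflow would register exactly one of them. The type names, the file naming scheme run_<id>.nxs, and the injected download callable are all assumptions for illustration. No real SciCat API is used or implied here.

```python
from pathlib import Path
from typing import Callable, NewType

# Hypothetical domain types, for illustration only.
RunID = NewType("RunID", int)
DataDir = NewType("DataDir", Path)
FilePath = NewType("FilePath", Path)

# Stand-in for whatever service client fetches a file; signature is assumed.
Downloader = Callable[[RunID, Path], Path]


def local_file_path(run: RunID, data_dir: DataDir) -> FilePath:
    """Provider variant 1: convert a run ID to a path in a local directory."""
    # The naming scheme is an assumption, not an ESS convention.
    return FilePath(data_dir / f"run_{run}.nxs")


def downloaded_file_path(
    run: RunID, cache_dir: DataDir, download: Downloader
) -> FilePath:
    """Provider variant 2: ask a service for the file and return its local path."""
    return FilePath(download(run, cache_dir))
```

Downstream providers depend only on FilePath, so they do not need to know where the file came from.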
Requires more information or experimentation
How to define parameters, such that we can, e.g., auto generate widgets for user input (names, description, limits, default values, ...)
Range checks / validators
If part of pipeline then UX and writing providers is more cumbersome
Default values?
Requires experimentation with how Sciline handles param tables, and transformations of task graphs
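To make the parameter question more concrete, here is a sketch of one option: describe each parameter with a small spec object that a widget generator could read, and run range checks before values reach the pipeline. ParameterSpec and resolve_params are hypothetical names, and nothing here depends on Sciline. Whether this fits with param tables and task-graph transformations is exactly what still needs experimentation.

```python
from dataclasses import dataclass
from typing import Iterable, Mapping, Optional


@dataclass(frozen=True)
class ParameterSpec:
    """Hypothetical description of one user-facing parameter."""

    name: str
    description: str
    default: Optional[float] = None
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def validate(self, value: float) -> float:
        # Range check, performed outside the pipeline.
        if self.minimum is not None and value < self.minimum:
            raise ValueError(f"{self.name}={value} is below minimum {self.minimum}")
        if self.maximum is not None and value > self.maximum:
            raise ValueError(f"{self.name}={value} is above maximum {self.maximum}")
        return value


def resolve_params(
    specs: Iterable[ParameterSpec], user_values: Mapping[str, float]
) -> dict[str, float]:
    """Merge user input with defaults and validate every value."""
    resolved: dict[str, float] = {}
    for spec in specs:
        value = user_values.get(spec.name, spec.default)
        if value is None:
            raise ValueError(f"no value or default for {spec.name}")
        resolved[spec.name] = spec.validate(value)
    return resolved
```

In this variant the defaults live with the widget-facing specs rather than in the workflow, and providers stay free of validation code.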