multimeric opened this issue 2 years ago (status: Open)
Thank you very much for your suggestion. I do think command line documentation is good and necessary, but this is a daunting task in reality.
Things to consider:
Yes, this is a mess, accumulated over many years of exploratory research.
Ideally, we should sort this out by adding more checks, grouping options by expert level, making commands more general and documenting the remaining assumptions. But this would take a huge effort, and I am not sure the core developers can afford it.
Perhaps a pragmatic approach is to start documenting commands and options that are not accessible from the GUI but are often used. Candidates include:
relion_image_handler (--rescale_angpix, --new_box, --force_header_angpix)
relion_convert_to_tiff (already documented)
relion_reconstruct (--ctf, --sym, --skip_gridding, --pad)
relion_helix_inimodel2d
(Even developers rarely use other commands from the CLI.)
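To make the kind of usage in question concrete, here is a minimal Python sketch that assembles (but does not run) a relion_image_handler command line from the three options named above. It assumes that `--i` and `--o` are the input and output flags and that each listed option takes a single numeric value; neither is stated in this thread, so check the command's own `--help` output. The function name and file names are hypothetical.

```python
from typing import List, Optional


def build_image_handler_cmd(
    input_path: str,
    output_path: str,
    rescale_angpix: Optional[float] = None,
    new_box: Optional[int] = None,
    force_header_angpix: Optional[float] = None,
) -> List[str]:
    """Assemble an argv list for relion_image_handler without running it."""
    # ASSUMPTION: --i / --o are the input and output flags (not confirmed here).
    cmd = ["relion_image_handler", "--i", input_path, "--o", output_path]
    # ASSUMPTION: each option below takes exactly one numeric value.
    optional = [
        ("--rescale_angpix", rescale_angpix),
        ("--new_box", new_box),
        ("--force_header_angpix", force_header_angpix),
    ]
    for flag, value in optional:
        if value is not None:  # only emit options the caller actually set
            cmd += [flag, str(value)]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_image_handler_cmd("in.mrc", "out.mrc",
                                           rescale_angpix=2.0, new_box=128)))
```

Returning a list rather than a shell string means the result can be passed straight to `subprocess.run` without quoting issues.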
The approach I am hoping to take is twofold.
Firstly I'm aiming to create a comprehensive CLI page which contains a raw dump of every command and every option. This is implemented in https://github.com/3dem/relion-documents/pull/2. I'm aware that this violates some of the guidelines that you have just provided, but I think it still provides the benefits of:
Also, if and when the devs are able to improve some of the CLI descriptions as you have suggested (e.g. in terms of grouping flags), this will automatically include those improvements.
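For illustration, a raw dump of this kind could be produced roughly as follows. This is only a sketch of the general idea, not necessarily how the pull request does it. It assumes that the tools share a `relion_` name prefix, that each prints its option list when given `--help`, and that none of them blocks (for example by opening a GUI), hence the timeout. The function names are hypothetical.

```python
import os
import subprocess
from typing import Dict, Iterable, List


def find_binaries(prefix: str, path_dirs: Iterable[str]) -> List[str]:
    """Return sorted names of executable files starting with `prefix`."""
    found = set()
    for directory in path_dirs:
        if not os.path.isdir(directory):
            continue  # PATH often contains stale or empty entries
        for name in os.listdir(directory):
            full = os.path.join(directory, name)
            if (name.startswith(prefix) and os.path.isfile(full)
                    and os.access(full, os.X_OK)):
                found.add(name)
    return sorted(found)


def dump_help(binary: str, timeout: float = 30.0) -> str:
    """Capture whatever the binary prints for --help (stdout and stderr)."""
    # ASSUMPTION: the binary prints usage for --help and then exits.
    result = subprocess.run([binary, "--help"], capture_output=True,
                            text=True, timeout=timeout)
    return result.stdout + result.stderr


def dump_all(prefix: str = "relion_") -> Dict[str, str]:
    """Map each matching binary on PATH to its captured help text."""
    dirs = os.environ.get("PATH", "").split(os.pathsep)
    return {name: dump_help(name) for name in find_binaries(prefix, dirs)}
```

Because the text is captured from the binaries themselves, regenerating the page picks up any later improvement to the option descriptions.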
The second step I am hoping to take is higher-level documentation. I was thinking of adding a "CLI version" of some of the tutorials, which just shows the equivalent CLI command for each of the steps in, for example, this tutorial. However, I'm happy to be guided on this. I could instead write a separate new tutorial, or perhaps make a small CLI user guide for the list of useful commands you have provided.
Most commands are supposed to be executed from the GUI
Right, but one use case I want to support is using this from a workflow manager etc to automate a workflow (or at least parts of it), so we can't/don't want to use a GUI for this use case
Many commands have highly technical options that are relevant only in special cases and should be used by experts who know what they are doing. Inadequate parameters can lead to scientifically invalid results (e.g. resolution overestimation) and are dangerous.
Some options are not orthogonal, i.e., cannot be used simultaneously.
Some options have implicit assumptions on inputs (e.g. issue #911, "Cannot use particle.star file after highpass filter with image handler - ERROR: readMRC: Image number 50 exceeds stack size 10 of image").
I'm hoping the higher level user guide would be able to explain these potential pitfalls.
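As an example of the kind of pitfall such a guide could make explicit, the "Image number 50 exceeds stack size 10" error quoted above comes down to a particle reference pointing past the end of its image stack. A minimal pre-flight check might look like the sketch below. It assumes particle references have the form `index@stack_path` with a 1-based index, and that the caller supplies the stack sizes (for example, read with an MRC reader); the function names and the `stack_sizes` mapping are hypothetical.

```python
from typing import Dict, List, Tuple


def parse_image_name(name: str) -> Tuple[int, str]:
    """Split 'NNNNNN@path/to/stack.mrcs' into (1-based index, stack path)."""
    # ASSUMPTION: references use the 'index@path' form with a 1-based index.
    index, _, path = name.partition("@")
    return int(index), path


def find_out_of_range(image_names: List[str],
                      stack_sizes: Dict[str, int]) -> List[str]:
    """Return the references whose index is outside their stack."""
    bad = []
    for name in image_names:
        index, path = parse_image_name(name)
        # An unknown stack is treated as empty, so its references are flagged.
        if index < 1 or index > stack_sizes.get(path, 0):
            bad.append(name)
    return bad


if __name__ == "__main__":
    names = ["000003@stack.mrcs", "000050@stack.mrcs"]
    print(find_out_of_range(names, {"stack.mrcs": 10}))
```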
I am rather ambivalent regarding the first step and your pull request. I agree that it can be useful for some people; at the same time I would say one can just run the command locally to see the list of arguments. I am not sure if we want to actively advertise the CLI considering that there are many caveats. I will leave the decision to @scheres whether we merge your pull request.
Regarding the second step, high-level documentation:
Right, but one use case I want to support is using this from a workflow manager etc to automate a workflow (or at least parts of it), so we can't/don't want to use a GUI for this use case
Such automation and workflow management are exactly what the RELION (and newer CCPEM) schedulers do. They have a command line interface. Users build a workflow comprising multiple jobs in the GUI (or a CCPEM Web interface) and the scheduler executes it.
Honestly speaking, I am against building command line arguments by hand, except for the commands I listed above. So I am not very keen on making a CLI version of the RELION tutorial. On the other hand, I am positive about writing high-level documentation for relion_image_handler, relion_reconstruct, etc.
I understand your concerns, and I'm happy to add a disclaimer to the top saying something like "we recommend generating CLI commands using the GUI where possible". On the other hand though, I think users who seek out the CLI for tools are often quite used to the kinds of problems you mention where there are implicit rules about how to execute each tool. I think empowering them is a useful goal.
To give you another motivating example: I'm looking into a project that involves benchmarking some RELION workflows on our HPC and determining how we can optimise them. As someone new to the cryoEM world, it wasn't initially clear to me whether this was even possible, because RELION seemed to be a GUI app without a CLI. Having a reference like this helps to advertise this capability of RELION.
Any decision regarding the pull request? My two cents is that a documented CLI would, in particular, allow RELION performance benchmarks to be automated. If it were in place, one could take known input data with known, good enough results and run the same scripted sets of phases/jobs, differing only in the small subsets of parameters related to hardware configuration.
I have the same task here: I also need to somehow optimise RELION workflows on HPC.
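The benchmarking idea described above can be sketched as a parameter sweep: keep a base command fixed and vary only the hardware-related options. The Python sketch below generates one command per combination without running anything. The base command and the flag names `--threads` and `--pool` are placeholders chosen for illustration, not confirmed RELION options; substitute whatever the real job's `--help` output lists.

```python
import itertools
from typing import Dict, Iterator, List, Sequence


def sweep_commands(
    base_cmd: Sequence[str],
    hardware_grid: Dict[str, Sequence[object]],
) -> Iterator[List[str]]:
    """Yield base_cmd extended with every combination of the grid values."""
    flags = list(hardware_grid)  # dicts keep insertion order
    for values in itertools.product(*(hardware_grid[f] for f in flags)):
        cmd = list(base_cmd)  # copy so the base command is never mutated
        for flag, value in zip(flags, values):
            cmd += [flag, str(value)]
        yield cmd


if __name__ == "__main__":
    # HYPOTHETICAL command and flags, for illustration only.
    base = ["relion_some_job", "--i", "particles.star"]
    grid = {"--threads": [4, 8], "--pool": [1, 3]}
    for cmd in sweep_commands(base, grid):
        print(" ".join(cmd))
```

Each generated list can then be timed under the HPC scheduler of choice, with the scientific parameters held constant across runs.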
I note that RELION has a CLI, with a ton of binaries. For example:
Also, each of these binaries has a large number of options:
In order to help support researchers using the CLI, I'm interested in adding docs for these command line tools. Are you interested in this feature, and if so, can I have some guidance about how to edit the user manual?