3dem / relion

Image-processing software for cryo-electron microscopy
https://relion.readthedocs.io/en/latest/
GNU General Public License v2.0

Document CLI #912

Open multimeric opened 2 years ago

multimeric commented 2 years ago

I note that Relion has a CLI, which includes a ton of binaries. For example:

relion                             relion_external_reconstruct        relion_motion_refine               relion_preprocess                  relion_stack_create
relion_align_symmetry              relion_find_tiltpairs              relion_motion_refine_mpi           relion_preprocess_mpi              relion_star_datablock_ctfdat
relion_autopick                    relion_flex_analyse                relion_mrc2vtk                     relion_project                     relion_star_datablock_singlefiles
relion_autopick_mpi                relion_flex_analyse_mpi            relion_particle_FCC                relion_qsub.csh                    relion_star_datablock_stack
relion_convert_star                relion_helix_inimodel2d            relion_particle_reposition         relion_reconstruct                 relion_star_handler
relion_convert_to_tiff             relion_helix_toolbox               relion_particle_select             relion_reconstruct_mpi             relion_star_loopheader
relion_convert_to_tiff_mpi         relion_image_handler               relion_particle_subtract           relion_refine                      relion_star_plottable
relion_ctf_mask_test               relion_import                      relion_particle_subtract_mpi       relion_refine_mpi                  relion_star_printtable
relion_ctf_refine                  relion_localsym                    relion_particle_symmetry_expand    relion_reposition                  relion_tiltpair_plot
relion_ctf_refine_mpi              relion_localsym_mpi                relion_pipeliner                   relion_run_ctffind                 relion_tomo_test
relion_ctf_toolbox                 relion_maingui                     relion_plot_delocalisation         relion_run_ctffind_mpi             
relion_demodulate                  relion_manualpick                  relion_postprocess                 relion_run_motioncorr              
relion_display                     relion_mask_create                 relion_postprocess_mpi             relion_run_motioncorr_mpi          
relion_estimate_gain               relion_merge_particles             relion_prepare_subtomo             relion_scheduler                   

Also, each of these binaries has a large number of options:

+++ RELION: command line arguments (with defaults for optional ones between parantheses) +++
====== General options ===== 
                                --i : Input STAR file, image (.mrc) or movie/stack (.mrcs)
                             --o () : Output name (for STAR-input: insert this string before each image's extension)
====== image-by-constant operations ===== 
            --multiply_constant (1) : Multiply the image(s) pixel values by this constant
              --divide_constant (1) : Divide the image(s) pixel values by this constant
                --add_constant (0.) : Add this constant to the image(s) pixel values
           --subtract_constant (0.) : Subtract this constant from the image(s) pixel values
           --threshold_above (999.) : Set all values higher than this value to this value
          --threshold_below (-999.) : Set all values lower than this value to this value
====== image-by-image operations ===== 
                      --multiply () : Multiply input image(s) by the pixel values in this image
                        --divide () : Divide input image(s) by the pixel values in this image
                           --add () : Add the pixel values in this image to the input image(s) 
                      --subtract () : Subtract the pixel values in this image to the input image(s) 
                           --fsc () : Calculate FSC curve of the input image with this image
                    --power (false) : Calculate power spectrum (|F|^2) of the input image
                  --adjust_power () : Adjust the power spectrum of the input image to be the same as this image 
                --fourier_filter () : Multiply the Fourier transform of the input image(s) with this one image 

In order to help support researchers using the CLI, I'm interested in adding docs for these command line tools. Are you interested in this feature, and if so, can I have some guidance about how to edit the user manual?
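
As an illustration of how the options listed above map onto a scripted call, here is a minimal Python sketch. It assumes the help text shown is that of `relion_image_handler` and uses only flags that appear in that listing (`--i`, `--o`, `--multiply_constant`, `--add_constant`). The function name and the file names are hypothetical, and the sketch only builds the command line; it does not run anything.

```python
import shlex


def image_handler_cmd(input_path, output_path, multiply_constant=None, add_constant=None):
    """Build an argv list from flags shown in the help text quoted above.

    Assumption: the binary is relion_image_handler; check its own help
    output before relying on any flag.
    """
    argv = ["relion_image_handler", "--i", input_path, "--o", output_path]
    if multiply_constant is not None:
        argv += ["--multiply_constant", str(multiply_constant)]
    if add_constant is not None:
        argv += ["--add_constant", str(add_constant)]
    return argv


# Hypothetical file names, for illustration only.
cmd = image_handler_cmd("map.mrc", "map_scaled.mrc", multiply_constant=2)
print(shlex.join(cmd))
```

Keeping the command as a list of arguments rather than one string avoids shell quoting problems when the list is later passed to `subprocess.run`.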

biochem-fan commented 2 years ago

Thank you very much for your suggestion. I do think command line documentation is good and necessary, but this is a daunting task in reality.

Things to consider:

Yes, this is a mess that has accumulated over many years of exploratory research.

Ideally, we should sort this out by adding more checks, grouping options by expert level, making commands more general and documenting the remaining assumptions. But this would take a huge effort, which I am not sure the core developers can afford.

Perhaps a pragmatic approach is to start documenting commands and options that are not accessible from the GUI but are often used. Candidates include:

(Even developers rarely use other commands from the CLI)

multimeric commented 2 years ago

The approach I am hoping to take is twofold.

Firstly I'm aiming to create a comprehensive CLI page which contains a raw dump of every command and every option. This is implemented in https://github.com/3dem/relion-documents/pull/2. I'm aware that this violates some of the guidelines that you have just provided, but I think it still provides the benefits of:

Also, if and when the devs are able to improve some of the CLI descriptions as you have suggested (e.g. in terms of grouping flags), this will automatically include those improvements.

The second step I am hoping to take is higher-level documentation. I was thinking of adding a "CLI version" of some of the tutorials, which just shows the equivalent CLI command for each of the steps in, for example, this tutorial. However, I'm happy to be guided on this. I could instead write a separate new tutorial, or perhaps make a small CLI user guide for the list of useful commands you have provided.


> Most commands are supposed to be executed from the GUI

Right, but one use case I want to support is using this from a workflow manager etc to automate a workflow (or at least parts of it), so we can't/don't want to use a GUI for this use case
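
The workflow-manager use case described above can be sketched in a few lines of Python. This is a minimal illustration, not how RELION or any particular workflow manager does it: each step is an argv list, steps run in order, and the run stops at the first non-zero exit code. The function name `run_steps` is hypothetical, and the steps below are stand-in Python one-liners so that the sketch runs without RELION installed.

```python
import subprocess
import sys


def run_steps(steps, dry_run=False):
    """Run argv lists in order and stop at the first non-zero exit code.

    Returns the number of steps that completed successfully.
    """
    done = 0
    for argv in steps:
        if dry_run:
            # Only show what would be executed.
            print("would run:", " ".join(argv))
            done += 1
            continue
        result = subprocess.run(argv)
        if result.returncode != 0:
            print(f"step failed ({result.returncode}): {argv[0]}", file=sys.stderr)
            break
        done += 1
    return done


# Stand-in commands (assumption: real steps would be RELION binaries).
steps = [
    [sys.executable, "-c", "print('step 1')"],
    [sys.executable, "-c", "import sys; sys.exit(3)"],
    [sys.executable, "-c", "print('never reached')"],
]
completed = run_steps(steps)
```

The `dry_run` flag makes it possible to review the full command sequence before anything is executed, which matters given the caveats about expert-only options raised in this thread.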

> Many commands have highly technical options that are relevant only in special cases and should be used by experts who know what they are doing. Inadequate parameters can lead to scientifically invalid results (e.g. resolution overestimation) and are dangerous.

> Some options are not orthogonal, i.e., cannot be used simultaneously.

> Some options have implicit assumptions on inputs (e.g. Cannot use particle.star file after highpass filter with image handler - ERROR: readMRC: Image number 50 exceeds stack size 10 of image #911 (comment))

I'm hoping the higher level user guide would be able to explain these potential pitfalls.
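
One kind of implicit input assumption quoted above (an image number that exceeds the stack size) could in principle be checked before running a command. The Python sketch below is only an illustration and does not claim to diagnose the issue in #911. It assumes that particle references have the form `index@stackfile` with a 1-based index, that the stack is a little-endian MRC file, and that the third 32-bit integer of the header (NZ) holds the number of images. All function names are hypothetical.

```python
import os
import re
import struct
import tempfile

# Assumed reference format: "<1-based index>@<path to stack>".
PARTICLE_REF = re.compile(r"^(\d+)@(.+)$")


def stack_size(path):
    """Read NZ, the third int32 of the MRC header (assumed little-endian)."""
    with open(path, "rb") as fh:
        _nx, _ny, nz = struct.unpack("<3i", fh.read(12))
    return nz


def out_of_range(refs, size_of=stack_size):
    """Return the references whose index exceeds the size of their stack."""
    bad = []
    sizes = {}
    for ref in refs:
        match = PARTICLE_REF.match(ref)
        if match is None:
            continue  # plain file name, no index to check
        index, path = int(match.group(1)), match.group(2)
        if path not in sizes:
            sizes[path] = size_of(path)
        if index > sizes[path]:
            bad.append(ref)
    return bad


# Demo with a fake 1024-byte header describing a stack of 10 images.
with tempfile.NamedTemporaryFile(suffix=".mrcs", delete=False) as tmp:
    tmp.write(struct.pack("<3i", 64, 64, 10) + bytes(1012))
refs = [f"000005@{tmp.name}", f"000050@{tmp.name}"]
bad = out_of_range(refs)
print(bad)  # lists only the reference with index 50
os.unlink(tmp.name)
```

A higher-level user guide could pair each documented pitfall with a small pre-flight check of this kind.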

biochem-fan commented 2 years ago

I am rather ambivalent regarding the first step and your pull request. I agree that it can be useful for some people; at the same time I would say one can just run the command locally to see the list of arguments. I am not sure if we want to actively advertise the CLI considering that there are many caveats. I will leave the decision to @scheres whether we merge your pull request.

Regarding the second step, high-level documentation:

> Right, but one use case I want to support is using this from a workflow manager etc to automate a workflow (or at least parts of it), so we can't/don't want to use a GUI for this use case

Such automation and workflow management are exactly what the RELION (and newer CCPEM) schedulers do. They have a command line interface. Users build a workflow comprising multiple jobs in the GUI (or a CCPEM Web interface) and the scheduler executes it.

Honestly speaking, I am against building command line arguments by hand except for the commands I listed above. So I am not very keen on making a CLI version of the RELION tutorial. On the other hand, I am positive about writing high-level documentation for relion_image_handler, relion_reconstruct etc.

multimeric commented 2 years ago

I understand your concerns, and I'm happy to add a disclaimer to the top saying something like "we recommend generating CLI commands using the GUI where possible". On the other hand though, I think users who seek out the CLI for tools are often quite used to the kinds of problems you mention where there are implicit rules about how to execute each tool. I think empowering them is a useful goal.

To give you another motivating example: I'm looking into a project that involves benchmarking some relion workflows on our HPC and determining how we can optimise them. As someone new to the cryoEM world, it wasn't initially clear to me whether this was even possible, because relion seemed to be a GUI app without a CLI. Having a reference like this helps to advertise this capability of relion.

mopk commented 1 year ago

Any decision regarding the pull request? My fifty cents is that a CLI with documentation would, in particular, allow automation of RELION performance benchmarks. If it were in place, one could take known input data with known good-enough results and run the same scripted sets of phases/jobs, differing only in small subsets of parameters related to hardware configuration.

So I have the same task here: I also need to somehow optimise RELION workflows on HPC.
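
The benchmarking idea above (same scripted job, differing only in a small hardware-related subset of parameters) could be sketched in Python as follows. This is a minimal sketch under stated assumptions: `benchmark` is a hypothetical helper, the base command is a stand-in Python one-liner so the example runs without RELION, and the `--j` thread-count argument is used purely as an assumed example of a hardware-related parameter; check each tool's own help output for the real flags.

```python
import subprocess
import sys
import time


def benchmark(base_argv, variants):
    """Time base_argv plus each variant's extra arguments.

    Returns a dict mapping each variant label to wall-clock seconds.
    """
    timings = {}
    for label, extra in variants.items():
        start = time.perf_counter()
        # check=True aborts the benchmark if a run fails.
        subprocess.run(base_argv + extra, check=True)
        timings[label] = time.perf_counter() - start
    return timings


# Stand-in command: it just echoes the extra arguments it receives.
base = [sys.executable, "-c", "import sys; print(sys.argv[1:])"]
variants = {
    "threads=1": ["--j", "1"],  # assumed flag, for illustration only
    "threads=4": ["--j", "4"],
}
timings = benchmark(base, variants)
for label, seconds in timings.items():
    print(f"{label}: {seconds:.3f} s")
```

For a real comparison one would also repeat each variant several times and confirm that the outputs still match the known good results.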