ebi-gene-expression-group / scanpy-scripts

Scripts for using scanpy
Apache License 2.0

What is scanpy-run-umap default for --maxiter INTEGER #120

Closed swbioinf closed 1 year ago

swbioinf commented 1 year ago

Hi,

I notice that the --help for scanpy-run-umap lists no default for --maxiter. What is the default? I can't figure it out from the code.

NB: I'm trying to debug an issue within Galaxy (I'm just a user; I'll talk to the admin folk there too). If I don't specify --maxiter, I get a very spiky UMAP plot that looks like it was run with a very low maxiter, similar to what I get when I set it to 1 within Galaxy. I don't see this locally, so perhaps there is a version or UMAP implementation difference. I'm not sure where the default is defined, so I can't check it, and I haven't been able to fully reproduce this on my local scanpy-scripts.

Thanks

NB: the --help output, which shows no default for --maxiter:

  --n-components INTEGER          The number of dimensions of the embedding.
                                  [default: 2]
  --maxiter INTEGER               The number of iterations of the
                                  optimization.
  --alpha FLOAT                   The initial learning rate for the embedding
                                  optimization.  [default: 1.0]

pcm32 commented 1 year ago

The default of scanpy-scripts for maxiter is None:

https://github.com/ebi-gene-expression-group/scanpy-scripts/blob/e53693336d8b37f0231d10d672b49c766d9c325b/scanpy_scripts/cmd_options.py#L1024-L1030

This means that scanpy-scripts leaves scanpy to use its own default. According to the scanpy documentation, its default is also None (https://scanpy.readthedocs.io/en/stable/generated/scanpy.tl.umap.html).
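To make the chain of None defaults concrete, here is a minimal sketch of how an unset --maxiter could end up as an actual epoch count. It is not the scanpy or umap-learn source. `resolve_epochs` is a hypothetical helper, the 500/200 values follow what the umap-learn documentation describes for `n_epochs=None` (500 for small datasets, 200 for large ones), and the 10,000-sample threshold is an assumption that may differ between versions.

```python
from typing import Optional

# Assumption: cut-off between "small" and "large" datasets. The umap-learn
# docs only say "200 for large datasets, 500 for small"; the exact threshold
# used here is illustrative and may not match every version.
SMALL_DATASET_LIMIT = 10_000


def resolve_epochs(maxiter: Optional[int], n_samples: int) -> int:
    """Hypothetical helper: choose the epoch count when maxiter may be None."""
    if maxiter is not None:
        # An explicit value always wins, even a very low one such as 1.
        return maxiter
    # No value given: fall back to a size-dependent default.
    return 500 if n_samples <= SMALL_DATASET_LIMIT else 200


# Not passing --maxiter is equivalent to maxiter=None in this sketch.
print(resolve_epochs(None, n_samples=3_000))
print(resolve_epochs(None, n_samples=50_000))
print(resolve_epochs(1, n_samples=3_000))
```

If an embedding looks as if it ran for only a handful of epochs, one thing worth checking is whether something between the command line and UMAP turns "not set" into a small number instead of None.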

On Galaxy, I would expect the wrapper to just use the defaults of the underlying packages...

...yes, it is optional with no default:

https://github.com/ebi-gene-expression-group/container-galaxy-sc-tertiary/blob/develop/tools/tertiary-analysis/scanpy/scanpy-run-umap.xml#L73

scanpy-scripts uses Click, which largely automates the help text. So if no default is listed, it is usually None, I guess.
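The Click behaviour can be checked with a toy command. The sketch below is not the real scanpy-run-umap; `run_umap` and its two options are stand-ins that copy only the option names and help strings from the --help output quoted earlier in the thread. It assumes the `click` package is installed.

```python
import click
from click.testing import CliRunner


@click.command()
@click.option("--n-components", type=int, default=2, show_default=True,
              help="The number of dimensions of the embedding.")
@click.option("--maxiter", type=int, default=None, show_default=True,
              help="The number of iterations of the optimization.")
def run_umap(n_components, maxiter):
    """Toy stand-in for scanpy-run-umap (hypothetical, for illustration)."""
    # repr() makes it obvious when the value is None rather than a number.
    click.echo(f"n_components={n_components!r} maxiter={maxiter!r}")


runner = CliRunner()

# Render the help text: only options with a non-None default get a
# "[default: ...]" suffix, even though show_default=True is set on both.
help_text = runner.invoke(run_umap, ["--help"]).output
print(help_text)

# Run without --maxiter: the function receives None for it.
no_flag_output = runner.invoke(run_umap, []).output
print(no_flag_output)
```

Running this should show a default next to --n-components but none next to --maxiter, which matches the help output in the original question.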

pcm32 commented 1 year ago

So I guess the answer is that you need to play around with the value :-). I'm going to close this issue, but don't hesitate to get in touch again. Good luck!

pcm32 commented 1 year ago

To help reproduce results, make sure that you are using the same package versions and the same random seed (UMAP is stochastic).
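For comparing a local install against the Galaxy environment, a small version report can help. This is only a sketch: `installed_versions` is a hypothetical helper, and the package list is a guess at what is relevant here, so adjust it as needed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Assumed list of relevant distributions (PyPI names); edit to taste.
report = installed_versions(["scanpy", "scanpy-scripts", "umap-learn", "numba"])
for name, ver in report.items():
    print(f"{name}: {ver or 'not installed'}")
```

For the seed, `scanpy.tl.umap` takes a `random_state` argument, so setting it to the same value in both environments removes one source of variation.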

swbioinf commented 1 year ago

Thanks @pcm32 - that answers that :)