scverse / scanpy

Single-cell analysis in Python. Scales to >1M cells.
https://scanpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License
1.86k stars 595 forks source link

Build process having some issues #1909

Closed ivirshup closed 2 years ago

ivirshup commented 3 years ago

Current build process

Current build process (as used on azure) does not include the LICENSE, as well as a number of other files (which may note be necessary)

git checkout 1.8.0
python -m build --sdist --wheel .
tar tzf dist/scanpy-1.8.0.tar.gz   
contents of source dist ``` scanpy-1.8.0/README.rst scanpy-1.8.0/pyproject.toml scanpy-1.8.0/scanpy/__init__.py scanpy-1.8.0/scanpy/__main__.py scanpy-1.8.0/scanpy/_compat.py scanpy-1.8.0/scanpy/_metadata.py scanpy-1.8.0/scanpy/_settings.py scanpy-1.8.0/scanpy/_utils/__init__.py scanpy-1.8.0/scanpy/_utils/compute/is_constant.py scanpy-1.8.0/scanpy/cli.py scanpy-1.8.0/scanpy/datasets/10x_pbmc68k_reduced.h5ad scanpy-1.8.0/scanpy/datasets/__init__.py scanpy-1.8.0/scanpy/datasets/_datasets.py scanpy-1.8.0/scanpy/datasets/_ebi_expression_atlas.py scanpy-1.8.0/scanpy/datasets/_utils.py scanpy-1.8.0/scanpy/datasets/krumsiek11.txt scanpy-1.8.0/scanpy/datasets/toggleswitch.txt scanpy-1.8.0/scanpy/external/__init__.py scanpy-1.8.0/scanpy/external/exporting.py scanpy-1.8.0/scanpy/external/pl.py scanpy-1.8.0/scanpy/external/pp/__init__.py scanpy-1.8.0/scanpy/external/pp/_bbknn.py scanpy-1.8.0/scanpy/external/pp/_dca.py scanpy-1.8.0/scanpy/external/pp/_harmony_integrate.py scanpy-1.8.0/scanpy/external/pp/_hashsolo.py scanpy-1.8.0/scanpy/external/pp/_magic.py scanpy-1.8.0/scanpy/external/pp/_mnn_correct.py scanpy-1.8.0/scanpy/external/pp/_scanorama_integrate.py scanpy-1.8.0/scanpy/external/pp/_scrublet.py scanpy-1.8.0/scanpy/external/tl/__init__.py scanpy-1.8.0/scanpy/external/tl/_harmony_timeseries.py scanpy-1.8.0/scanpy/external/tl/_palantir.py scanpy-1.8.0/scanpy/external/tl/_phate.py scanpy-1.8.0/scanpy/external/tl/_phenograph.py scanpy-1.8.0/scanpy/external/tl/_pypairs.py scanpy-1.8.0/scanpy/external/tl/_sam.py scanpy-1.8.0/scanpy/external/tl/_trimap.py scanpy-1.8.0/scanpy/external/tl/_wishbone.py scanpy-1.8.0/scanpy/get/__init__.py scanpy-1.8.0/scanpy/get/get.py scanpy-1.8.0/scanpy/logging.py scanpy-1.8.0/scanpy/metrics/__init__.py scanpy-1.8.0/scanpy/metrics/_gearys_c.py scanpy-1.8.0/scanpy/metrics/_metrics.py scanpy-1.8.0/scanpy/metrics/_morans_i.py scanpy-1.8.0/scanpy/neighbors/__init__.py scanpy-1.8.0/scanpy/plotting/__init__.py scanpy-1.8.0/scanpy/plotting/_anndata.py scanpy-1.8.0/scanpy/plotting/_baseplot_class.py scanpy-1.8.0/scanpy/plotting/_docs.py scanpy-1.8.0/scanpy/plotting/_dotplot.py scanpy-1.8.0/scanpy/plotting/_matrixplot.py scanpy-1.8.0/scanpy/plotting/_preprocessing.py scanpy-1.8.0/scanpy/plotting/_qc.py scanpy-1.8.0/scanpy/plotting/_rcmod.py scanpy-1.8.0/scanpy/plotting/_stacked_violin.py scanpy-1.8.0/scanpy/plotting/_tools/__init__.py scanpy-1.8.0/scanpy/plotting/_tools/paga.py scanpy-1.8.0/scanpy/plotting/_tools/scatterplots.py scanpy-1.8.0/scanpy/plotting/_utils.py scanpy-1.8.0/scanpy/plotting/palettes.py scanpy-1.8.0/scanpy/preprocessing/__init__.py scanpy-1.8.0/scanpy/preprocessing/_combat.py scanpy-1.8.0/scanpy/preprocessing/_deprecated/__init__.py scanpy-1.8.0/scanpy/preprocessing/_deprecated/highly_variable_genes.py scanpy-1.8.0/scanpy/preprocessing/_distributed.py scanpy-1.8.0/scanpy/preprocessing/_docs.py scanpy-1.8.0/scanpy/preprocessing/_highly_variable_genes.py scanpy-1.8.0/scanpy/preprocessing/_normalization.py scanpy-1.8.0/scanpy/preprocessing/_pca.py scanpy-1.8.0/scanpy/preprocessing/_qc.py scanpy-1.8.0/scanpy/preprocessing/_recipes.py scanpy-1.8.0/scanpy/preprocessing/_simple.py scanpy-1.8.0/scanpy/preprocessing/_utils.py scanpy-1.8.0/scanpy/queries/__init__.py scanpy-1.8.0/scanpy/queries/_queries.py scanpy-1.8.0/scanpy/readwrite.py scanpy-1.8.0/scanpy/sim_models/__init__.py scanpy-1.8.0/scanpy/sim_models/krumsiek11.txt scanpy-1.8.0/scanpy/sim_models/krumsiek11_params.txt scanpy-1.8.0/scanpy/sim_models/toggleswitch.txt scanpy-1.8.0/scanpy/sim_models/toggleswitch_params.txt scanpy-1.8.0/scanpy/tools/__init__.py scanpy-1.8.0/scanpy/tools/_dendrogram.py scanpy-1.8.0/scanpy/tools/_diffmap.py scanpy-1.8.0/scanpy/tools/_dpt.py scanpy-1.8.0/scanpy/tools/_draw_graph.py scanpy-1.8.0/scanpy/tools/_embedding_density.py scanpy-1.8.0/scanpy/tools/_ingest.py scanpy-1.8.0/scanpy/tools/_leiden.py scanpy-1.8.0/scanpy/tools/_louvain.py scanpy-1.8.0/scanpy/tools/_marker_gene_overlap.py scanpy-1.8.0/scanpy/tools/_paga.py scanpy-1.8.0/scanpy/tools/_pca.py scanpy-1.8.0/scanpy/tools/_rank_genes_groups.py scanpy-1.8.0/scanpy/tools/_score_genes.py scanpy-1.8.0/scanpy/tools/_sim.py scanpy-1.8.0/scanpy/tools/_top_genes.py scanpy-1.8.0/scanpy/tools/_tsne.py scanpy-1.8.0/scanpy/tools/_umap.py scanpy-1.8.0/scanpy/tools/_utils.py scanpy-1.8.0/scanpy/tools/_utils_clustering.py scanpy-1.8.0/PKG-INFO ```

Flit build

Using flit to build it includes all the files I would expect, but get's the version wrong on the wheel

rm -r dist
flit build
ls dist
scanpy-1.8.0.dev112+g1a3ae03c.d20210628-py3-none-any.whl
scanpy-1.8.0.tar.gz

setup.py

Is including a bunch of files it shouldn't and is deprecated anyways.


@flying-sheep @Zethson

flying-sheep commented 3 years ago

Did you fix this already? The CheckBuild job’s task “Build & Twine check” looks fine to me:

For master:

Installing collected packages: numpy, threadpoolctl, ...
Successfully installed anndata-0.7.6 cycler-0.10.0 ...
Checking dist/scanpy-1.9.0.dev2+gc9b59137-py3-none-any.whl: PASSED
Checking dist/scanpy-1.9.0.dev2+gc9b59137.tar.gz: PASSED, with warnings
  warning: `long_description_content_type` missing. defaulting to `text/x-rst`.
  warning: `long_description` missing.

For the 1.8.x branch:

Installing collected packages: numpy, threadpoolctl, ...
Successfully installed anndata-0.7.6 cycler-0.10.0 ...
Checking dist/scanpy-1.8.1.dev5+gbcbcbccf-py3-none-any.whl: PASSED
Checking dist/scanpy-1.8.1.dev5+gbcbcbccf.tar.gz: PASSED, with warnings
  warning: `long_description_content_type` missing. defaulting to `text/x-rst`.
  warning: `long_description` missing.
ivirshup commented 3 years ago

Yeah, the task is running fine, but it's not including the license locally. It's also including a different set of files than flit does, which seems like a configuration issue.

I think we need to add some more checks to the build task.

flying-sheep commented 3 years ago

Hm, flit uses the files tracked by git by default, but things can be overridden: https://flit.readthedocs.io/en/latest/pyproject_toml.html?highlight=git#sdist-section

I wonder why it behaves differently when -m build uses flit_core vs when the flit CLI is used. @takluyver do you know why this is?

takluyver commented 3 years ago

The short answer is that flit_core, which provides the PEP 517 hooks, makes a minimal sdist which should always have the files you need to install the module, but may leave out e.g. tests and docs. The Flit CLI tries to make a 'publication quality' sdist.

It's kind of an ugly compromise, because how I approached sdists (before PEP 517) wasn't a good fit for the PEP 517 build_sdist hook. I view sdists on PyPI as like a snapshot of the development process, so it should (by default) include everything that you'd get if you checked out the corresponding tag from git (except the git history). But using git assumes that it's something the maintainer makes once and publishes on PyPI. PEP 517 defined a build_sdist hook which user tools (like pip) can call. I didn't want this to depend on git, so I gave it a way to make working but minimal sdists.

Specifying includes & excludes under [tool.flit.sdist] should affect both the Flit CLI and the PEP 517 hooks. So if you want to make the sdists to publish with python -m build or similar, you'll need to use those to determine what goes in.

ivirshup commented 3 years ago

@flying-sheep, figured out (part of) why I'm seeing the wrong version for the wheel, but the right version for the sdist via flit build. When the wheel is built, it hits this case:

https://github.com/theislab/scanpy/blob/3c7256560f53374ecf960bc329405912da8d6efe/scanpy/_metadata.py#L35-L41

Where it sees the currently installed version, which may not be aware of new tag (since I'm making the build in a clean directory). I imagine this is why using python -m build get's it right, since it's building everything in a fresh environment.

However, getting this issue with the METADATA for the sdist generating warnings, but not having this problem with the wheel:

$ twine check dist/*
Checking dist/scanpy-1.8.1-py3-none-any.whl: PASSED
Checking dist/scanpy-1.8.1.tar.gz: PASSED, with warnings
  warning: `long_description_content_type` missing. defaulting to `text/x-rst`.
  warning: `long_description` missing.
flying-sheep commented 3 years ago

Thank you for the detailed explanation @takluyver, this helps a lot!

@ivirshup A dev environment should successfully execute the try block. I designed that except clause to be hit when people import the installed production version.

ivirshup commented 2 years ago

Seems to be resolved