hail-is / hail

Cloud-native genomic dataframes and batch computing
https://hail.is
MIT License

Hail 0.2.120 - hail-on-cluster - pip-compile: command not found #13445

Closed mhebrard closed 10 months ago

mhebrard commented 11 months ago

What happened?

I am trying to install Hail v0.2.120 on AWS EMR 6.9.0

Versions:

After updating Python to 3.8 and cloning the Hail repository, I compile Hail with the command below:

sudo make install-on-cluster HAIL_COMPILE_NATIVES=1 SCALA_VERSION=2.12.15 SPARK_VERSION=3.3.0

This fails with the following error:

+ pip-compile --quiet python/requirements.txt python/pinned-requirements.txt --output-file=/tmp/tmp.aWUFJ1BMnP
../check_pip_requirements.sh: line 13: pip-compile: command not found

However, pip-compile is installed:

pip-compile --help
Usage: pip-compile [OPTIONS] [SRC_FILES]...

  Compiles requirements.txt from requirements.in, pyproject.toml, setup.cfg,
  or setup.py specs.

Options:

Note that make clean did not solve the issue.

See the attached logs.

Version

0.2.120

Relevant log output

BUILD SUCCESSFUL in 2m 46s
4 actionable tasks: 4 executed
cp -f build/libs/hail-all-spark.jar python/hail/backend/hail-all-spark.jar
rm -rf build/deploy
mkdir -p build/deploy
mkdir -p build/deploy/src
cp ../README.md build/deploy/
rsync -r \
    --exclude '.eggs/' \
    --exclude '.pytest_cache/' \
    --exclude '__pycache__/' \
    --exclude 'benchmark_hail/' \
    --exclude '.mypy_cache/' \
    --exclude 'docs/' \
    --exclude 'dist/' \
    --exclude 'test/' \
    --exclude '*.log' \
    python/ build/deploy/
# Clear the bdist build cache before building the wheel
cd build/deploy; rm -rf build; python3 setup.py -q sdist bdist_wheel
/usr/lib64/python3.8/distutils/dist.py:274: UserWarning: Unknown distribution option: 'long_description_content_type'
  warnings.warn(msg)
installing to build/bdist.linux-x86_64/wheel
creating build/bdist.linux-x86_64/wheel/hail-0.2.120.dist-info/WHEEL
creating 'dist/hail-0.2.120-py3-none-any.whl' and adding 'build/bdist.linux-x86_64/wheel' to it
adding 'hail/__init__.py'
adding 'hail/builtin_references.py'
adding 'hail/conftest.py'
adding 'hail/context.py'
adding 'hail/hail_logging.py'
adding 'hail/hail_pip_version'
adding 'hail/hail_revision'
adding 'hail/hail_version'
adding 'hail/matrixtable.py'
adding 'hail/table.py'
adding 'hail/backend/__init__.py'
adding 'hail/backend/backend.py'
adding 'hail/backend/hail-all-spark.jar'
adding 'hail/backend/local_backend.py'
adding 'hail/backend/py4j_backend.py'
adding 'hail/backend/service_backend.py'
adding 'hail/backend/spark_backend.py'
adding 'hail/experimental/__init__.py'
adding 'hail/experimental/codec.py'
adding 'hail/experimental/compile.py'
adding 'hail/experimental/datasets.json'
adding 'hail/experimental/datasets.py'
adding 'hail/experimental/db.py'
adding 'hail/experimental/export_entries_by_col.py'
adding 'hail/experimental/expressions.py'
adding 'hail/experimental/filtering_allele_frequency.py'
adding 'hail/experimental/full_outer_join_mt.py'
adding 'hail/experimental/function.py'
adding 'hail/experimental/haplotype_freq_em.py'
adding 'hail/experimental/import_gtf.py'
adding 'hail/experimental/interact.py'
adding 'hail/experimental/ld_score_regression.py'
adding 'hail/experimental/ldscore.py'
adding 'hail/experimental/ldscsim.py'
adding 'hail/experimental/lens.py'
adding 'hail/experimental/loop.py'
adding 'hail/experimental/pca.py'
adding 'hail/experimental/phase_by_transmission.py'
adding 'hail/experimental/plots.py'
adding 'hail/experimental/table_ndarray_utils.py'
adding 'hail/experimental/tidyr.py'
adding 'hail/experimental/time.py'
adding 'hail/experimental/write_multiple.py'
adding 'hail/experimental/sparse_mt/__init__.py'
adding 'hail/experimental/sparse_mt/densify.py'
adding 'hail/experimental/sparse_mt/sparse_split_multi.py'
adding 'hail/expr/__init__.py'
adding 'hail/expr/blockmatrix_type.py'
adding 'hail/expr/builders.py'
adding 'hail/expr/functions.py'
adding 'hail/expr/matrix_type.py'
adding 'hail/expr/nat.py'
adding 'hail/expr/table_type.py'
adding 'hail/expr/type_parsing.py'
adding 'hail/expr/types.py'
adding 'hail/expr/aggregators/__init__.py'
adding 'hail/expr/aggregators/aggregators.py'
adding 'hail/expr/expressions/__init__.py'
adding 'hail/expr/expressions/base_expression.py'
adding 'hail/expr/expressions/expression_typecheck.py'
adding 'hail/expr/expressions/expression_utils.py'
adding 'hail/expr/expressions/indices.py'
adding 'hail/expr/expressions/typed_expressions.py'
adding 'hail/fs/__init__.py'
adding 'hail/fs/hadoop_fs.py'
adding 'hail/genetics/__init__.py'
adding 'hail/genetics/call.py'
adding 'hail/genetics/locus.py'
adding 'hail/genetics/pedigree.py'
adding 'hail/genetics/reference_genome.py'
adding 'hail/ggplot/__init__.py'
adding 'hail/ggplot/aes.py'
adding 'hail/ggplot/coord_cartesian.py'
adding 'hail/ggplot/facets.py'
adding 'hail/ggplot/geoms.py'
adding 'hail/ggplot/ggplot.py'
adding 'hail/ggplot/labels.py'
adding 'hail/ggplot/scale.py'
adding 'hail/ggplot/stats.py'
adding 'hail/ggplot/utils.py'
adding 'hail/ir/__init__.py'
adding 'hail/ir/base_ir.py'
adding 'hail/ir/blockmatrix_ir.py'
adding 'hail/ir/blockmatrix_reader.py'
adding 'hail/ir/blockmatrix_writer.py'
adding 'hail/ir/export_type.py'
adding 'hail/ir/ir.py'
adding 'hail/ir/matrix_ir.py'
adding 'hail/ir/matrix_reader.py'
adding 'hail/ir/matrix_writer.py'
adding 'hail/ir/register_aggregators.py'
adding 'hail/ir/register_functions.py'
adding 'hail/ir/renderer.py'
adding 'hail/ir/table_ir.py'
adding 'hail/ir/table_reader.py'
adding 'hail/ir/table_writer.py'
adding 'hail/ir/utils.py'
adding 'hail/linalg/__init__.py'
adding 'hail/linalg/blockmatrix.py'
adding 'hail/linalg/utils/__init__.py'
adding 'hail/linalg/utils/misc.py'
adding 'hail/methods/__init__.py'
adding 'hail/methods/family_methods.py'
adding 'hail/methods/impex.py'
adding 'hail/methods/import_lines_helpers.py'
adding 'hail/methods/misc.py'
adding 'hail/methods/pca.py'
adding 'hail/methods/qc.py'
adding 'hail/methods/statgen.py'
adding 'hail/methods/relatedness/__init__.py'
adding 'hail/methods/relatedness/identity_by_descent.py'
adding 'hail/methods/relatedness/king.py'
adding 'hail/methods/relatedness/mating_simulation.py'
adding 'hail/methods/relatedness/pc_relate.py'
adding 'hail/nd/__init__.py'
adding 'hail/nd/nd.py'
adding 'hail/plot/__init__.py'
adding 'hail/plot/plots.py'
adding 'hail/stats/__init__.py'
adding 'hail/stats/linear_mixed_model.py'
adding 'hail/typecheck/__init__.py'
adding 'hail/typecheck/check.py'
adding 'hail/utils/__init__.py'
adding 'hail/utils/byte_reader.py'
adding 'hail/utils/deduplicate.py'
adding 'hail/utils/frozendict.py'
adding 'hail/utils/genomic_range_table.py'
adding 'hail/utils/hadoop_utils.py'
adding 'hail/utils/interval.py'
adding 'hail/utils/java.py'
adding 'hail/utils/jsonx.py'
adding 'hail/utils/linkedlist.py'
adding 'hail/utils/misc.py'
adding 'hail/utils/placement_tree.py'
adding 'hail/utils/struct.py'
adding 'hail/utils/tutorial.py'
adding 'hail/vds/__init__.py'
adding 'hail/vds/functions.py'
adding 'hail/vds/methods.py'
adding 'hail/vds/variant_dataset.py'
adding 'hail/vds/combiner/__init__.py'
adding 'hail/vds/combiner/combine.py'
adding 'hail/vds/combiner/variant_dataset_combiner.py'
adding 'hailtop/__init__.py'
adding 'hailtop/dictfix.py'
adding 'hailtop/frozendict.py'
adding 'hailtop/hail_frozenlist.py'
adding 'hailtop/hail_logging.py'
adding 'hailtop/hail_version'
adding 'hailtop/httpx.py'
adding 'hailtop/py.typed'
adding 'hailtop/test_utils.py'
adding 'hailtop/timex.py'
adding 'hailtop/tls.py'
adding 'hailtop/yamlx.py'
adding 'hailtop/aiocloud/__init__.py'
adding 'hailtop/aiocloud/aioaws/__init__.py'
adding 'hailtop/aiocloud/aioaws/fs.py'
adding 'hailtop/aiocloud/aioazure/__init__.py'
adding 'hailtop/aiocloud/aioazure/credentials.py'
adding 'hailtop/aiocloud/aioazure/fs.py'
adding 'hailtop/aiocloud/aioazure/session.py'
adding 'hailtop/aiocloud/aioazure/client/__init__.py'
adding 'hailtop/aiocloud/aioazure/client/arm_client.py'
adding 'hailtop/aiocloud/aioazure/client/base_client.py'
adding 'hailtop/aiocloud/aioazure/client/compute_client.py'
adding 'hailtop/aiocloud/aioazure/client/graph_client.py'
adding 'hailtop/aiocloud/aioazure/client/network_client.py'
adding 'hailtop/aiocloud/aioazure/client/pricing_client.py'
adding 'hailtop/aiocloud/aioazure/client/resources_client.py'
adding 'hailtop/aiocloud/aiogoogle/__init__.py'
adding 'hailtop/aiocloud/aiogoogle/credentials.py'
adding 'hailtop/aiocloud/aiogoogle/session.py'
adding 'hailtop/aiocloud/aiogoogle/user_config.py'
adding 'hailtop/aiocloud/aiogoogle/client/__init__.py'
adding 'hailtop/aiocloud/aiogoogle/client/base_client.py'
adding 'hailtop/aiocloud/aiogoogle/client/bigquery_client.py'
adding 'hailtop/aiocloud/aiogoogle/client/billing_client.py'
adding 'hailtop/aiocloud/aiogoogle/client/compute_client.py'
adding 'hailtop/aiocloud/aiogoogle/client/container_client.py'
adding 'hailtop/aiocloud/aiogoogle/client/iam_client.py'
adding 'hailtop/aiocloud/aiogoogle/client/logging_client.py'
adding 'hailtop/aiocloud/aiogoogle/client/storage_client.py'
adding 'hailtop/aiocloud/common/__init__.py'
adding 'hailtop/aiocloud/common/base_client.py'
adding 'hailtop/aiocloud/common/credentials.py'
adding 'hailtop/aiocloud/common/session.py'
adding 'hailtop/aiogoogle/__init__.py'
adding 'hailtop/aiotools/__init__.py'
adding 'hailtop/aiotools/aio_contextlib.py'
adding 'hailtop/aiotools/copy.py'
adding 'hailtop/aiotools/delete.py'
adding 'hailtop/aiotools/diff.py'
adding 'hailtop/aiotools/local_fs.py'
adding 'hailtop/aiotools/router_fs.py'
adding 'hailtop/aiotools/tasks.py'
adding 'hailtop/aiotools/utils.py'
adding 'hailtop/aiotools/weighted_semaphore.py'
adding 'hailtop/aiotools/fs/__init__.py'
adding 'hailtop/aiotools/fs/copier.py'
adding 'hailtop/aiotools/fs/exceptions.py'
adding 'hailtop/aiotools/fs/fs.py'
adding 'hailtop/aiotools/fs/stream.py'
adding 'hailtop/auth/__init__.py'
adding 'hailtop/auth/auth.py'
adding 'hailtop/auth/sql_config.py'
adding 'hailtop/auth/tokens.py'
adding 'hailtop/batch/__init__.py'
adding 'hailtop/batch/backend.py'
adding 'hailtop/batch/batch.py'
adding 'hailtop/batch/batch_pool_executor.py'
adding 'hailtop/batch/conftest.py'
adding 'hailtop/batch/docker.py'
adding 'hailtop/batch/exceptions.py'
adding 'hailtop/batch/globals.py'
adding 'hailtop/batch/hail_genetics_images.py'
adding 'hailtop/batch/job.py'
adding 'hailtop/batch/resource.py'
adding 'hailtop/batch/utils.py'
adding 'hailtop/batch_client/__init__.py'
adding 'hailtop/batch_client/aioclient.py'
adding 'hailtop/batch_client/client.py'
adding 'hailtop/batch_client/globals.py'
adding 'hailtop/batch_client/parse.py'
adding 'hailtop/cleanup_gcr/__init__.py'
adding 'hailtop/cleanup_gcr/__main__.py'
adding 'hailtop/config/__init__.py'
adding 'hailtop/config/deploy_config.py'
adding 'hailtop/config/user_config.py'
adding 'hailtop/fs/__init__.py'
adding 'hailtop/fs/fs.py'
adding 'hailtop/fs/fs_utils.py'
adding 'hailtop/fs/router_fs.py'
adding 'hailtop/fs/stat_result.py'
adding 'hailtop/hailctl/__init__.py'
adding 'hailtop/hailctl/__main__.py'
adding 'hailtop/hailctl/deploy.yaml'
adding 'hailtop/hailctl/describe.py'
adding 'hailtop/hailctl/auth/__init__.py'
adding 'hailtop/hailctl/auth/cli.py'
adding 'hailtop/hailctl/auth/create_user.py'
adding 'hailtop/hailctl/auth/delete_user.py'
adding 'hailtop/hailctl/auth/login.py'
adding 'hailtop/hailctl/batch/__init__.py'
adding 'hailtop/hailctl/batch/batch_cli_utils.py'
adding 'hailtop/hailctl/batch/cli.py'
adding 'hailtop/hailctl/batch/list_batches.py'
adding 'hailtop/hailctl/batch/submit.py'
adding 'hailtop/hailctl/batch/billing/__init__.py'
adding 'hailtop/hailctl/batch/billing/cli.py'
adding 'hailtop/hailctl/config/__init__.py'
adding 'hailtop/hailctl/config/cli.py'
adding 'hailtop/hailctl/dataproc/__init__.py'
adding 'hailtop/hailctl/dataproc/cli.py'
adding 'hailtop/hailctl/dataproc/cluster_config.py'
adding 'hailtop/hailctl/dataproc/connect.py'
adding 'hailtop/hailctl/dataproc/deploy_metadata.py'
adding 'hailtop/hailctl/dataproc/diagnose.py'
adding 'hailtop/hailctl/dataproc/gcloud.py'
adding 'hailtop/hailctl/dataproc/modify.py'
adding 'hailtop/hailctl/dataproc/start.py'
adding 'hailtop/hailctl/dataproc/submit.py'
adding 'hailtop/hailctl/dataproc/utils.py'
adding 'hailtop/hailctl/dev/__init__.py'
adding 'hailtop/hailctl/dev/ci_client.py'
adding 'hailtop/hailctl/dev/cli.py'
adding 'hailtop/hailctl/dev/config.py'
adding 'hailtop/hailctl/hdinsight/__init__.py'
adding 'hailtop/hailctl/hdinsight/cli.py'
adding 'hailtop/hailctl/hdinsight/start.py'
adding 'hailtop/hailctl/hdinsight/submit.py'
adding 'hailtop/utils/__init__.py'
adding 'hailtop/utils/process.py'
adding 'hailtop/utils/rate_limiter.py'
adding 'hailtop/utils/rates.py'
adding 'hailtop/utils/rich_progress_bar.py'
adding 'hailtop/utils/serialization.py'
adding 'hailtop/utils/time.py'
adding 'hailtop/utils/utils.py'
adding 'hailtop/utils/validate/__init__.py'
adding 'hailtop/utils/validate/validate.py'
adding 'hail-0.2.120.dist-info/METADATA'
adding 'hail-0.2.120.dist-info/WHEEL'
adding 'hail-0.2.120.dist-info/entry_points.txt'
adding 'hail-0.2.120.dist-info/top_level.txt'
adding 'hail-0.2.120.dist-info/RECORD'
removing build/bdist.linux-x86_64/wheel
python3 -m pip install 'pip-tools==6.13.0' && bash ../check_pip_requirements.sh python
Requirement already satisfied: pip-tools==6.13.0 in /usr/local/lib/python3.8/site-packages (6.13.0)
Requirement already satisfied: build in /usr/local/lib/python3.8/site-packages (from pip-tools==6.13.0) (0.10.0)
Requirement already satisfied: click>=8 in /usr/local/lib/python3.8/site-packages (from pip-tools==6.13.0) (8.1.6)
Requirement already satisfied: pip>=22.2 in /usr/local/lib/python3.8/site-packages (from pip-tools==6.13.0) (23.2.1)
Requirement already satisfied: setuptools in /usr/lib/python3.8/site-packages (from pip-tools==6.13.0) (38.4.0)
Requirement already satisfied: wheel in /usr/local/lib/python3.8/site-packages (from pip-tools==6.13.0) (0.41.1)
Requirement already satisfied: packaging>=19.0 in /usr/local/lib/python3.8/site-packages (from build->pip-tools==6.13.0) (23.1)
Requirement already satisfied: pyproject_hooks in /usr/local/lib/python3.8/site-packages (from build->pip-tools==6.13.0) (1.0.0)
Requirement already satisfied: tomli>=1.1.0 in /usr/local/lib/python3.8/site-packages (from build->pip-tools==6.13.0) (2.0.1)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
+ for package in '$@'
+ reqs=python/requirements.txt
+ pinned=python/pinned-requirements.txt
++ mktemp
+ new_pinned=/tmp/tmp.aWUFJ1BMnP
++ mktemp
+ pinned_no_comments=/tmp/tmp.9UoEuYguOd
++ mktemp
+ new_pinned_no_comments=/tmp/tmp.88KxU992pQ
+ PATH=/sbin:/bin:/usr/sbin:/usr/bin:/root/.local/bin
+ pip-compile --quiet python/requirements.txt python/pinned-requirements.txt --output-file=/tmp/tmp.aWUFJ1BMnP
../check_pip_requirements.sh: line 13: pip-compile: command not found
make: *** [check-pip-lockfile] Error 127

danking commented 11 months ago

@daniel-goldstein can you take a peek at this?

danking commented 10 months ago

@mhebrard I notice you're using sudo make. I suspect this means that Hail's build is running under a modified PATH that lacks pip-compile. We'll fix our install-on-cluster target so that "make the artifact" and "install" are separate steps (so you can build as a normal user but install as root). In the meantime, apply this patch:

diff --git a/hail/Makefile b/hail/Makefile
index dabe146d3a..e12ac791c4 100644
--- a/hail/Makefile
+++ b/hail/Makefile
@@ -349,7 +349,7 @@ install: $(WHEEL)
        hailctl config set query/backend spark

 .PHONY: install-on-cluster
-install-on-cluster: $(WHEEL) check-pip-lockfile
+install-on-cluster: $(WHEEL)
        sed '/^pyspark/d' python/pinned-requirements.txt | grep -v -e '^[[:space:]]*#' -e '^$$' | tr '\n' '\0' | xargs -0 $(PIP) install -U
        -$(PIP) uninstall -y hail
        $(PIP) install $(WHEEL) --no-deps
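
The suspected PATH difference can be checked directly. The sketch below is not part of the Hail build; resolves_under_path is a hypothetical helper name, and the PATH value is copied from the "+ PATH=..." trace line in the log above. It reports whether a command can be found when only those directories are searched. The log shows pip-tools installed under /usr/local/lib/python3.8/site-packages, so the pip-compile script probably lives in /usr/local/bin (an assumption, not confirmed in the thread), which is absent from that PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if command $1 can be found when
# only the directories in $2 are searched.
resolves_under_path() {
    local cmd="$1" path_value="$2"
    # The subshell keeps the PATH override from leaking into the caller.
    ( PATH="$path_value"; command -v "$cmd" >/dev/null 2>&1 )
}

# PATH copied from the "+ PATH=..." trace line in the log above.
sudo_path="/sbin:/bin:/usr/sbin:/usr/bin:/root/.local/bin"

if resolves_under_path pip-compile "$sudo_path"; then
    echo "pip-compile is visible under the sudo PATH"
else
    echo "pip-compile is NOT visible under the sudo PATH"
fi
```

If the check reports that pip-compile is not visible, a common workaround (not suggested in this thread and untested here) is to pass the caller's PATH through sudo, for example sudo env "PATH=$PATH" make install-on-cluster with the same variables as before.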