Open danielhundhausen opened 9 months ago
(Below are rough notes for fixing the problem.)
The fsspec library shows up in surprising ways, so it should probably become a strict dependency. However, fsspec uses the same trick of "being a lightweight dependency" by requesting other modules as needed; otherwise, it would depend on every remote-protocol library in the universe. If someone tries to call ak.from_json with an s3:// URI, they'll first be asked to install fsspec and then, separately, they'll be asked to install s3fs, which would be annoying.
So as a policy decision, let's make fsspec a strict dependency, so that users only get a request to install things once. (Unless they're using ak.to_parquet like @danielhundhausen and get asked to install pyarrow and s3fs or whatever. Sorry!)
The other runtime dependencies should remain runtime dependencies, since they only affect small sets of functions in a logical way (ak.to_arrow requires pyarrow, etc.). @agoose77 and I came up with a way to include this information in the ak._dispatch.high_level_function decorator so that it can be added to the documentation and tested for upfront in a way that only specifies the information (which functions depend on which libraries) in one place.
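
To make the "one place" idea concrete, here is a minimal sketch. It is not the actual ak._dispatch.high_level_function interface: the `requires` decorator, the `OPTIONAL_REQUIREMENTS` registry, and the `to_text` function are all hypothetical names invented for illustration, and the stdlib `json` module stands in for a real optional library such as pyarrow.

```python
import functools
import importlib.util

# Hypothetical registry: the single place recording which function
# depends on which optional libraries.
OPTIONAL_REQUIREMENTS = {}


def requires(*libraries):
    """Hypothetical decorator: record and check a function's optional libraries."""

    def decorator(func):
        OPTIONAL_REQUIREMENTS[func.__name__] = libraries

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Check everything upfront so the user sees one message listing
            # all missing libraries, not one message per library.
            missing = [lib for lib in libraries if importlib.util.find_spec(lib) is None]
            if missing:
                raise ImportError(
                    f"{func.__name__} requires these libraries: {', '.join(missing)}"
                )
            return func(*args, **kwargs)

        # The same record feeds the documentation.
        note = "Requires: " + ", ".join(libraries)
        wrapper.__doc__ = ((func.__doc__ or "") + "\n\n" + note).strip()
        return wrapper

    return decorator


@requires("json")  # stdlib module used as a stand-in for pyarrow, etc.
def to_text(data):
    """Serialize data to a string."""
    import json

    return json.dumps(data)
```

Because the requirement is declared once, a test suite could loop over `OPTIONAL_REQUIREMENTS` and a documentation build could read the appended "Requires:" note, without the list of libraries being repeated anywhere else.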
From a grep, below are all of the non-stdlib, non-dependency imports in src/awkward (some have import_* helper functions). We'll be able to take fsspec off the list when it becomes a strict dependency for Awkward. (Targeting version 2.6.0 on February 1, 2024, since a new strict dependency needs a new minor version.) Some of the dependencies aren't confined to one ak.* function because they're used to implement something like a backend or for passing data into cppyy or Numba, which can only happen if you've already imported cppyy or Numba. Some of these imports are through helper functions (import_*) that provide the "you need version x.y.z" error message.
- ak.Array.cpptype, ak.Array.__cast_cpp__, ak.from_rdataframe, ak.to_rdataframe
- import_cupy: backend="cuda"
- import_fsspec: ak.from_json, ak.to_json for non-local URIs as filenames, ak.from_parquet
- import_jax: backend="jax"
- _import_numexpr: awkward.numexpr is (currently) dead code, may be re-added someday
- import_pyarrow, import_pyarrow_compute, import_pyarrow_parquet: ak.to_arrow, ak.to_arrow_table, ak.from_arrow, ak.to_feather, ak.from_feather, ak.to_parquet, ak.from_parquet
- ak.Array, inside if hasattr(builtins, "__IPYTHON__"), intended to protect properties from IPython's meddling

Unique grep results, indicating which files they were found in:
src/awkward/_connect/cuda/__init__.py: import cupy
src/awkward/_connect/jax/__init__.py:import jax.numpy
src/awkward/_connect/jax/reducers.py:import jax
src/awkward/_connect/jax/trees.py:import jax
src/awkward/_connect/numba/arrayview_cuda.py:from numba.core.errors import NumbaTypeError
src/awkward/_connect/numba/arrayview_cuda.py:import numba
src/awkward/_connect/numba/arrayview.py:from numba.core.errors import NumbaTypeError
src/awkward/_connect/numba/arrayview.py:import numba
src/awkward/_connect/numba/arrayview.py:import numba.core.typing
src/awkward/_connect/numba/arrayview.py:import numba.core.typing.ctypes_utils
src/awkward/_connect/numba/builder.py:from numba.core.errors import NumbaTypeError
src/awkward/_connect/numba/builder.py: import llvmlite.ir.types
src/awkward/_connect/numba/builder.py:import numba
src/awkward/_connect/numba/builder.py:import numba.core.typing
src/awkward/_connect/numba/builder.py:import numba.core.typing.ctypes_utils
src/awkward/_connect/numba/growablebuffer.py:import numba
src/awkward/_connect/numba/growablebuffer.py:import numba.core.typing.npydecl
src/awkward/_connect/numba/layoutbuilder.py:from numba.core.errors import NumbaTypeError
src/awkward/_connect/numba/layoutbuilder.py:import numba
src/awkward/_connect/numba/layoutbuilder.py:import numba.core.typing.npydecl
src/awkward/_connect/numba/layout.py:from numba.core.errors import NumbaTypeError, NumbaValueError
src/awkward/_connect/numba/layout.py:import llvmlite.ir
src/awkward/_connect/numba/layout.py: import llvmlite.ir.types
src/awkward/_connect/numba/layout.py:import numba
src/awkward/_connect/numexpr.py: import numexpr
src/awkward/_connect/pyarrow.py: import fsspec
src/awkward/_connect/pyarrow.py: import pyarrow
src/awkward/_connect/pyarrow.py: import pyarrow.compute as out
src/awkward/_connect/pyarrow.py: import pyarrow.parquet as out
src/awkward/_connect/rdataframe/from_rdataframe.py:import cppyy
src/awkward/_connect/rdataframe/from_rdataframe.py:import ROOT
src/awkward/_connect/rdataframe/to_rdataframe.py:import ROOT
src/awkward/cppyy.py: import cppyy
src/awkward/highlevel.py: from IPython.utils.wildcard import dict_dir
src/awkward/highlevel.py: import cppyy
src/awkward/highlevel.py: import numba
src/awkward/jax.py: import jax # noqa: TID251
src/awkward/jax.py: import jax # noqa: TID251, F401
src/awkward/numba/__init__.py: import numba
src/awkward/numba/__init__.py: import numba
src/awkward/numba/layoutbuilder.py: import numba
src/awkward/operations/ak_from_feather.py: import pyarrow.feather
src/awkward/operations/ak_from_json.py: import fsspec
src/awkward/operations/ak_from_parquet.py: import fsspec.parquet
src/awkward/operations/ak_from_parquet.py: import pyarrow.parquet as pyarrow_parquet
src/awkward/operations/ak_to_dataframe.py: import pandas
src/awkward/operations/ak_to_feather.py: import pyarrow.feather
src/awkward/operations/ak_to_json.py: import fsspec
src/awkward/types/_awkward_datashape_parser.py: from .lexer import Token
src/awkward/_backends/cupy.py: cupy = cuda.import_cupy("Awkward Arrays with CUDA")
src/awkward/_connect/cuda/__init__.py: cupy = import_cupy("Awkward Arrays with CUDA")
src/awkward/_connect/cuda/__init__.py:def import_cupy(name="Awkward Arrays with CUDA"):
src/awkward/_connect/cuda/_kernel_signatures.py:cupy = import_cupy("Awkward Arrays with CUDA")
src/awkward/_connect/cuda/_kernel_signatures.py:from awkward._connect.cuda import import_cupy
src/awkward/_connect/numexpr.py: numexpr = _import_numexpr()
src/awkward/_connect/pyarrow.py: import_pyarrow_parquet(name)
src/awkward/contents/content.py: pyarrow = awkward._connect.pyarrow.import_pyarrow("to_arrow")
src/awkward/jax.py:def import_jax():
src/awkward/_kernels.py: cupy = ak_cuda.import_cupy("Awkward Arrays with CUDA")
src/awkward/_nplikes/cupy.py: self._module = ak._connect.cuda.import_cupy("Awkward Arrays with CUDA")
src/awkward/_nplikes/jax.py: jax = ak.jax.import_jax()
src/awkward/operations/ak_from_parquet.py: pyarrow_parquet = awkward._connect.pyarrow.import_pyarrow_parquet("ak.from_parquet")
src/awkward/operations/ak_to_parquet.py: fsspec = awkward._connect.pyarrow.import_fsspec("ak.to_parquet")
src/awkward/operations/ak_to_parquet.py: pyarrow_parquet = awkward._connect.pyarrow.import_pyarrow_parquet("ak.to_parquet")
src/awkward/operations/str/akstr_capitalize.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_capitalize.py: pc = import_pyarrow_compute("ak.str.capitalize")
src/awkward/operations/str/akstr_center.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_center.py: pc = import_pyarrow_compute("r")
src/awkward/operations/str/akstr_count_substring.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_count_substring.py: pc = import_pyarrow_compute("ak.str.count_substring")
src/awkward/operations/str/akstr_count_substring_regex.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_count_substring_regex.py: pc = import_pyarrow_compute("ak.str.count_substring_regex")
src/awkward/operations/str/akstr_ends_with.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_ends_with.py: pc = import_pyarrow_compute("h")
src/awkward/operations/str/akstr_extract_regex.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_extract_regex.py: pc = import_pyarrow_compute("x")
src/awkward/operations/str/akstr_find_substring.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_find_substring.py: pc = import_pyarrow_compute("ak.str.find_substring")
src/awkward/operations/str/akstr_find_substring_regex.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_find_substring_regex.py: pc = import_pyarrow_compute("ak.str.find_substring_regex")
src/awkward/operations/str/akstr_index_in.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_index_in.py: pc = import_pyarrow_compute("ak.str.index_in")
src/awkward/operations/str/akstr_is_alnum.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_is_alnum.py: pc = import_pyarrow_compute("m")
src/awkward/operations/str/akstr_is_alpha.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_is_alpha.py: pc = import_pyarrow_compute("a")
src/awkward/operations/str/akstr_is_ascii.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_is_ascii.py: pc = import_pyarrow_compute("i")
src/awkward/operations/str/akstr_is_decimal.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_is_decimal.py: pc = import_pyarrow_compute("l")
src/awkward/operations/str/akstr_is_digit.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_is_digit.py: pc = import_pyarrow_compute("t")
src/awkward/operations/str/akstr_is_in.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_is_in.py: pc = import_pyarrow_compute("ak.str.is_in")
src/awkward/operations/str/akstr_is_lower.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_is_lower.py: pc = import_pyarrow_compute("r")
src/awkward/operations/str/akstr_is_numeric.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_is_numeric.py: pc = import_pyarrow_compute("c")
src/awkward/operations/str/akstr_is_printable.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_is_printable.py: pc = import_pyarrow_compute("e")
src/awkward/operations/str/akstr_is_space.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_is_space.py: pc = import_pyarrow_compute("e")
src/awkward/operations/str/akstr_is_title.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_is_title.py: pc = import_pyarrow_compute("e")
src/awkward/operations/str/akstr_is_upper.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_is_upper.py: pc = import_pyarrow_compute("r")
src/awkward/operations/str/akstr_join_element_wise.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_join_element_wise.py: pc = import_pyarrow_compute("ak.str.join_element_wise")
src/awkward/operations/str/akstr_join.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_join.py: pc = import_pyarrow_compute("ak.str.join")
src/awkward/operations/str/akstr_length.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_length.py: pc = import_pyarrow_compute("h")
src/awkward/operations/str/akstr_lower.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_lower.py: pc = import_pyarrow_compute("r")
src/awkward/operations/str/akstr_lpad.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_lpad.py: pc = import_pyarrow_compute("d")
src/awkward/operations/str/akstr_ltrim.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_ltrim.py: pc = import_pyarrow_compute("m")
src/awkward/operations/str/akstr_ltrim_whitespace.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_ltrim_whitespace.py: pc = import_pyarrow_compute("e")
src/awkward/operations/str/akstr_match_like.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_match_like.py: pc = import_pyarrow_compute("ak.str.match_like")
src/awkward/operations/str/akstr_match_substring.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_match_substring.py: pc = import_pyarrow_compute("ak.str.match_substring")
src/awkward/operations/str/akstr_match_substring_regex.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_match_substring_regex.py: pc = import_pyarrow_compute("x")
src/awkward/operations/str/akstr_repeat.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_repeat.py: pc = import_pyarrow_compute("ak.str.repeat")
src/awkward/operations/str/akstr_replace_slice.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_replace_slice.py: pc = import_pyarrow_compute("e")
src/awkward/operations/str/akstr_replace_substring.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_replace_substring.py: pc = import_pyarrow_compute("g")
src/awkward/operations/str/akstr_replace_substring_regex.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_replace_substring_regex.py: pc = import_pyarrow_compute("x")
src/awkward/operations/str/akstr_reverse.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_reverse.py: pc = import_pyarrow_compute("e")
src/awkward/operations/str/akstr_rpad.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_rpad.py: pc = import_pyarrow_compute("d")
src/awkward/operations/str/akstr_rtrim.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_rtrim.py: pc = import_pyarrow_compute("m")
src/awkward/operations/str/akstr_rtrim_whitespace.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_rtrim_whitespace.py: pc = import_pyarrow_compute("e")
src/awkward/operations/str/akstr_slice.py: pc = import_pyarrow_compute("ak.str.slice")
src/awkward/operations/str/akstr_split_pattern.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_split_pattern.py: pc = import_pyarrow_compute("ak.str.split_pattern")
src/awkward/operations/str/akstr_split_pattern_regex.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_split_pattern_regex.py: pc = import_pyarrow_compute("ak.str.split_pattern_regex")
src/awkward/operations/str/akstr_split_whitespace.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_split_whitespace.py: pc = import_pyarrow_compute("ak.str.split_whitespace")
src/awkward/operations/str/akstr_starts_with.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_starts_with.py: pc = import_pyarrow_compute("ak.str.starts_with")
src/awkward/operations/str/akstr_swapcase.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_swapcase.py: pc = import_pyarrow_compute("e")
src/awkward/operations/str/akstr_title.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_title.py: pc = import_pyarrow_compute("e")
src/awkward/operations/str/akstr_to_categorical.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_to_categorical.py: pc = import_pyarrow_compute("ak.str.to_categorical")
src/awkward/operations/str/akstr_trim.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_trim.py: pc = import_pyarrow_compute("m")
src/awkward/operations/str/akstr_trim_whitespace.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_trim_whitespace.py: pc = import_pyarrow_compute("e")
src/awkward/operations/str/akstr_upper.py: from awkward._connect.pyarrow import import_pyarrow_compute
src/awkward/operations/str/akstr_upper.py: pc = import_pyarrow_compute("r")
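
A grep like the one above catches text, not syntax, which is why it also picked up a relative import (from .lexer import Token). As a rough alternative, the sketch below finds the same kind of imports with the standard-library ast module. It is an illustration only: the function name `third_party_imports` and its `ignore` parameter are hypothetical, and it relies on `sys.stdlib_module_names`, which exists only in Python 3.10 and later (on older versions the stdlib set is empty, so stdlib modules would be reported too).

```python
import ast
import sys


def third_party_imports(source, ignore=("awkward",)):
    """Return sorted top-level names of non-stdlib, non-relative imports in source.

    `ignore` lists packages to skip, such as the package itself and any
    strict dependencies.
    """
    stdlib = getattr(sys, "stdlib_module_names", frozenset())  # Python 3.10+
    found = set()
    # ast.walk visits nested nodes too, so imports inside functions are seen.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]  # level == 0 excludes relative imports
        else:
            continue
        for name in names:
            top = name.split(".")[0]
            if top not in stdlib and top not in ignore:
                found.add(top)
    return sorted(found)


example = '''
import os
import awkward
from .lexer import Token

def to_parquet():
    import pyarrow.parquet as pyarrow_parquet
    from numba.core.errors import NumbaTypeError
'''

print(third_party_imports(example))
```

Running this over every file under src/awkward (for instance with `pathlib.Path.rglob("*.py")`) would give a per-file list comparable to the grep output, minus the relative imports.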
Which documentation?
Other (please explain)?
What needs to be documented?
When setting up a fresh installation of awkward@2.5.2 (and earlier) and trying to run a minimal example, the dependencies pyarrow, fsspec and pandas have to be installed by hand first for the example to work. As far as I can see, this is not in the documentation. If this is the desired behaviour, to keep the footprint of awkward small if the user does not want to use this function, I suggest adding a section to the docs explaining the necessary dependencies. If this is not the desired behaviour, it would be convenient to add the mentioned dependencies to the pyproject.toml.
Thanks for considering!