google / orbax

Orbax provides common checkpointing and persistence utilities for JAX users
https://orbax.readthedocs.io/
Apache License 2.0

Dependency array-record Not Available on ARM64 Architecture #607

Closed BeeGass closed 11 months ago

BeeGass commented 11 months ago

Issue

I am currently using Orbax for checkpointing in my project, which is being developed on an ARM64 architecture (Apple Silicon). Orbax has a transitive dependency on array-record through etils. However, array-record does not provide a version compatible with ARM64, leading to installation issues when using Poetry.

Environment

- Orbax version: 0.1.9
- Python version: 3.9.13
- Operating System: macOS on ARM64 (Apple Silicon)
- Dependency management: Poetry
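For anyone reproducing this, the values above can be collected with the standard library. This is a minimal sketch: the `environment_report` helper and its default package list are illustrative names, not part of Orbax or Poetry. `platform.machine()` is the value that the PEP 508 `platform_machine` marker is defined from, and on Apple Silicon it typically reports `arm64`.

```python
import platform
from importlib import metadata


def environment_report(packages=("orbax", "array-record", "etils")):
    """Collect interpreter, OS, and package version details for a bug report.

    The default package names are just the ones discussed in this issue.
    """
    report = {
        "python": platform.python_version(),
        "system": platform.system(),    # e.g. "Darwin" on macOS
        "machine": platform.machine(),  # CPU architecture string
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is absent from the current environment.
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```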

Steps to Reproduce

  1. Include orbax as a dependency in a Python project managed with Poetry.
  2. Run poetry install on an ARM64 Mac.

    
    • Installing array-record (0.5.0): Failed
    
    RuntimeError
    
    Unable to find installation candidates for array-record (0.5.0)
    
    at ~/Library/Application Support/pypoetry/venv/lib/python3.8/site-packages/poetry/installation/chooser.py:73 in choose_for
       69│
       70│             links.append(link)
       71│
       72│         if not links:
    →  73│             raise RuntimeError(f"Unable to find installation candidates for {package}")
       74│
       75│         # Get the best link
       76│         chosen = max(links, key=lambda link: self._sort_key(package, link))
       77│

Cannot install array-record.

  3. Observe that the installation fails due to array-record not being available for ARM64.

## Connection between Orbax and array-record
```bash
> poetry show array-record --tree
array-record 0.5.0 A file format that achieves a new frontier of IO efficiency
├── absl-py *
└── etils *
    ├── absl-py *
    ├── fsspec *
    ├── importlib-resources *
    │   └── zipp >=3.1.0
    ├── numpy *
    ├── tqdm *
    │   └── colorama *
    ├── typing-extensions *
    └── zipp * (circular dependency aborted here)
> poetry show orbax --tree
orbax 0.1.9 Orbax
└── orbax-checkpoint >=0.1.8
    ├── absl-py *
    ├── etils *
    │   ├── absl-py * (circular dependency aborted here)
    │   ├── fsspec *
    │   ├── importlib-resources *
    │   │   └── zipp >=3.1.0
    │   ├── numpy *
    │   ├── tqdm *
    │   │   └── colorama *
    │   ├── typing-extensions *
    │   └── zipp * (circular dependency aborted here)
    ├── jax >=0.4.9
    │   ├── importlib-metadata >=4.6
    │   │   └── zipp >=0.5 (circular dependency aborted here)
    │   ├── ml-dtypes >=0.1.0
    │   │   ├── numpy >=1.21.2 (circular dependency aborted here)
    │   │   └── numpy >1.20 (circular dependency aborted here)
    │   ├── numpy >=1.21 (circular dependency aborted here)
    │   ├── opt-einsum *
    │   │   └── numpy >=1.7 (circular dependency aborted here)
    │   └── scipy >=1.7
    │       └── numpy >=1.21.6,<1.28.0 (circular dependency aborted here)
    ├── jaxlib *
    │   ├── ml-dtypes >=0.1.0 (circular dependency aborted here)
    │   ├── numpy >=1.21 (circular dependency aborted here)
    │   └── scipy >=1.7 (circular dependency aborted here)
    ├── msgpack *
    ├── nest-asyncio *
    ├── numpy * (circular dependency aborted here)
    ├── protobuf *
    ├── pyyaml *
    ├── tensorstore >=0.1.35
    │   └── numpy >=1.16.0 (circular dependency aborted here)
    └── typing-extensions * (circular dependency aborted here)
```

Extra

The issue lies within etils using array-record, and array-record doesn't have a distribution for the ARM64 architecture, so the install fails: https://pypi.org/project/array-record/#files
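One way to double-check which package actually pulls in array-record is to scan the metadata of installed distributions for a declared requirement on it. This is a rough sketch using only `importlib.metadata`; the helper names `requirement_name` and `find_requirers` are made up for illustration. It only sees distributions already installed in the current environment, so after a failed `poetry install` the result may be incomplete.

```python
import re
from importlib import metadata


def _canonical(name):
    """Normalize a project name so '-', '_' and '.' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement):
    """Extract the project name from a requirement string such as
    'array-record (>=0.5.0) ; extra == "x"' or 'etils[epath]'."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return _canonical(match.group(1)) if match else ""


def find_requirers(target):
    """Return names of installed distributions that declare `target`
    as a requirement (including optional extras)."""
    wanted = _canonical(target)
    found = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if not name:
            continue  # skip distributions with broken metadata
        for requirement in dist.requires or []:
            if requirement_name(requirement) == wanted:
                found.add(name)
    return sorted(found)


if __name__ == "__main__":
    print(find_requirers("array-record"))
```

If the list does not include etils or orbax-checkpoint, the requirement is coming from another package in the project.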

Poetry

Here is my .toml for reproducibility:

[tool.poetry.dependencies]
python = ">=3.9.0,<=3.9.18"
jaxtyping = "^0.2.11"

[tool.poetry.group.mltools]
optional = true

[tool.poetry.group.mltools.dependencies]
numpy = "^1.23.1"
scipy = "^1.9.0"
einops = "^0.5.0"
hydra-core = "^1.2.0"
omegaconf = "^2.2.3"
wandb = "^0.13.5"

[tool.poetry.group.dataset]
optional = true

[tool.poetry.group.dataset.dependencies]
tensorflow-macos = {version = "^2.12.0", platform = "darwin"}
tensorflow-datasets = "^4.7.0"

[tool.poetry.group.torch]
optional = true

[tool.poetry.group.torch.dependencies]
torch = "^1.13.1"
torchvision = "^0.14.1"
functorch = "^1.13.1"

[tool.poetry.group.jax]
optional = true

[tool.poetry.group.jax.dependencies]
jax-metal = { version = "^0.0.4", markers = "platform_machine == 'arm64'" }
flax = "^0.5.2"
optax = "^0.1.3"
orbax = "^0.1.9"

[tool.poetry.group.jupyter]
optional = true

[tool.poetry.group.jupyter.dependencies]
notebook = "^6.4.12"
jupyter = "^1.0.0"
ipykernel = "^6.15.1"
ipython = "^8.4.0"
requests = "^2.31.0"

[tool.poetry.group.additional]
optional = true

[tool.poetry.group.additional.dependencies]
black = {extras = ["jupyter"], version = "^22.6.0"}
pre-commit = "^2.20.0"
pytest = "^7.1.3"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

BeeGass commented 11 months ago

I understand that this may be more appropriate over at the etils codebase, but I was wondering how essential etils is. Would it be enough to use the built-in pathlib library and call Path() wherever a directory is needed?
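To make the question concrete, here is a sketch of what plain pathlib handling of a local checkpoint directory could look like. The `step_XXXXXXXX` layout and both helper names are hypothetical and are not how Orbax organizes checkpoints. As far as I understand, etils' path type also covers remote locations such as `gs://` buckets, which pathlib alone does not, so this would only cover the local-filesystem case.

```python
from pathlib import Path


def checkpoint_dir(root, step):
    """Create (if needed) and return a per-step directory under `root`.

    The 'step_<number>' naming is an assumption made for this example.
    """
    path = Path(root) / f"step_{step:08d}"
    path.mkdir(parents=True, exist_ok=True)
    return path


def latest_step(root):
    """Return the highest step number found under `root`, or None."""
    steps = []
    for entry in Path(root).glob("step_*"):
        suffix = entry.name[len("step_"):]
        if entry.is_dir() and suffix.isdigit():
            steps.append(int(suffix))
    return max(steps, default=None)


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        checkpoint_dir(tmp, 100)
        checkpoint_dir(tmp, 200)
        print(latest_step(tmp))
```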

BeeGass commented 11 months ago

I realized the issue wasn't with Orbax and opened an issue directly with array-record. Sorry for the misunderstanding. https://github.com/google/array_record/issues/85