google/orbax
Orbax provides common checkpointing and persistence utilities for JAX users
https://orbax.readthedocs.io/
Apache License 2.0
304 stars
36 forks
issues
Internal
#1352
copybara-service[bot]
closed
6 days ago
1
Internal change.
#1351
copybara-service[bot]
closed
1 week ago
0
Internal
#1350
copybara-service[bot]
opened
1 week ago
0
Add support for legacy metadata file path in CheckpointManager.
#1349
copybara-service[bot]
closed
1 week ago
0
Internal change.
#1348
copybara-service[bot]
closed
1 week ago
0
Consider NamedTuples as concrete class and never empty.
#1347
copybara-service[bot]
closed
5 days ago
0
Refactor and restructure constructs from `type_handlers.py` and `metadata`
#1346
copybara-service[bot]
closed
5 days ago
0
[Experimental Feature] Support `NamedTuple` and `Tuple` nodes in PyTree metadata.
#1345
copybara-service[bot]
closed
5 days ago
0
Wrap `tree.utils.serialize_tree` -> `metadata.tree.serialize_tree` to allow `NamedTuple` and `Tuple` node types in PyTree metadata.
#1344
copybara-service[bot]
closed
1 week ago
0
Move `Checkpointer` implementations to `_src`.
#1343
copybara-service[bot]
closed
6 days ago
0
Checking changes back in with a bug fix.
#1341
copybara-service[bot]
closed
1 week ago
1
Add chex as an optional testing dependency to unbreak RTD. Unclear why this has not previously been a problem.
#1340
copybara-service[bot]
closed
1 week ago
0
[emergency checkpoint] Add `ReplicatorCheckpointManager` implementation for interoperating with replicator service provided by GKE (or theoretically, any other similar service).
#1339
copybara-service[bot]
closed
3 days ago
0
Internal change.
#1338
copybara-service[bot]
closed
1 week ago
0
Orbax Checkpointing and Flax.NNX require hacking to work together
#1337
hdrwilkinson
opened
1 week ago
1
Rolling back previous changes due to a breakage.
#1336
copybara-service[bot]
closed
1 week ago
1
Add user-facing Root/StepMetadata classes and internal serialization modules.
#1335
copybara-service[bot]
closed
4 days ago
0
Updated Version
#1334
copybara-service[bot]
closed
1 week ago
1
Remove locking and its dependencies.
#1333
copybara-service[bot]
closed
1 week ago
0
Break out mesh construction and process ID metadata utils into a separate file.
#1331
copybara-service[bot]
closed
1 week ago
0
Submission of https://github.com/google/orbax/pull/1319. All credit and thanks to https://github.com/gspschmid.
#1330
copybara-service[bot]
closed
1 week ago
0
Add Layout support
#1329
copybara-service[bot]
closed
1 week ago
0
Add MultiProcessTest base.
#1328
copybara-service[bot]
opened
1 week ago
0
Fix device to HBM mapping for Trillium.
#1327
copybara-service[bot]
closed
1 week ago
0
Custom TypeHandler and "No per-process OCDBT checkpoint subdirs" warning
#1326
PhilipVinc
opened
1 week ago
2
Internal change.
#1325
copybara-service[bot]
closed
1 week ago
0
Manual submit of https://github.com/google/orbax/pull/1323.
#1324
copybara-service[bot]
closed
1 week ago
0
Resubmit single array error message
#1323
cpgaffney1
opened
1 week ago
0
Resubmit of PR1304.
#1322
cpgaffney1
closed
1 week ago
0
Refactor metadata/tree_test.py and move common test types to `test_tree_utils.py` for better reusability.
#1321
copybara-service[bot]
closed
1 week ago
0
[replica-parallel] Add replica-parallel saving
#1320
gspschmid
closed
5 days ago
10
[replica-parallel] Add replica slices concept
#1319
gspschmid
closed
1 week ago
4
Fix readthedoc build issue
#1318
copybara-service[bot]
closed
2 weeks ago
0
Add validation to prevent loading an array index that was never written to.
#1317
copybara-service[bot]
closed
4 days ago
0
Add a distinct shape/dtype struct type for Numpy arrays.
#1316
copybara-service[bot]
closed
1 week ago
0
Refactor `*pytree*handler.py` to improve readability of metadata/tree api usage.
#1315
copybara-service[bot]
closed
2 weeks ago
0
Latest orbax-export release is incompatible with latest orbax-checkpoint release
#1314
jakevdp
opened
2 weeks ago
1
Refactor metadata/tree_test.py to make var names accurate.
#1313
copybara-service[bot]
closed
2 weeks ago
0
internal change
#1312
copybara-service[bot]
opened
2 weeks ago
0
Internal
#1311
copybara-service[bot]
closed
2 weeks ago
0
Fix local restore by re-mapping device ids directly instead of inferring them from how process indexes changed across restarts with some false assumptions.
#1310
copybara-service[bot]
opened
2 weeks ago
1
_validate_params fails on zero-sized arrays
#1309
hrbigelow
opened
2 weeks ago
1
Fix Orbax readthedoc build
#1308
copybara-service[bot]
closed
2 weeks ago
0
Fix mesh construction by remapping device IDs.
#1307
copybara-service[bot]
closed
2 weeks ago
0
Emergency checkpoint: compile broadcast function once at init.
#1306
copybara-service[bot]
closed
2 weeks ago
0
Restoring flax model checkpoints using orbax throws ValueError
#1305
ybangaru
opened
2 weeks ago
3
provide useful error message for single arrays
#1304
garymm
closed
2 weeks ago
1
Emergency checkpoint: use JAX for global_max and combine multiple broadcasts
#1303
copybara-service[bot]
closed
2 weeks ago
0
Introduce `CheckpointManagerOptions.should_keep_fn` as an alternative to `CheckpointManagerOptions.keep_period`.
#1302
copybara-service[bot]
closed
2 weeks ago
0
Fix emergency checkpointing issue arising when repeatedly restoring from a local checkpoint. Process ID remapping may happen in different ways on subsequent times, so we need to ensure new process metadata is saved with every checkpoint, so we can recover the process ID mapping used to save that checkpoint.
#1301
copybara-service[bot]
closed
1 week ago
0