issues
google-research
/
t5x
Apache License 2.0
2.65k
stars
301
forks
Rename `is_supported_empty_aggregation_type` and `is_supported_aggregation_type` functions.
#1595
copybara-service[bot]
closed
3 weeks ago
0
Remove external direct references to `pytree_checkpoint_handler`.
#1594
copybara-service[bot]
closed
1 month ago
0
Move all `CheckpointHandler`s and associated code to `_src/handlers/`. A `handlers.py` file is added to define publicly available symbols.
#1593
copybara-service[bot]
closed
3 weeks ago
0
Remove external references to `pytree_checkpoint_handler`.
#1592
copybara-service[bot]
closed
3 weeks ago
0
`aggregate` option cleanup.
#1591
copybara-service[bot]
closed
1 month ago
0
Fix typo.
#1590
copybara-service[bot]
closed
1 month ago
0
Generate tokenizer-specific warmup examples during variable-length export.
#1589
copybara-service[bot]
closed
1 month ago
1
Split out a thin version of checkpoints.py that does not depend on JAX
#1588
copybara-service[bot]
closed
1 month ago
1
Allow generating warmup data for any tokenizer type.
#1587
copybara-service[bot]
closed
1 month ago
1
Fixing Issue #1353
#1586
copybara-service[bot]
closed
2 months ago
0
Who's where about street life Gun's go boom yo you needed fight for your love and Self-centered mind Jugger notif ?
#1585
Kenshin301
opened
2 months ago
3
[numpy] Fix users of NumPy APIs that are removed in NumPy 2.0.
#1584
copybara-service[bot]
closed
2 months ago
0
remove inlined jax.nn.initializers definitions, resolving TODO of levskaya et al
#1583
copybara-service[bot]
closed
2 months ago
0
Upgrade to python:3.10
#1582
copybara-service[bot]
closed
2 months ago
0
add `wait_until_finished` under checkpoint_manager to fix issue when checkpoint was not completed.
#1581
copybara-service[bot]
opened
2 months ago
0
Stop writing msgpack file for new checkpoints and update empty nodes handling so that it no longer depends on this file.
#1580
copybara-service[bot]
closed
2 months ago
0
When training, many libraries are incompatible due to version issues
#1579
llloopy
opened
2 months ago
0
Sort devices by their implicit order instead of explicitly by id. IDs may be randomly generated, so it's better to rely on the implicit order, which is currently based on (process index, id).
#1578
copybara-service[bot]
opened
2 months ago
1
Stop writing msgpack file for new checkpoints and update empty nodes handling so that it no longer depends on this file.
#1576
copybara-service[bot]
closed
2 months ago
0
OrbaxCheckpointer Error in Distributed Training
#1575
theyorubayesian
closed
2 months ago
1
Prevent divide by zero error in loss computation.
#1574
copybara-service[bot]
closed
3 months ago
1
Getting repetitions after pre-training
#1573
tonyv
opened
3 months ago
0
Set `aggregate=False` under t5x.
#1572
copybara-service[bot]
closed
3 months ago
0
Revert `get_local_data` to original implementation.
#1571
copybara-service[bot]
opened
3 months ago
0
Set `aggregate` to False under SaveArgs within t5x.
#1570
copybara-service[bot]
closed
3 months ago
0
Stop writing msgpack file for new checkpoints and update empty nodes handling so that it no longer depends on this file.
#1569
copybara-service[bot]
closed
3 months ago
0
Fix bug in T5X CheckpointManager argument propagation when SaveBestCheckpoint is set through gin.
#1568
copybara-service[bot]
opened
3 months ago
1
Qlora with flan t5 issue - ValueError: Trying to set a tensor of shape torch.Size([4096, 4096])
#1567
JhonDan1999
opened
4 months ago
0
Propagate metric_name_to_monitor to Orbax CheckpointManager.
#1566
copybara-service[bot]
opened
4 months ago
0
remove dependency on old protobuf
#1565
copybara-service[bot]
closed
4 months ago
0
Make OrbaxCheckpointManagerInterface gin.configurable.
#1564
copybara-service[bot]
closed
4 months ago
0
depend on orbax-checkpoint >= 0.5
#1563
copybara-service[bot]
closed
4 months ago
0
internal
#1562
copybara-service[bot]
closed
4 months ago
0
Set `aggregate` to False under SaveArgs within t5x.
#1561
copybara-service[bot]
closed
3 months ago
0
Enable ocdbt in t5x by default.
#1560
copybara-service[bot]
closed
2 months ago
0
Fix sync_global_devices issue.
#1559
copybara-service[bot]
closed
4 months ago
0
End support for legacy Pax formats with directory structure:
#1558
copybara-service[bot]
closed
4 months ago
0
Replace deprecated `jax.tree_*` functions with `jax.tree.*`
#1557
copybara-service[bot]
closed
4 months ago
0
Delegate to BasePyTreeCheckpointHandler rather than inheriting from it.
#1556
copybara-service[bot]
closed
4 months ago
0
Fix UnboundLocalError: local variable 'checkpoint_manager' referenced before assignment.
#1555
copybara-service[bot]
closed
4 months ago
0
Replace deprecated `jax.tree_*` functions with `jax.tree.*`
#1554
copybara-service[bot]
closed
4 months ago
0
madlad often hallucinates while translating
#1553
dumengnan
opened
4 months ago
0
Make CheckpointManagerConstructor gin-configurable.
#1552
copybara-service[bot]
closed
5 months ago
0
change checkpointing to be compatible with new serialization
#1551
copybara-service[bot]
closed
5 months ago
1
Error trying to import t5x
#1550
osamja
closed
4 months ago
5
Fixes typo in the documentation for `predict_batch_with_aux`.
#1549
copybara-service[bot]
closed
5 months ago
0
Increase t5x AsyncCheckpointer timeout to 600s.
#1548
copybara-service[bot]
closed
5 months ago
0
Use newer Orbax CheckpointManager API in T5X. This change is expected to be a no-op. Also removes a test case that is already covered in Orbax core.
#1547
copybara-service[bot]
closed
5 months ago
0
[train.py] UnboundLocalError: local variable 'checkpoint_manager' referenced before assignment
#1546
gpupuck
closed
4 months ago
2
Add ability to use checkpoints with non-standard directory name prefix.
#1545
copybara-service[bot]
closed
5 months ago
1