tensorflow/mesh
Mesh TensorFlow: Model Parallelism Made Easier
Apache License 2.0
1.58k stars
254 forks
issues
Error while importing Meshtensorflow
#396
billygrahamram
closed
9 months ago
0
Migrate references and remove legacy target tpu:tpu_estimator.
#395
copybara-service[bot]
closed
10 months ago
0
Update attention.py
#394
sjw8793
opened
1 year ago
1
Optimizer momentums not properly populated training model with DTensors
#393
pentney
closed
1 year ago
1
AttributeError: module 'tensorflow.python.framework.ops' has no attribute 'register_tensor_conversion_function'
#392
Xnhyacinth
closed
1 year ago
4
Does load-balanced loss help the loss converge?
#391
mathfinder
opened
1 year ago
0
Move `convert_to_tensor`, `convert_to_tensor_v1`, `convert_to_tensor_v1_with_dispatch`, `convert_to_tensor_v2_with_dispatch`, and `convert_to_tensor_v2` into `tensor_conversion_registry`.
#389
copybara-service[bot]
opened
1 year ago
0
feat(ci): enable `pip` caching in CI
#388
SauravMaheshkar
closed
1 year ago
1
Remove legacy references from `ops.py`.
#387
copybara-service[bot]
closed
1 year ago
0
Remove legacy references from `ops.py`.
#386
copybara-service[bot]
closed
1 year ago
0
Fix docstring typos
#385
copybara-service[bot]
closed
2 years ago
1
Enable multi-file inference
#384
copybara-service[bot]
closed
2 years ago
1
When running BERT on GPU: Resource exhausted: failed to allocate memory
#383
Currycurrycurry
opened
2 years ago
1
Internal change
#382
copybara-service[bot]
closed
2 years ago
1
Make mesh_tensorflow's call of `get_replicated_var_handle` backward-compatible with tf <= 2.8.0. Fixes https://github.com/google-research/text-to-text-transfer-transformer/issues/1020.
#381
copybara-service[bot]
closed
2 years ago
0
bump version number to release updated PyPI package that includes last year's enhancements
#380
copybara-service[bot]
closed
2 years ago
0
Getting "NanLossDuringTrainingError: NaN loss during training."
#379
dhruval-p
opened
2 years ago
0
mask_1_flat and mask_2_flat applied to gates twice?
#378
marhlder
opened
2 years ago
0
Explicitly import estimator from tensorflow as a separate import instead of accessing it via tf.estimator and depend on the tensorflow estimator target.
#377
copybara-service[bot]
closed
2 years ago
0
Remove unused comments related to Python 2 compatibility.
#376
copybara-service[bot]
closed
2 years ago
0
Make TPU variable name deterministic.
#375
copybara-service[bot]
closed
2 years ago
0
Adding a new Gradient Estimator for Routing using REINFORCE with a leave-one-out baseline.
#374
copybara-service[bot]
opened
2 years ago
0
#HyperPrompt Part 2 of HyperPrompt implementation: the actual computation of HyperPrompt inside self-attention layer.
#373
copybara-service[bot]
closed
2 years ago
0
Use math.gcd instead of fractions.gcd; the latter is deprecated in Python 3.5 and removed in 3.9.
#372
copybara-service[bot]
closed
2 years ago
0
Split out optimizer call for internal purposes.
#371
copybara-service[bot]
closed
2 years ago
0
fix typo in logging statement.
#370
copybara-service[bot]
closed
2 years ago
0
About the mixture of expert model
#369
fym0503
opened
2 years ago
0
Mesh-tf model conversion to onnx?
#368
b-analyst
opened
2 years ago
2
Minor comment fix to refer to the correct argument name.
#367
copybara-service[bot]
opened
2 years ago
0
Make sure gates are not normalized for n=1 for top_n routing
#366
copybara-service[bot]
closed
2 years ago
3
Fix some example code in readme for einsum operation
#365
baragona
opened
2 years ago
2
How to freeze embedding layers
#364
lintangsutawika
opened
3 years ago
0
Add a link to the Primer paper
#363
copybara-service[bot]
closed
3 years ago
4
Beam search
#362
antonio-mastropaolo
opened
3 years ago
0
Output raw model outputs during eval
#361
craffel
opened
3 years ago
0
Add utility to save score predictions to TFRecords for scoring large datasets.
#360
copybara-service[bot]
closed
3 years ago
0
Save scores lazily.
#359
copybara-service[bot]
opened
3 years ago
0
Remove unnecessary name and cwise in squared relu.
#358
copybara-service[bot]
closed
3 years ago
0
Expert Attention Fixes:
#357
copybara-service[bot]
closed
3 years ago
3
Squared ReLU from Primer paper.
#356
copybara-service[bot]
closed
3 years ago
0
Internal
#355
copybara-service[bot]
closed
3 years ago
18
Remove dataset checkpoint policy override now that b/181765832 is resolved.
#354
copybara-service[bot]
closed
3 years ago
0
Add more extensive top-2 logging.
#353
copybara-service[bot]
closed
3 years ago
3
Ability to add Custom Tensorflow Hooks
#352
trisongz
opened
3 years ago
0
Only add z_loss to losses if during training.
#351
copybara-service[bot]
closed
3 years ago
3
Expert Attention Fixes:
#350
copybara-service[bot]
closed
3 years ago
3
Fix bug in shared_kv attention for autoregressive decoding.
#349
copybara-service[bot]
closed
3 years ago
2
Change second d_model_split dim's size to be the output shape, instead of input shape. This allows it to work for layers where the input size is different than the output size.
#348
copybara-service[bot]
closed
3 years ago
3
heterogeneous mixture of experts layer
#347
copybara-service[bot]
closed
3 years ago
5
Add more options to Experts Attention. These options remove 1/3 of the all2all communication costs:
#346
copybara-service[bot]
closed
3 years ago
2