-
### Describe the bug
Using the train_dreambooth.py script, when I add the flags for enabling xformers and set_grads_to_none, the following error occurs: train_dreambooth.py: error: unrecognized argum…
-
I'm interested in implementing the [CrossQ critic update](https://aditya.bhatts.org/CrossQ/), which only requires two Q networks and no target networks. This could speed up TDMPC2 a decent amount. A …
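A minimal sketch of the core idea, assuming nothing about the TDMPC2 codebase's actual API: CrossQ computes the TD target from the *live* twin critics instead of target copies (in the real method, current and next state-action pairs also pass through the network in one joint batch so BatchNorm sees both distributions). The critics below are toy callables for illustration only.

```python
# Hypothetical sketch of a CrossQ-style critic target: no target networks,
# the clipped double-Q target is computed from the live critics.

def crossq_td_target(q1, q2, reward, discount, next_state, next_action):
    """TD target using the live twin critics (no target-network copies)."""
    # Clipped double-Q: take the minimum of the two live critic estimates.
    next_q = min(q1(next_state, next_action), q2(next_state, next_action))
    return reward + discount * next_q

# Toy linear critics standing in for real Q networks: Q(s, a) = w_s*s + w_a*a.
q1 = lambda s, a: 0.5 * s + 0.1 * a
q2 = lambda s, a: 0.4 * s + 0.2 * a

target = crossq_td_target(q1, q2, reward=1.0, discount=0.99,
                          next_state=2.0, next_action=1.0)
```

In the real update this target would regress both critics with a stop-gradient on the target term; only the target computation is shown here.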
edwhu updated 3 months ago
-
I want to use gradients to monitor whether the model is training properly, like this.
I changed the `transformers.Trainer` https://github.com/huggingface/transformers/blob/main/src/transformers/train…
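A minimal sketch of the monitoring idea itself, independent of the Trainer internals (the function name and the dict-of-gradients shape here are illustrative, not the `transformers` API): after the backward pass, walk the parameter gradients and log a global L2 norm to check that training is healthy.

```python
import math

def global_grad_norm(named_grads):
    """Global L2 norm over all parameter gradients.

    `named_grads` maps a parameter name to its gradient as a flat list of
    floats -- a stand-in for iterating model.named_parameters() and reading
    each .grad after loss.backward().
    """
    total = 0.0
    for name, grad in named_grads.items():
        total += sum(g * g for g in grad)
    return math.sqrt(total)

grads = {"layer1.weight": [3.0, 4.0], "layer2.bias": [0.0]}
print(global_grad_norm(grads))  # 5.0
```

Logging this value each step (e.g. to the trainer's metrics) makes vanishing or exploding gradients visible early.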
-
**Feature request:** Add Optimistic Adam, an [optimistic](https://optax.readthedocs.io/en/latest/api/optimizers.html#optax.optimistic_gradient_descent) variant of [Adam](https://optax.readthedocs.io/e…
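For context, a hedged pure-Python sketch of the optimistic update rule the request builds on (this is not optax code, and the `alpha`/`beta` names here are illustrative): optimistic methods extrapolate using the previous gradient, and Optimistic Adam would apply the same extrapolation on top of Adam's moment estimates. Only the plain optimistic step is shown.

```python
# Optimistic gradient step (sketch):
#   x_{t+1} = x_t - lr * (alpha * g_t + beta * (g_t - g_prev))
# The beta term extrapolates with the change in gradient, which helps in
# adversarial / minimax settings such as GAN training.

def optimistic_step(x, g, g_prev, lr=0.1, alpha=1.0, beta=1.0):
    return x - lr * (alpha * g + beta * (g - g_prev))

x = 1.0
x = optimistic_step(x, g=0.5, g_prev=0.0)  # g_prev = 0 on the first step
x = optimistic_step(x, g=0.4, g_prev=0.5)
```

Combining this with Adam would mean applying the extrapolation to the bias-corrected moment estimates rather than the raw gradients.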
-
Hi,
Trying TTQ on ResNet-18 but getting a runtime error. I can't seem to find what the issue is:
/home/user2/Desktop/pttq/resnet_caltech/trained-ternary-quantization-master/utils/training.pyc in t…
-
Using paddle 1.4.1 for distributed training, setting is_distributed=True on the fluid Embedding interface causes a Runtime Error:
fluid.layers.embedding(is_sparse=True, is_distributed=True)
The error message is as follows:
File "train_dist.py", line 20…
-
I started with two files to understand your approach: Preprocessing of the GRADS SARC PBMC data and PCA of the GRADS PBMC baseline expression data. I have not yet seen a file that describes your appro…
-
Thought I'd report an issue, provided below, which I came across while running the SHAP explainer: the reduce_max() function does not accept the keyword 'keepdims'.
According to https://github.com…
-
https://github.com/microsoft/DeepSpeed/blob/80f94c10c552ec79473775adb8902b210656ed76/deepspeed/runtime/engine.py#L1384
I wonder why overlap_comm cannot be used with ZeRO stage 1 to further reduce latency?
Appr…
-
# The list comprehension is redundant; tf.trainable_variables() already returns a list.
grads = opt.compute_gradients(loss_fn, var_list=tf.trainable_variables())