-
Hi, I saw there is an implementation of gradient checkpointing for TF1 code. Do you have a plan to support it for TF2/Keras? I think this is a useful feature. BTW, it would be great if you also support Accumu…
-
According to the [patch_batchnorm](https://github.com/facebookresearch/fairscale/blob/main/fairscale/nn/checkpoint/checkpoint_utils.py#L13-L50) source code, if a layer is collecting running stats (e.g. BatchNor…
-
I am trying to fine-tune Qwen-2.5 Coder-7B-Instruct on my custom dataset, but I am getting the following error:
```
ValueError: Unsloth: Untrained tokens of [[]] found, but embed_tokens & lm_head not t…
```
-
We are trying to use Longformer and BERT models for multi-label classification of different documents.
When we use the BERT model (BertForSequenceClassification) with max length 512 (batch size 8…
-
Some basic example code using Llama 3 in 4-bit from the Unsloth HF repos:
```
model = FastLanguageModel.get_peft_model(
    model,
    r = 32,
    target_modules = ["q_proj", "k_proj", "v_proj", "…
```
-
[Errno 2] No such file or directory: '../dataset/ReC/mdetr/OpenSource/finetune_refall_train.json' when I run the command `accelerate launch --mixed_precision="fp16" --gpu_ids='all' --multi_gpu --main_pro…
-
` self.critic.gradient_checkpointing_enable()
File "/opt/conda/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 454, in __getattr__
raise AttributeError(f"'{type(self).__name_…
-
I tried to implement multi-backend gradient checkpointing in https://github.com/pass-lin/bert4keras3
but I encountered some problems, for example when implementing it in the TF backend:
```python
class Scal…
```