-
I ran the code with:
```
python cli.py \
--method ipet \
--data_dir ../dataset/data \
--model_type bert \
--model_name_or_path bert-base-cased \
--task_name my-task \
--output_dir…
-
Inverted Jacobian products are useful in a variety of algorithms such as the efficient implementation of [Newton's method with regularization](https://math.stackexchange.com/questions/3287587/extracti…
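To make this concrete, here is a minimal sketch of Newton's method for a 2-D system, where each update is exactly an inverse-Jacobian product J(x)⁻¹f(x) (the 2×2 inverse written out by hand; the helper name `newton_2d` and the example system are illustrative, not from the linked post):

```python
def newton_2d(f, jac, x0, tol=1e-10, max_iter=50):
    """Newton's method for a 2-D system; each step applies the
    inverse-Jacobian product J(x)^-1 f(x), with the 2x2 inverse
    written out explicitly."""
    x, y = x0
    for _ in range(max_iter):
        fx, fy = f(x, y)
        if abs(fx) < tol and abs(fy) < tol:
            break
        a, b, c, d = jac(x, y)      # J = [[a, b], [c, d]]
        det = a * d - b * c         # assumed nonsingular near the root
        # J^-1 = (1/det) * [[d, -b], [-c, a]], so J^-1 f is:
        dx = (d * fx - b * fy) / det
        dy = (a * fy - c * fx) / det
        x, y = x - dx, y - dy
    return x, y

# Example: intersect the circle x^2 + y^2 = 4 with the line x = y;
# from (1.0, 0.5) this converges to (sqrt(2), sqrt(2)).
root = newton_2d(lambda x, y: (x * x + y * y - 4, x - y),
                 lambda x, y: (2 * x, 2 * y, 1.0, -1.0),
                 (1.0, 0.5))
```

In larger systems one would solve the linear system J d = f rather than form the inverse, but the 2-D case makes the inverse-Jacobian product explicit.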
-
Exciting paper! Thank you for doing this research and publishing it.
Do you want to share some insight on what type of compute is required for training LaVi-Bridge?
Since you've used around 2M t…
-
LLaMA-13B (HF) fails with OOM on dual A100-80GB GPUs.
For those who managed to run Alpaca against the 13B model, what specs and torchrun settings did you use?
`torchrun --nproc_per_node=2 --master_po…
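For context, a rough back-of-the-envelope (assuming full fine-tuning with mixed-precision Adam and no LoRA or ZeRO sharding; the 16 bytes/param figure is a common rule of thumb, not from this thread) suggests why two 80 GB cards run out of memory:

```python
# Rough memory footprint of fully fine-tuning a 13B-parameter model
# with mixed-precision Adam (rule of thumb, not an exact figure):
# fp16 weights (2 B) + fp16 grads (2 B) + fp32 master weights (4 B)
# + fp32 Adam m (4 B) + fp32 Adam v (4 B) = 16 bytes per parameter,
# before activations and CUDA overhead.
params = 13e9
bytes_per_param = 16
total_gib = params * bytes_per_param / 1024 ** 3
print(f"{total_gib:.0f} GiB")  # -> "194 GiB", above 2 x 80 GB
```

That gap is why people typically shard optimizer state across GPUs (FSDP/ZeRO) or offload it to CPU for 13B on this hardware.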
-
Training the UNet...
-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Current Behavior
python main.py \
--do_train \
--train_file AdvertiseGen/train.json \
--validat…
-
## 📚 Documentation
Documentation for `scatter_` and `scatter_add_` incorrectly states that "Moreover, as for gather(), the values of index must be between 0 and self.size(dim) - 1 inclusive, and a…
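Without taking a position on the exact doc wording, a pure-Python sketch of 1-D `scatter_` mechanics (not the PyTorch implementation) shows what the index values actually address — the destination along `dim`:

```python
def scatter_dim0(dest, index, src):
    """Sketch of dest.scatter_(0, index, src) for 1-D sequences:
    dest[index[i]] = src[i] for every i."""
    out = list(dest)
    for i, j in enumerate(index):
        # j addresses *dest*, so the valid range is 0 .. len(dest) - 1
        if not 0 <= j < len(out):
            raise IndexError(
                f"index value {j} out of range for destination of size {len(out)}"
            )
        out[j] = src[i]
    return out

# dest has 5 slots, so index values up to 4 are legal here:
scatter_dim0([0, 0, 0, 0, 0], [1, 3], [10, 20])  # -> [0, 10, 0, 20, 0]
```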
-
-
I want to do only training (lora is fine) for the head of the network, how do I do that? I get this error:
```bash
(beyond_scale_2_unsloth) brando9@ampere1~/beyond-scale-2-alignment-coeff $ python /…
-
Is there a way to enable zero3-offload for LLaMA-VID?
I'm trying to integrate an LLM with higher GPU RAM usage into LLaMA-VID, which means I can't run it without offloading to RAM, even at batch_size=…
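In DeepSpeed generally, ZeRO-3 with CPU offload is enabled through the config JSON; a minimal fragment is sketched below (whether LLaMA-VID's launch scripts pass such a config through unchanged is an assumption to verify against its repo):

```json
{
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "cpu", "pin_memory": true },
    "offload_param": { "device": "cpu", "pin_memory": true }
  },
  "train_micro_batch_size_per_gpu": 1
}
```

Offloading both optimizer state and parameters to CPU trades step time for GPU memory, which matches the use case described here.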