ylsung / Ladder-Side-Tuning

PyTorch codes for "LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning"
MIT License
230 stars 10 forks

Performance in Bert #2

Closed jongjyh closed 2 years ago

jongjyh commented 2 years ago

Hi, great work! Are there any GLUE results for BERT? I reproduced LST with BERT on MRPC and got 75% accuracy. Is that normal? Thanks! :)

ylsung commented 2 years ago

Hi,

I haven't run experiments on BERT before. Do you have the performance of the BERT model on MRPC with full fine-tuning (updating all parameters)? Also, parameter-efficient training usually needs a larger learning rate than full fine-tuning. Using a 10x larger learning rate is usually a good starting point.
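The learning-rate advice above can be turned into a small sweep. This is a minimal sketch, not taken from the repository: the helper name `candidate_learning_rates`, the multipliers, and the base rate `2e-5` are all illustrative assumptions.

```python
def candidate_learning_rates(full_ft_lr, multipliers=(1, 3, 10, 30)):
    """Return learning rates to try, scaled up from a full fine-tuning rate."""
    return [full_ft_lr * m for m in multipliers]


# Hypothetical full fine-tuning learning rate; this value is not from the thread.
full_ft_lr = 2e-5

for lr in candidate_learning_rates(full_ft_lr):
    print(f"try lr={lr:.1e}")
```

The 10x entry is the suggested starting point. The neighbouring values only serve to check whether accuracy is still improving as the rate grows.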

jongjyh commented 2 years ago

BERT with full fine-tuning usually gets 85% accuracy or higher on MRPC.

jongjyh commented 2 years ago

Hey, I'm also curious about the memory usage. I reproduced your T5-base experiment on MRPC and found 8.8 GB of memory usage according to nvidia-smi, whereas the paper reports only 5.5 GB. Could you explain the difference? Thanks!

ylsung commented 2 years ago

Then maybe try a larger learning rate first.

The memory cost may vary depending on the model architecture. What's the memory usage of full fine-tuning on BERT? LST should use less memory than that.

Also, tell me more details and concrete ideas if you want me to help, like what you have tried and what you suspect the issues could be. It's hard to give useful suggestions if I only know the final accuracy and memory cost.

ylsung commented 2 years ago

Actually, in Table 1, T5-base + LST is trained with 3 layers dropped in each of the encoder and decoder, which reduces the memory from 7GB (without dropping) to 5.5GB.

Reference: on page 7 of the paper, "We drop 6 layers (3 layers each in side encoder and decoder) for LST to match the parameter usage of the Adapter and LoRA."
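To put the numbers above side by side, here is a short arithmetic sketch. The memory figures and the 6 dropped layers come from this thread. The starting count of 24 side layers is an assumption: it supposes the side network has one layer per T5-base block (12 encoder + 12 decoder).

```python
# Numbers quoted in this thread.
mem_without_drop_gb = 7.0
mem_with_drop_gb = 5.5
dropped_layers = 6  # 3 in the side encoder + 3 in the side decoder

# Assumption: one side layer per T5-base block (12 encoder + 12 decoder).
side_layers_before = 12 + 12
side_layers_after = side_layers_before - dropped_layers

# Relative savings in training memory and in side-network depth.
memory_saving = (mem_without_drop_gb - mem_with_drop_gb) / mem_without_drop_gb
layer_saving = dropped_layers / side_layers_before

print(f"side layers: {side_layers_before} -> {side_layers_after}")
print(f"memory saving: {memory_saving:.1%}, layer saving: {layer_saving:.1%}")
```

Under that assumption, a run that keeps all side layers should be compared against the 7GB figure rather than the 5.5GB one.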

jongjyh commented 2 years ago

Thank you, I think I made a mistake! Everything is fine now.

ylsung commented 2 years ago

It's good to know!

BaohaoLiao commented 1 year ago

How do you monitor memory usage? Could you offer the script?
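The thread does not include the authors' measurement script. As one possible approach, the sketch below polls `nvidia-smi` using its `--query-gpu=memory.used` option and parses the result. The function names are illustrative, and this is not necessarily how the paper's numbers were obtained.

```python
import subprocess

# Prints one integer (MiB in use) per GPU, with no header and no unit suffix.
QUERY = ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"]


def parse_memory_used(output):
    """Parse nvidia-smi CSV output into a list of MiB values, one per GPU."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def gpu_memory_used_mib():
    """Query nvidia-smi once; requires an NVIDIA driver on the machine."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_used(result.stdout)


# Canned output for a two-GPU machine (values are made up for illustration).
sample = "9011\n3\n"
print(parse_memory_used(sample))  # [9011, 3]
```

Note that `nvidia-smi` reports everything the process holds on the device, including the CUDA context and memory cached by PyTorch's allocator. It therefore usually reads higher than the peak tensor memory returned by `torch.cuda.max_memory_allocated()` inside the training script. Which of the two a reported figure refers to matters when comparing against published numbers.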