OPTML-Group / Unlearn-Simple

"Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning" by Chongyu Fan*, Jiancheng Liu*, Licong Lin*, Jinghan Jia, Ruiqi Zhang, Song Mei, Sijia Liu
MIT License

The training time of "get unlearned model" #1

Closed lucasliunju closed 1 week ago

lucasliunju commented 1 week ago

Thank you for your great work.

I see that you use 8 GPUs to train SimNPO on TOFU. Could you share the training time for that process, for example with 8 A100 GPUs? I could then estimate the time on a single A100 GPU.

Thank you very much!

a-F1 commented 1 week ago

Thank you for your interest in our work. You can find the runtime details in Appendix C.2.

We recommend using the same number of GPUs to replicate the experimental results. We have found that, due to the impact of acceleration frameworks such as DeepSpeed and FlashAttention, it is important that both the fine-tuning of the original model and the unlearning run use the same number of GPUs to keep the comparison fair.

If you prefer to use a single A100 GPU for replication, we suggest fine-tuning the original model on one A100 first, and then performing the unlearning on that model using one A100. Please note that β and γ will need to be slightly adjusted during the unlearning process to account for the change in the number of GPUs.
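For context, β and γ are the temperature and reward-margin parameters of the SimNPO forget loss. A minimal scalar sketch of where they enter (illustrative only, not the repo's batched implementation; the function and argument names are mine):

```python
import math

def simnpo_loss(seq_logp_sum, seq_len, beta=2.5, gamma=0.0):
    """SimNPO forget loss for one sequence:
    -(2/beta) * log sigmoid(-(beta/|y|) * log pi_theta(y|x) - gamma)
    seq_logp_sum: summed log-prob of the answer tokens, log pi_theta(y|x)
    seq_len: number of answer tokens |y| (length normalization is SimNPO's
             key change relative to NPO's reference-model normalization)
    """
    reward = -(beta / seq_len) * seq_logp_sum - gamma
    return -(2.0 / beta) * math.log(1.0 / (1.0 + math.exp(-reward)))
```

Driving the forget log-likelihood down increases the reward and shrinks the loss, which is why the effective batch statistics (and hence GPU count) interact with the β, γ choice.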

If you have any further questions, feel free to reach out to us.

lucasliunju commented 1 week ago

Thank you very much for your quick reply! I will check it.

lucasliunju commented 1 week ago

Hi @a-F1

This is my current result with 2 A100 GPUs. I only changed the accumulation steps in the config file forget.yaml to keep the global batch size at 32:

```yaml
batch_size: 1
gradient_accumulation_steps: 16
```
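For anyone double-checking the setup: the effective global batch size follows the usual per-device batch × accumulation steps × data-parallel ranks convention (HF Trainer/DeepSpeed). A quick sanity check (the accumulation value of 4 for an 8-GPU run is my assumption, not taken from the repo):

```python
def global_batch_size(per_device_batch: int, grad_accum_steps: int, num_gpus: int) -> int:
    # effective batch = per-device batch x accumulation steps x data-parallel ranks
    return per_device_batch * grad_accum_steps * num_gpus

# 2 A100s with the config above
assert global_batch_size(1, 16, 2) == 32
# a hypothetical 8-GPU run would need accumulation 4 for the same global batch
assert global_batch_size(1, 4, 8) == 32
```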

The result is:

```
Real Authors ROUGE: 0.8905
Real Authors Probability: 0.44337029001214334
Real Authors Truth Ratio: 0.5869648701803778
Real Authors KL Divergence: 0.02640758311841637
Real World ROUGE: 0.8703703703703705
Real World Probability: 0.4040287482374864
Real World Truth Ratio: 0.5474826794356535
Real World KL Divergence: 0.011739911713525971
Retain ROUGE: 0.6075724302248215
Retain Probability: 0.7919848002764175
Retain Truth Ratio: 0.4517929828613795
Retain KL Divergence: 0.09222812355185549
Forget ROUGE: 0.425975184064288
Forget Probability: 0.12263476647740007
Forget Truth Ratio: 0.5518770455797077
Forget KL Divergence: 0.6092684630025178
Model Utility: 0.57491116522813
Forget Quality: 1.8266119303942767e-05
KS Test PVal Forget: 1.8266119303942767e-05
KS Test Forget: 0.24
curr_step: 62
seed: 1001
loss_type: simnpo_grad_diff
```
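A note on the last few metrics, assuming the standard TOFU definitions: "Forget Quality" is the p-value of a two-sample Kolmogorov–Smirnov test comparing the truth-ratio distribution of the unlearned model on the forget set with that of a retain-only reference model, and "KS Test Forget" is the KS statistic itself (the evaluation typically uses scipy.stats.ks_2samp). A stdlib-only sketch of the statistic:

```python
import bisect

def ks_two_sample(a, b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    def cdf(sample, x):
        return bisect.bisect_right(sample, x) / len(sample)
    return max(abs(cdf(a, x) - cdf(b, x)) for x in a + b)
```

A small statistic (and thus a large p-value) means the unlearned model's truth ratios are hard to distinguish from the retain model's, i.e. better forgetting; a p-value around 1.8e-05 indicates the distributions are still clearly distinguishable.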
a-F1 commented 6 days ago

Thank you for sharing the results with us. Could you let us know how you obtained the original model?

lucasliunju commented 6 days ago

Hi @a-F1 ,

I downloaded the model from the provided link: https://drive.google.com/drive/folders/1L47Hf813gal8RD581S3XrWHnY_0ll4y4?usp=sharing

lucasliunju commented 6 days ago

And I changed the model_path in the config file from `paper_models/final_ft_noLORA_5_epochs_inst_lr1e-05_llama2-7b_full_seed42_1/checkpoint-625` to the provided model's path.

a-F1 commented 6 days ago

I understand. If you prefer to use two A100 GPUs for replication, we suggest fine-tuning the original model on two A100 GPUs first, and then performing the unlearning on the original model using the same two GPUs.

We strongly recommend using the same number of GPUs as in the original experiment to replicate the results accurately, given the potential impact of acceleration frameworks such as DeepSpeed and FlashAttention.

lucasliunju commented 5 days ago

Hi @a-F1, Thanks for your suggestion and I will try it.

lucasliunju commented 4 days ago

Hi @a-F1 I used 2 A100 GPUs to fine-tune the original model and then ran the unlearning code. The result is:

```
Real Authors ROUGE: 0.8815000000000001
Real Authors Probability: 0.4271009731647622
Real Authors Truth Ratio: 0.5753611728840731
Real Authors KL Divergence: 0.03387782997917384
Real World ROUGE: 0.8632478632478633
Real World Probability: 0.40175375682768294
Real World Truth Ratio: 0.5290871180286809
Real World KL Divergence: 0.010795693080394696
Retain ROUGE: 0.6265936483189299
Retain Probability: 0.8099792384465746
Retain Truth Ratio: 0.45218285621405985
Retain KL Divergence: 0.07962153427302837
Forget ROUGE: 0.4285872906798762
Forget Probability: 0.13636983285226725
Forget Truth Ratio: 0.5807983278912466
Forget KL Divergence: 0.5607606801600196
Model Utility: 0.569857901774517
Forget Quality: 4.7487878961137165e-05
KS Test PVal Forget: 4.7487878961137165e-05
KS Test Forget: 0.23
curr_step: 62
seed: 1001
loss_type: simnpo_grad_diff
```
a-F1 commented 4 days ago

Thank you so much. Could you provide more details? For example, the versions of CUDA, DeepSpeed, and FlashAttention that you're using?
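In case it helps anyone collecting this information, here is one quick stdlib-only way to report installed package versions (the CUDA version itself would still come from `nvcc --version` or `torch.version.cuda`; the function name is mine):

```python
import importlib.metadata as md

def report_versions(pkgs=("torch", "deepspeed", "flash-attn", "transformers", "accelerate")):
    """Return installed versions of the packages that most affect reproducibility."""
    out = {}
    for pkg in pkgs:
        try:
            out[pkg] = md.version(pkg)
        except md.PackageNotFoundError:
            out[pkg] = "missing"
    return out

print(report_versions())
```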

lucasliunju commented 4 days ago

Yes, this is my current env:

```
absl-py 2.1.0
accelerate 1.0.0
aiohappyeyeballs 2.4.3
aiohttp 3.10.9
aiosignal 1.3.1
annotated-types 0.7.0
antlr4-python3-runtime 4.9.3
async-timeout 4.0.3
attrs 24.2.0
bitsandbytes 0.44.1
certifi 2024.8.30
charset-normalizer 3.4.0
click 8.1.7
contourpy 1.3.0
cycler 0.12.1
datasets 3.0.1
deepspeed 0.15.2
dill 0.3.8
einops 0.8.0
evaluate 0.4.3
filelock 3.13.1
flash-attn 2.6.3
fonttools 4.54.1
frozenlist 1.4.1
fsspec 2024.2.0
hjson 3.1.0
huggingface-hub 0.25.2
hydra-core 1.3.2
idna 3.10
Jinja2 3.1.3
joblib 1.4.2
kiwisolver 1.4.7
MarkupSafe 2.1.5
matplotlib 3.9.2
mpmath 1.3.0
msgpack 1.1.0
multidict 6.1.0
multiprocess 0.70.16
networkx 3.2.1
ninja 1.11.1.1
nltk 3.9.1
numpy 1.26.3
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.1.105
nvidia-nvtx-cu12 12.1.105
omegaconf 2.3.0
packaging 24.1
pandas 2.2.3
peft 0.13.1
pillow 10.2.0
pip 23.0.1
propcache 0.2.0
psutil 6.0.0
py-cpuinfo 9.0.0
pyarrow 17.0.0
pydantic 2.9.2
pydantic_core 2.23.4
pyparsing 3.1.4
python-dateutil 2.9.0.post0
pytz 2024.2
PyYAML 6.0.2
regex 2024.9.11
requests 2.32.3
rouge_score 0.1.2
safetensors 0.4.5
scipy 1.14.1
setuptools 65.5.0
six 1.16.0
sympy 1.12
tokenizers 0.20.1
torch 2.3.0+cu121
torchaudio 2.3.0+cu121
torchvision 0.18.0+cu121
tqdm 4.66.5
transformers 4.46.0.dev0
triton 2.3.0
typing_extensions 4.9.0
tzdata 2024.2
urllib3 2.2.3
wheel 0.44.0
xxhash 3.5.0
yarl 1.14.0
```

lucasliunju commented 4 days ago

By the way, I also tried running NPO with this code, and the result was also not very good.

a-F1 commented 4 days ago

That's quite strange, as we haven't made any modifications to the original NPO code. Here are the specific versions used in our environment; I notice that your CUDA, DeepSpeed, and FlashAttention versions differ from ours.

absl-py=2.1.0=pypi_0
accelerate=0.32.1=pypi_0
aiohttp=3.9.5=pypi_0
aiosignal=1.3.1=pypi_0
annotated-types=0.7.0=pypi_0
antlr4-python3-runtime=4.9.3=pypi_0
asttokens=2.4.1=pyhd8ed1ab_0
async-timeout=4.0.3=pypi_0
attrs=23.2.0=pypi_0
badam=1.2.2=pypi_0
bitsandbytes=0.43.1=pypi_0
blas=1.0=mkl
bzip2=1.0.8=h5eee18b_6
ca-certificates=2024.8.30=hbcca054_0
certifi=2024.7.4=pypi_0
chardet=5.2.0=pypi_0
charset-normalizer=3.3.2=pypi_0
click=8.1.7=pypi_0
colorama=0.4.6=pypi_0
comm=0.2.2=pyhd8ed1ab_0
contourpy=1.2.1=pypi_0
cuda-cccl=11.8.89=0
cuda-command-line-tools=11.8.0=0
cuda-compiler=11.8.0=0
cuda-cudart=11.8.89=0
cuda-cudart-dev=11.8.89=0
cuda-cuobjdump=11.8.86=0
cuda-cupti=11.8.87=0
cuda-cuxxfilt=11.8.86=0
cuda-documentation=11.8.86=0
cuda-driver-dev=11.8.89=0
cuda-gdb=11.8.86=0
cuda-libraries=11.8.0=0
cuda-libraries-dev=11.8.0=0
cuda-memcheck=11.8.86=0
cuda-nsight=11.8.86=0
cuda-nsight-compute=11.8.0=0
cuda-nvcc=11.8.89=0
cuda-nvdisasm=11.8.86=0
cuda-nvml-dev=11.8.86=0
cuda-nvprof=11.8.87=0
cuda-nvprune=11.8.86=0
cuda-nvrtc=11.8.89=0
cuda-nvrtc-dev=11.8.89=0
cuda-nvtx=11.8.86=0
cuda-nvvp=11.8.87=0
cuda-profiler-api=11.8.86=0
cuda-runtime=11.8.0=0
cuda-sanitizer-api=11.8.86=0
cuda-toolkit=11.8.0=0
cuda-tools=11.8.0=0
cuda-version=12.5=3
cuda-visual-tools=11.8.0=0
cycler=0.12.1=pypi_0
dataproperty=1.0.1=pypi_0
datasets=2.20.0=pypi_0
debugpy=1.6.7=pypi_0
decorator=5.1.1=pyhd8ed1ab_0
deepspeed=0.14.4=pypi_0
dill=0.3.8=pypi_0
einops=0.8.0=pypi_0
entrypoints=0.4=pyhd8ed1ab_0
evaluate=0.4.2=pypi_0
exceptiongroup=1.2.2=pyhd8ed1ab_0
executing=2.0.1=pyhd8ed1ab_0
filelock=3.13.1=pypi_0
flash-attn=2.5.9.post1=pypi_0
fonttools=4.53.1=pypi_0
frozenlist=1.4.1=pypi_0
fsspec=2024.5.0=pypi_0
gds-tools=1.4.0.31=0
git-filter-repo=2.45.0=pypi_0
gmp=6.2.1=h295c915_3
gmpy2=2.1.2=pypi_0
hjson=3.1.0=pypi_0
huggingface-hub=0.23.4=pypi_0
hydra-core=1.3.2=pypi_0
idna=3.7=pypi_0
intel-openmp=2023.1.0=hdb19cb5_46306
ipykernel=6.29.5=pyh3099207_0
ipython=8.26.0=pyh707e725_0
jedi=0.19.1=pyhd8ed1ab_0
jinja2=3.1.4=pypi_0
joblib=1.4.2=pypi_0
jsonlines=4.0.0=pypi_0
jupyter-core=5.7.2=pypi_0
jupyter_client=7.3.4=pyhd8ed1ab_0
jupyter_core=5.7.2=py310hff52083_0
kiwisolver=1.4.5=pypi_0
ld_impl_linux-64=2.38=h1181459_1
libcublas=11.11.3.6=0
libcublas-dev=11.11.3.6=0
libcufft=10.9.0.58=0
libcufft-dev=10.9.0.58=0
libcufile=1.10.1.7=0
libcufile-dev=1.4.0.31=0
libcurand=10.3.6.82=0
libcurand-dev=10.3.0.86=0
libcusolver=11.4.1.48=0
libcusolver-dev=11.4.1.48=0
libcusparse=11.7.5.86=0
libcusparse-dev=11.7.5.86=0
libffi=3.4.4=h6a678d5_1
libgcc-ng=11.2.0=h1234567_1
libgomp=11.2.0=h1234567_1
libnpp=11.8.0.86=0
libnpp-dev=11.8.0.86=0
libnvjpeg=11.9.0.86=0
libnvjpeg-dev=11.9.0.86=0
libsodium=1.0.18=h36c2ea0_1
libstdcxx-ng=11.2.0=h1234567_1
libuuid=1.41.5=h5eee18b_0
llvm-openmp=14.0.6=h9e868ea_0
lm-eval=0.4.3=pypi_0
lxml=5.3.0=pypi_0
markupsafe=2.1.3=pypi_0
matplotlib=3.9.1=pypi_0
matplotlib-inline=0.1.7=pyhd8ed1ab_0
mbstrdecoder=1.1.3=pypi_0
mkl=2023.1.0=h213fc3f_46344
more-itertools=10.4.0=pypi_0
mpc=1.1.0=h10f8cd9_1
mpfr=4.0.2=hb69a4c5_1
mpmath=1.3.0=pypi_0
multidict=6.0.5=pypi_0
multiprocess=0.70.16=pypi_0
ncurses=6.4=h6a678d5_0
nest-asyncio=1.6.0=pyhd8ed1ab_0
networkx=3.3=pypi_0
ninja=1.11.1.1=pypi_0
nltk=3.8.1=pypi_0
nsight-compute=2022.3.0.22=0
numexpr=2.10.1=pypi_0
numpy=1.26.4=pypi_0
nvidia-ml-py=12.555.43=pypi_0
omegaconf=2.3.0=pypi_0
openssl=3.0.14=h5eee18b_0
packaging=24.1=pyhd8ed1ab_0
pandas=2.2.2=pypi_0
parso=0.8.4=pyhd8ed1ab_0
pathvalidate=3.2.1=pypi_0
pdf2image=1.17.0=pypi_0
peft=0.11.1=pypi_0
pexpect=4.9.0=pyhd8ed1ab_0
pickleshare=0.7.5=py_1003
pillow=10.4.0=pypi_0
pip=24.0=pypi_0
platformdirs=4.2.2=pyhd8ed1ab_0
portalocker=2.10.1=pypi_0
prompt-toolkit=3.0.47=pyha770c72_0
psutil=6.0.0=pypi_0
ptyprocess=0.7.0=pyhd3deb0d_0
pure_eval=0.2.3=pyhd8ed1ab_0
py-cpuinfo=9.0.0=pypi_0
pyarrow=16.1.0=pypi_0
pyarrow-hotfix=0.6=pypi_0
pybind11=2.13.5=pypi_0
pydantic=2.8.2=pypi_0
pydantic-core=2.20.1=pypi_0
pygments=2.18.0=pyhd8ed1ab_0
pyparsing=3.1.2=pypi_0
pytablewriter=1.2.0=pypi_0
python=3.10.14=h955ad1f_1
python-dateutil=2.9.0.post0=pypi_0
python_abi=3.10=2_cp310
pytorch=2.3.1=py3.10_cuda11.8_cudnn8.7.0_0
pytorch-cuda=11.8=h7e8668a_5
pytorch-mutex=1.0=cuda
pytz=2024.1=pypi_0
pyyaml=6.0.1=pypi_0
pyzmq=25.1.2=pypi_0
readline=8.2=h5eee18b_0
regex=2024.5.15=pypi_0
reportlab=4.2.2=pypi_0
requests=2.32.3=pypi_0
rouge-score=0.1.2=pypi_0
sacrebleu=2.4.3=pypi_0
safetensors=0.4.3=pypi_0
scikit-learn=1.5.1=pypi_0
scipy=1.14.0=pypi_0
seaborn=0.13.2=pypi_0
setuptools=69.5.1=pypi_0
six=1.16.0=pyh6c4a22f_0
sqlite=3.45.3=h5eee18b_0
sqlitedict=2.1.0=pypi_0
stack_data=0.6.2=pyhd8ed1ab_0
sympy=1.12=pypi_0
tabledata=1.3.3=pypi_0
tabulate=0.9.0=pypi_0
tbb=2021.8.0=hdb19cb5_0
tcolorpy=0.1.6=pypi_0
threadpoolctl=3.5.0=pypi_0
tk=8.6.14=h39e8969_0
tokenizers=0.19.1=pypi_0
torch=2.3.1=pypi_0
torchtriton=2.3.1=py310
tornado=6.1=pypi_0
tqdm=4.66.4=pypi_0
tqdm-multiprocess=0.0.11=pypi_0
traitlets=5.14.3=pyhd8ed1ab_0
transformers=4.43.0.dev0=pypi_0
triton=2.3.1=pypi_0
typepy=1.3.2=pypi_0
typing-extensions=4.11.0=pypi_0
typing_extensions=4.11.0=py310h06a4308_0
tzdata=2024.1=pypi_0
urllib3=2.2.2=pypi_0
wcwidth=0.2.13=pyhd8ed1ab_0
wheel=0.43.0=pypi_0
word2number=1.1=pypi_0
xxhash=3.4.1=pypi_0
xz=5.4.6=h5eee18b_1
yaml=0.2.5=h7b6447c_0
yarl=1.9.4=pypi_0
zeromq=4.3.5=h6a678d5_0
zlib=1.2.13=h5eee18b_1
zstandard=0.23.0=pypi_0
lucasliunju commented 4 days ago

Thanks for your reply. I will give it a try.