Open fzyzcjy opened 3 days ago
I have confirmed that the same problem occurs. It is probably due to https://github.com/vllm-project/llm-compressor/blob/a47137d834a2be8f8fcd49458f121b08ba34e2c9/src/llmcompressor/transformers/finetune/training_args.py#L61; if you specify output_dir
as shown below, the model will only be saved once.
```diff
- oneshot(model=model, recipe=recipe)
- model.save_pretrained(SAVE_DIR)
+ oneshot(model=model, recipe=recipe, output_dir=SAVE_DIR)
  tokenizer.save_pretrained(SAVE_DIR)
```
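To make the redundancy concrete, here is a toy sketch (not the real llm-compressor API; `toy_oneshot` and `ToyModel` are hypothetical stand-ins) showing that keeping the explicit `model.save_pretrained(SAVE_DIR)` alongside `output_dir` writes the same folder twice:

```python
# Toy illustration of the double save: toy_oneshot mimics oneshot's
# auto-save when output_dir is given, and the explicit save_pretrained
# call afterwards writes the same directory a second time.

saves = []  # record every save target, in order

class ToyModel:
    def save_pretrained(self, path):
        saves.append(path)

def toy_oneshot(model, output_dir=None):
    # Hypothetical stand-in: auto-save only when output_dir is provided.
    if output_dir is not None:
        model.save_pretrained(output_dir)

SAVE_DIR = "out/"
model = ToyModel()
toy_oneshot(model, output_dir=SAVE_DIR)
model.save_pretrained(SAVE_DIR)   # the line the diff above removes

assert saves == [SAVE_DIR, SAVE_DIR]  # the folder was written twice
```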
Well, it seems the save operation is done twice, but both write (and overwrite) the same save folder... Did you delete model.save_pretrained(SAVE_DIR)?
@kota-iizuka I personally want to keep model.save_pretrained (and tokenizer.save_pretrained) for finer control, indeed.
@fzyzcjy I understand your request, and I agree that it would be nice to have an option to not save the model when running oneshot() if you don't specify output_dir (or if you specify some special parameter).
(On the other hand, I personally don't have much motivation to fix that myself, since I just want the resulting model...)
Hi @fzyzcjy @kota-iizuka, if you pull down the latest main, you can avoid saving twice by not providing an output_dir
to the oneshot call. It will only save to the output_dir if the kwarg is provided, or if you pass a string (a model path) as the model input rather than an actual model instance.
Thanks!
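The rule described above can be sketched as a small predicate. This is a hypothetical reconstruction of the behavior, not llm-compressor's actual code; `should_autosave` and the model-path string are illustrative assumptions:

```python
# Hypothetical sketch of the save rule on latest main: save automatically
# only when output_dir is explicitly passed, or when the model argument is
# a path string rather than a model instance.

def should_autosave(model, output_dir=None):
    """Return True if a oneshot-style entrypoint should save the model itself."""
    if output_dir is not None:
        # Caller explicitly asked for an output location.
        return True
    # A string means the model is loaded internally, so the caller has no
    # instance to call save_pretrained() on afterwards; save on their behalf.
    return isinstance(model, str)

class DummyModel:
    pass

assert should_autosave("some/model-path") is True          # string input: saves
assert should_autosave(DummyModel(), output_dir="out/") is True  # kwarg given: saves
assert should_autosave(DummyModel()) is False              # instance, no kwarg: no save
```

With this rule, passing an already-loaded model and omitting output_dir leaves all saving to the explicit save_pretrained calls, which avoids the double write.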
Hi, thanks for the lib! When checking https://github.com/vllm-project/llm-compressor/issues/935, it seems that
oneshot
auto-saves everything to the output folder. That looks great, but if I understand correctly, https://github.com/vllm-project/llm-compressor/blob/a47137d834a2be8f8fcd49458f121b08ba34e2c9/examples/quantization_kv_cache/llama3_fp8_kv_example.py#L99 here we also save manually. Thus it seems the example script saves everything twice.