OpenNMT / CTranslate2

Fast inference engine for Transformer models
https://opennmt.net/CTranslate2
MIT License

Train Model Load #1162

Closed syngokhan closed 1 year ago

syngokhan commented 1 year ago

Hello, and thank you for making this library available to us. First of all, I apologize if this question has already been answered; I could not find it in the documentation, or perhaps I overlooked it.

I have a DialoGPT model that I fine-tuned with PEFT. Can I run this fine-tuned model with your library? I want to speed up the generation step.

I would greatly appreciate any help with this. Thanks again for your work.

guillaumekln commented 1 year ago

Hi,

Did you already try using the conversion script ct2-transformers-converter? If yes, did you encounter any errors?
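For reference, a basic invocation looks like the following; the model name and output directory are only placeholders, so point them at your own fine-tuned model and a destination folder:

ct2-transformers-converter --model microsoft/DialoGPT-medium --output_dir dialogpt_ct2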

syngokhan commented 1 year ago

Hello @guillaumekln, thank you for your reply. I applied the steps below and the conversion completed without any errors, but, as I mentioned at the beginning, the model is not loaded properly because I trained it with PEFT. Below is the list of files produced when training with PEFT. As you know, as models grow larger, training with PEFT is what makes them fit on the available hardware, so it would be a real shame not to be able to use your library, because it is exactly the solution I need. Using it with other models such as Marian has already helped me a lot.

Will there be a solution for this in a future update? Is it on your roadmap? Thank you very much again; the library has solved most of my problems, and I hope you find a solution for PEFT-trained models.

/content/PEFT/V_1.0
├── adapter_config.json        <- PEFT adapter
├── adapter_model.bin          <- PEFT adapter
├── config.json
├── eval_results.txt
├── flax_model.msgpack
├── generation_config_for_conversational.json
├── generation_config.json
├── gpt2_cached_lm_512
├── merges.txt
├── pytorch_model.bin
├── README.md
├── special_tokens_map.json
├── tf_model.h5
├── tokenizer_config.json
├── tokenizer.json
├── training_args.bin
└── vocab.json


PEFT Load example ...

import torch
from transformers import AutoTokenizer, AutoModelWithLMHead
from peft import PeftModel, PeftConfig

peft_model_id = "/content/PEFT/V_1.0/"

# Load the adapter configuration to find the base model it was trained on.
config = PeftConfig.from_pretrained(peft_model_id)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

model = AutoModelWithLMHead.from_pretrained(
    config.base_model_name_or_path,
    torchscript=True,
    #load_in_8bit=True,
    #device_map="auto",
    use_cache=False,
)

device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Attach the PEFT adapter weights on top of the base model.
model = PeftModel.from_pretrained(model, peft_model_id)

if torch.__version__.startswith("2.0"):
    print("Model Compile Using...")
    model_comp = torch.compile(model)

------
CTranslate2 Save and Load example ...

import ctranslate2
import transformers

if __name__ == "__main__":

    path = "/home/glb90092345/DialoGPT/TrainModel/V_3.5_NoPunctationDataPeft/"  
    save_path = "/home/glb90092345/TEST"

    #t = ctranslate2.converters.TransformersConverter(path)
    #t.convert(save_path)

    generator = ctranslate2.Generator(save_path, device = "cuda")
    tokenizer = transformers.AutoTokenizer.from_pretrained(path)

    # Unconditional generation.
    start_tokens = [tokenizer.bos_token]
    results = generator.generate_batch([start_tokens], max_length=30, sampling_topk=10)
    print(tokenizer.decode(results[0].sequences_ids[0]))

    print("-")
    # Conditional generation.
    start_tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode("Hello How"))
    results = generator.generate_batch([start_tokens], max_length=110, sampling_topk=10)
    print(tokenizer.decode(results[0].sequences_ids[0]))

guillaumekln commented 1 year ago

I'm still not very familiar with PEFT, but I think you should look into merging the adapter weights and saving a full model. See for example https://github.com/huggingface/peft/issues/308. Then you should be able to convert the full model.
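A minimal sketch of that merge step, assuming a recent peft release that provides merge_and_unload (the output directory below is just an example):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel, PeftConfig

peft_model_id = "/content/PEFT/V_1.0/"
merged_dir = "/content/PEFT/V_1.0_merged"  # example output directory

# Load the base model referenced by the adapter configuration.
config = PeftConfig.from_pretrained(peft_model_id)
base_model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)

# Attach the adapter, then fold its weights into the base model.
model = PeftModel.from_pretrained(base_model, peft_model_id)
merged_model = model.merge_and_unload()

# Save a full standalone checkpoint that the converter can read.
merged_model.save_pretrained(merged_dir)
AutoTokenizer.from_pretrained(config.base_model_name_or_path).save_pretrained(merged_dir)

After that, running the converter on the merged directory should produce a CTranslate2 model that includes the fine-tuned weights.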

guillaumekln commented 1 year ago

So currently the adapters should be merged into the base model before converting to CTranslate2.

I'm closing this issue in favor of #1186, which is about supporting the adapters directly.