Closed panjianfei closed 1 month ago
The first step would be to have a .nemo version of Qwen1.5. This was just merged into NeMo: https://github.com/NVIDIA/NeMo/pull/9055 ==> you may try the conversion script.
Then hopefully it will work seamlessly in NeMo-Aligner, but since we haven't tested it, it's quite possible that some issues may come up (I would also be surprised if it worked out-of-the-box with TRT-LLM generation).
thank you, 👍
@odelalleau I tried to convert a Qwen2 HF checkpoint to NeMo format with https://github.com/NVIDIA/NeMo/pull/9055, and something went wrong: `f'model.decoder.layers.{l}.self_attention.linear_qkv.bias'` is reported as an unexpected key. I had to comment out lines https://github.com/NVIDIA/NeMo/blob/main/scripts/checkpoint_converters/convert_qwen2_hf_to_nemo.py#L205-L225 to make it run.
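For context, the lines in question fuse the separate HF `q_proj`/`k_proj`/`v_proj` biases into the single `linear_qkv.bias` tensor that the fused NeMo layer expects. A minimal sketch of that fusion, using plain Python lists and assuming a grouped-query-attention interleaving (the `fuse_qkv_bias` helper and the exact layout are illustrative, not the script's actual code):

```python
# Hypothetical sketch of the q/k/v bias fusion done by the conversion script.
# HF key names follow the Qwen2 layout; the per-group interleaving shown here
# is an assumption about the fused linear_qkv layout, not verified NeMo code.

def fuse_qkv_bias(hf_state, layer, heads, head_dim, kv_heads):
    """Concatenate per-projection biases into one linear_qkv.bias vector,
    interleaving q/k/v per query group (grouped-query attention)."""
    q = hf_state[f"model.layers.{layer}.self_attn.q_proj.bias"]
    k = hf_state[f"model.layers.{layer}.self_attn.k_proj.bias"]
    v = hf_state[f"model.layers.{layer}.self_attn.v_proj.bias"]
    heads_per_group = heads // kv_heads
    fused = []
    for g in range(kv_heads):
        # all query heads belonging to this kv group, then its k and v biases
        q_start = g * heads_per_group * head_dim
        fused.extend(q[q_start:q_start + heads_per_group * head_dim])
        fused.extend(k[g * head_dim:(g + 1) * head_dim])
        fused.extend(v[g * head_dim:(g + 1) * head_dim])
    return fused
```

If this fusion step is skipped (commented out), the resulting `.nemo` checkpoint simply has no QKV bias at all, which matches the missing keys below.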
@odelalleau Does GPTModel support the attention bias? GPTModel's `named_parameters` only contains (showing layer 23):

```
model.decoder.layers.23.self_attention.linear_proj.weight
model.decoder.layers.23.self_attention.linear_qkv.layer_norm_weight
model.decoder.layers.23.self_attention.linear_qkv.weight
model.decoder.layers.23.mlp.linear_fc1.layer_norm_weight
model.decoder.layers.23.mlp.linear_fc1.weight
model.decoder.layers.23.mlp.linear_fc2.weight
model.decoder.final_layernorm.weight
model.output_layer.weight
```
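The absence of any `linear_qkv.bias` entry suggests the bias is disabled in the model config rather than unsupported: megatron-core's `TransformerConfig` exposes an `add_qkv_bias` flag (default `False`) separate from the general `bias` flag. Assuming the NeMo model config passes these through, a fragment like the following might be what a Qwen checkpoint needs (key names are an assumption, not verified against the converter's output):

```yaml
# hypothetical NeMo model config fragment for Qwen-style attention bias;
# exact key names may differ in the generated .nemo config
model:
  bias: false        # Qwen uses no bias on most linear layers
  add_qkv_bias: true # but does use a bias on the fused QKV projection
```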
Can you share a pipeline for fine-tuning a Qwen1.5 model with NeMo?