Some weights of BertLMHeadModel were not initialized from the model checkpoint at bert-base-uncased and are newly initialized

commend I run !python anygpt/src/infer/cli_infer_base_model.py \ --model-name-or-path AnyGPT-base \ --image-tokenizer-path models/seed-tokenizer-2/seed_quantizer.pt \ --speech-tokenizer-path models/speechtokenizer/ckpt.dev \ --speech-tokenizer-config models/speechtokenizer/config.json \ --soundstorm-path models/soundstorm/speechtokenizer_soundstorm_mls.pt \ --output-dir "infer_output/base"

Below is the error

NeMo-text-processing :: INFO :: Creating ClassifyFst grammars. Using device: cuda loading image tokenzier /home//.cache/torch/hub/checkpoints/eva_vit_g.pth INFO:root:freeze vision encoder Some weights of BertLMHeadModel were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['bert.encoder.layer.11.output_query.dense.weight', 'bert.encoder.layer.0.crossattention.self.value.weight', 'bert.encoder.layer.5.output_query.LayerNorm.bias', 'bert.encoder.layer.8.output_query.LayerNorm.weight', 'bert.encoder.layer.2.crossattention.self.query.weight', 'bert.encoder.layer.10.crossattention.output.dense.bias', 'bert.encoder.layer.5.output_query.dense.weight', 'bert.encoder.layer.2.output_query.LayerNorm.weight', 'bert.encoder.layer.7.output_query.LayerNorm.bias', 'bert.encoder.layer.7.intermediate_query.dense.bias', 'bert.encoder.layer.6.output_query.LayerNorm.bias', 'bert.encoder.layer.11.output_query.dense.bias', 'bert.encoder.layer.1.intermediate_query.dense.bias', 'bert.encoder.layer.6.output_query.dense.bias', 'bert.encoder.layer.9.intermediate_query.dense.bias', 'bert.encoder.layer.11.intermediate_query.dense.weight', 'bert.encoder.layer.6.crossattention.output.dense.weight', 'bert.encoder.layer.3.output_query.LayerNorm.bias', 'bert.encoder.layer.8.crossattention.self.key.weight', 'bert.encoder.layer.0.crossattention.output.LayerNorm.weight', 'bert.encoder.layer.2.output_query.dense.weight', 'bert.encoder.layer.0.crossattention.self.key.bias', 'bert.encoder.layer.6.crossattention.self.query.weight', 'bert.encoder.layer.8.crossattention.self.value.weight', 'bert.encoder.layer.8.crossattention.output.dense.weight', 'bert.encoder.layer.8.crossattention.output.dense.bias', 'bert.encoder.layer.10.output_query.LayerNorm.weight', 'bert.encoder.layer.10.output_query.dense.weight', 'bert.encoder.layer.6.crossattention.self.query.bias', 'bert.encoder.layer.6.output_query.LayerNorm.weight', 'bert.encoder.layer.6.crossattention.self.value.bias', 'bert.encoder.layer.2.crossattention.self.value.weight', 'bert.encoder.layer.8.intermediate_query.dense.weight', 'bert.encoder.layer.2.output_query.LayerNorm.bias', 'bert.encoder.layer.6.crossattention.output.dense.bias', 'bert.encoder.layer.4.intermediate_query.dense.bias', 'bert.encoder.layer.10.crossattention.output.LayerNorm.weight', 'bert.encoder.layer.2.crossattention.output.LayerNorm.weight', 'bert.encoder.layer.1.intermediate_query.dense.weight', 'bert.encoder.layer.4.crossattention.self.key.weight', 'bert.encoder.layer.2.crossattention.self.query.bias', 'bert.encoder.layer.7.intermediate_query.dense.weight', 'bert.encoder.layer.10.crossattention.self.query.weight', 'bert.encoder.layer.9.intermediate_query.dense.weight', 'bert.encoder.layer.6.crossattention.output.LayerNorm.weight', 'bert.encoder.layer.9.output_query.LayerNorm.bias', 'bert.encoder.layer.3.intermediate_query.dense.weight', 'bert.encoder.layer.0.crossattention.self.query.weight', 'bert.encoder.layer.0.crossattention.self.value.bias', 'bert.encoder.layer.8.output_query.LayerNorm.bias', 'bert.encoder.layer.4.output_query.dense.bias', 'bert.encoder.layer.2.crossattention.self.key.bias', 'bert.encoder.layer.1.output_query.LayerNorm.bias', 'bert.encoder.layer.4.crossattention.output.LayerNorm.bias', 'bert.encoder.layer.6.crossattention.self.value.weight', 'bert.encoder.layer.4.crossattention.self.value.weight', 'bert.encoder.layer.0.output_query.LayerNorm.bias', 'bert.encoder.layer.9.output_query.LayerNorm.weight', 'bert.encoder.layer.4.crossattention.output.LayerNorm.weight', 'bert.encoder.layer.0.crossattention.output.dense.weight', 'bert.encoder.layer.7.output_query.LayerNorm.weight', 'bert.encoder.layer.8.crossattention.self.key.bias', 'bert.encoder.layer.8.output_query.dense.bias', 'bert.encoder.layer.0.intermediate_query.dense.weight', 'bert.encoder.layer.2.intermediate_query.dense.weight', 'bert.encoder.layer.0.crossattention.output.dense.bias', 'bert.encoder.layer.0.crossattention.self.query.bias', 'bert.encoder.layer.3.output_query.LayerNorm.weight', 'bert.encoder.layer.6.crossattention.output.LayerNorm.bias', 'bert.encoder.layer.10.crossattention.self.value.weight', 'bert.encoder.layer.2.crossattention.self.value.bias', 'bert.encoder.layer.11.output_query.LayerNorm.bias', 'bert.encoder.layer.6.crossattention.self.key.weight', 'bert.encoder.layer.4.crossattention.self.key.bias', 'bert.encoder.layer.0.output_query.dense.weight', 'bert.encoder.layer.4.crossattention.self.query.weight', 'bert.encoder.layer.6.crossattention.self.key.bias', 'bert.encoder.layer.5.intermediate_query.dense.weight', 'bert.encoder.layer.1.output_query.dense.weight', 'bert.encoder.layer.5.output_query.LayerNorm.weight', 'bert.encoder.layer.2.crossattention.output.LayerNorm.bias', 'bert.encoder.layer.9.output_query.dense.weight', 'bert.encoder.layer.4.crossattention.self.query.bias', 'bert.encoder.layer.11.intermediate_query.dense.bias', 'bert.encoder.layer.6.output_query.dense.weight', 'bert.encoder.layer.5.output_query.dense.bias', 'bert.encoder.layer.6.intermediate_query.dense.weight', 'bert.encoder.layer.2.output_query.dense.bias', 'bert.encoder.layer.8.crossattention.output.LayerNorm.bias', 'bert.encoder.layer.4.crossattention.self.value.bias', 'bert.encoder.layer.1.output_query.LayerNorm.weight', 'bert.encoder.layer.10.output_query.LayerNorm.bias', 'bert.encoder.layer.3.output_query.dense.weight', 'bert.encoder.layer.4.output_query.LayerNorm.weight', 'bert.encoder.layer.8.crossattention.self.value.bias', 'bert.encoder.layer.8.crossattention.output.LayerNorm.weight', 'bert.encoder.layer.10.output_query.dense.bias', 'bert.encoder.layer.8.crossattention.self.query.bias', 'bert.encoder.layer.4.intermediate_query.dense.weight', 'bert.encoder.layer.0.crossattention.output.LayerNorm.bias', 'bert.encoder.layer.0.crossattention.self.key.weight', 'bert.encoder.layer.10.crossattention.self.key.bias', 'bert.encoder.layer.0.output_query.dense.bias', 'bert.encoder.layer.4.output_query.LayerNorm.bias', 'bert.encoder.layer.3.output_query.dense.bias', 'bert.encoder.layer.7.output_query.dense.bias', 'bert.encoder.layer.3.intermediate_query.dense.bias', 'bert.encoder.layer.1.output_query.dense.bias', 'bert.encoder.layer.4.output_query.dense.weight', 'bert.encoder.layer.10.crossattention.output.dense.weight', 'bert.encoder.layer.8.crossattention.self.query.weight', 'bert.encoder.layer.10.crossattention.self.query.bias', 'bert.encoder.layer.9.output_query.dense.bias', 'bert.encoder.layer.4.crossattention.output.dense.bias', 'bert.encoder.layer.7.output_query.dense.weight', 'bert.encoder.layer.2.intermediate_query.dense.bias', 'bert.encoder.layer.10.crossattention.self.value.bias', 'bert.encoder.layer.10.crossattention.output.LayerNorm.bias', 'bert.encoder.layer.0.output_query.LayerNorm.weight', 'bert.encoder.layer.5.intermediate_query.dense.bias', 'bert.encoder.layer.4.crossattention.output.dense.weight', 'bert.encoder.layer.8.output_query.dense.weight', 'bert.encoder.layer.6.intermediate_query.dense.bias', 'bert.encoder.layer.2.crossattention.output.dense.weight', 'bert.encoder.layer.10.intermediate_query.dense.weight', 'bert.encoder.layer.0.intermediate_query.dense.bias', 'bert.encoder.layer.2.crossattention.output.dense.bias', 'bert.encoder.layer.10.intermediate_query.dense.bias', 'bert.encoder.layer.8.intermediate_query.dense.bias', 'bert.encoder.layer.2.crossattention.self.key.weight', 'bert.encoder.layer.10.crossattention.self.key.weight', 'bert.encoder.layer.11.output_query.LayerNorm.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. missing keys: 511 unexpected keys: 146 oading music tokenizer Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration. loading audio tokenizer Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration. loading llm

OpenMOSS / AnyGPT

Some weights of BertLMHeadModel were not initialized from the model checkpoint at bert-base-uncased and are newly initialized #19

Below is the error