0nutation / SpeechGPT

SpeechGPT Series: Speech Large Language Models
https://0nutation.github.io/SpeechGPT.github.io/
Apache License 2.0

Mandarin support? #8

Open zdj97 opened 1 year ago

zdj97 commented 1 year ago

Does this model support Mandarin?

0nutation commented 1 year ago

The current SpeechGPT models are trained exclusively on English speech, so they do not support Mandarin. We plan to address this in future updates.

zdj97 commented 1 year ago

Thanks for your reply. By the way, I tried this script on a Mac M1, but it does not run successfully.

```
python speechgpt/src/infer/web_infer.py \
    --model-name-or-path "models/speechgpt-7B-cm" \
    --lora-weights "models/speechgpt-7B-com" \
    --s2u-dir "utils/speech2unit" \
    --vocoder-dir "utils/vocoder/" \
    --output-dir "output/"

NOTE: Redirects are currently not supported in Windows or MacOs.
2023-09-20 10:41:16 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to:
https://github.com/TimDettmers/bitsandbytes/issues

CUDA SETUP: Required library version not found: libsbitsandbytes_cpu.so. Maybe you need to compile it from source?
CUDA SETUP: Defaulting to libbitsandbytes_cpu.so...
dlopen(/Users/zmj1997/miniforge3/envs/speechgpt/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cpu.so, 0x0006): tried: '/Users/zmj1997/miniforge3/envs/speechgpt/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cpu.so' (not a mach-o file), '/System/Volumes/Preboot/Cryptexes/OS/Users/zmj1997/miniforge3/envs/speechgpt/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cpu.so' (no such file), '/Users/zmj1997/miniforge3/envs/speechgpt/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cpu.so' (not a mach-o file)
CUDA SETUP: Required library version not found: libsbitsandbytes_cpu.so. Maybe you need to compile it from source?
CUDA SETUP: Defaulting to libbitsandbytes_cpu.so...
dlopen(/Users/zmj1997/miniforge3/envs/speechgpt/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cpu.so, 0x0006): tried: '/Users/zmj1997/miniforge3/envs/speechgpt/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cpu.so' (not a mach-o file), '/System/Volumes/Preboot/Cryptexes/OS/Users/zmj1997/miniforge3/envs/speechgpt/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cpu.so' (no such file), '/Users/zmj1997/miniforge3/envs/speechgpt/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cpu.so' (not a mach-o file)
/Users/zmj1997/miniforge3/envs/speechgpt/lib/python3.8/site-packages/bitsandbytes/cextension.py:31: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "
utils/speech2unit/mhubert_base_vp_en_es_fr_it3.pt
utils/speech2unit/mhubert_base_vp_en_es_fr_it3_L11_km1000.bin
2023-09-20 10:41:17 | INFO | fairseq.tasks.hubert_pretraining | current directory is x
2023-09-20 10:41:17 | INFO | fairseq.tasks.hubert_pretraining | HubertPretrainingTask Config {'_name': 'hubert_pretraining', 'data': '/checkpoint/annl/s2st/data/voxpopuli/mHuBERT/en_es_fr', 'fine_tuning': False, 'labels': ['km'], 'label_dir': '/checkpoint/wnhsu/experiments/hubert/kmeans/mhubert_vp_en_es_fr_it2_400k/en_es_fr.layer9.km500', 'label_rate': 50.0, 'sample_rate': 16000, 'normalize': False, 'enable_padding': False, 'max_keep_size': None, 'max_sample_size': 250000, 'min_sample_size': 32000, 'single_target': False, 'random_crop': True, 'pad_audio': False}
2023-09-20 10:41:17 | INFO | fairseq.models.hubert.hubert | HubertModel Config: {'_name': 'hubert', 'label_rate': 50.0, 'extractor_mode': default, 'encoder_layers': 12, 'encoder_embed_dim': 768, 'encoder_ffn_embed_dim': 3072, 'encoder_attention_heads': 12, 'activation_fn': gelu, 'layer_type': transformer, 'dropout': 0.1, 'attention_dropout': 0.1, 'activation_dropout': 0.0, 'encoder_layerdrop': 0.05, 'dropout_input': 0.1, 'dropout_features': 0.1, 'final_dim': 256, 'untie_final_proj': True, 'layer_norm_first': False, 'conv_feature_layers': '[(512,10,5)] + [(512,3,2)] * 4 + [(512,2,2)] * 2', 'conv_bias': False, 'logit_temp': 0.1, 'target_glu': False, 'feature_grad_mult': 0.1, 'mask_length': 10, 'mask_prob': 0.8, 'mask_selection': static, 'mask_other': 0.0, 'no_mask_overlap': False, 'mask_min_space': 1, 'mask_channel_length': 10, 'mask_channel_prob': 0.0, 'mask_channel_selection': static, 'mask_channel_other': 0.0, 'no_mask_channel_overlap': False, 'mask_channel_min_space': 1, 'conv_pos': 128, 'conv_pos_groups': 16, 'latent_temp': [2.0, 0.5, 0.999995], 'skip_masked': False, 'skip_nomask': False, 'checkpoint_activations': False, 'required_seq_len_multiple': 2, 'depthwise_conv_kernel_size': 31, 'attn_type': '', 'pos_enc_type': 'abs', 'fp16': False}
2023-09-20 10:41:19 | INFO | generate_pseudo_language | TASK CONFIG: {'_name': 'hubert_pretraining', 'data': '/checkpoint/annl/s2st/data/voxpopuli/mHuBERT/en_es_fr', 'fine_tuning': False, 'labels': ['km'], 'label_dir': '/checkpoint/wnhsu/experiments/hubert/kmeans/mhubert_vp_en_es_fr_it2_400k/en_es_fr.layer9.km500', 'label_rate': 50.0, 'sample_rate': 16000, 'normalize': False, 'enable_padding': False, 'max_keep_size': None, 'max_sample_size': 250000, 'min_sample_size': 32000, 'single_target': False, 'random_crop': True, 'pad_audio': False}
/Users/zmj1997/miniforge3/envs/speechgpt/lib/python3.8/site-packages/sklearn/base.py:347: InconsistentVersionWarning: Trying to unpickle estimator MiniBatchKMeans from version 0.24.2 when using version 1.3.0. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to: https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
  warnings.warn(
2023-09-20 10:43:08 | WARNING | accelerate.utils.modeling | The model weights are not tied. Please use the tie_weights method before using the infer_auto_device function.
Traceback (most recent call last):
  File "speechgpt/src/infer/web_infer.py", line 28, in <module>
    infer = SpeechGPTInference(
  File "/Users/zmj1997/Downloads/文档/AIGC/codes/SpeechGPT-main-fudan/speechgpt/src/infer/cli_infer.py", line 70, in __init__
    self.model = LlamaForCausalLM.from_pretrained(
  File "/Users/zmj1997/miniforge3/envs/speechgpt/lib/python3.8/site-packages/transformers/modeling_utils.py", line 3175, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/Users/zmj1997/miniforge3/envs/speechgpt/lib/python3.8/site-packages/transformers/modeling_utils.py", line 3296, in _load_pretrained_model
    raise ValueError(
ValueError: The current device_map had weights offloaded to the disk. Please provide an offload_folder for them. Alternatively, make sure you have safetensors installed if the model you are using offers the weights in this format.
```
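
For anyone hitting the same trace: the final ValueError is raised by transformers/accelerate because, with a device_map in use, some weights get offloaded to disk when the machine does not have enough memory, and no offload_folder was supplied. A minimal sketch of a possible workaround, assuming you edit the LlamaForCausalLM.from_pretrained call in cli_infer.py yourself (the "offload" directory name below is just an illustrative choice, not a path from the repo, and offload_folder requires a reasonably recent transformers/accelerate):

```python
# Hypothetical adjustment to the from_pretrained(...) call in cli_infer.py.
# Assumptions: transformers with accelerate installed; "offload" is an arbitrary
# scratch directory created next to the working dir, not something SpeechGPT ships.
import torch
from transformers import LlamaForCausalLM

model = LlamaForCausalLM.from_pretrained(
    "models/speechgpt-7B-cm",
    torch_dtype=torch.float16,   # halve memory so less (ideally nothing) is offloaded
    device_map="auto",           # let accelerate place layers on CPU / disk as needed
    offload_folder="offload",    # satisfies the ValueError when disk offload happens
    offload_state_dict=True,     # stage the state dict on disk while loading
)
```

The earlier bitsandbytes messages ("compiled without GPU support") appear to be expected on Apple Silicon, where bitsandbytes has no GPU build; as far as I can tell they are harmless as long as nothing requests 8-bit loading.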

huangxu1991 commented 9 months ago

Looking forward to it!