v3ucn / Modelscope_Faster_Whisper_Multi_Subtitle

Generate bilingual subtitles with one click using Faster-whisper and modelscope. A bilingual subtitle generator built on offline large models.
MIT License
204 stars 23 forks

Found a few issues #2

Open hyhuc0079 opened 8 months ago

hyhuc0079 commented 8 months ago

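1. Even when "extract audio" is not selected, two audio files, audio and output, still seem to be generated in the project directory.
2. Could the model files get a config field for setting their path? The current version can only point the cache directory at a local location; ideally the model path itself would be configurable.
3. It would be good if the source file input supported batches or directories.
4. Adding the subtitles to the original video runs on the CPU. It is fairly fast, but wouldn't the GPU be better?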
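Requests 2 and 3 above could be sketched roughly as follows. This is only an illustration, not the project's code: the `SUBTITLE_MODEL_DIR` environment variable, the function names, and the extension list are all hypothetical; the `./models_from_modelscope` default is taken from the log output in this issue.

```python
import os
import tempfile
from pathlib import Path

# Assumed set of extensions; adjust to whatever the app actually accepts.
MEDIA_EXTENSIONS = {".mp4", ".mkv", ".mov", ".mp3", ".wav"}


def resolve_model_dir(default="./models_from_modelscope"):
    """Return the model directory, preferring a (hypothetical) env var."""
    return Path(os.environ.get("SUBTITLE_MODEL_DIR", default))


def collect_media_files(source):
    """Accept a single file or a directory; return a sorted list of media files."""
    path = Path(source)
    if path.is_file():
        return [path]
    return sorted(
        p for p in path.rglob("*")
        if p.is_file() and p.suffix.lower() in MEDIA_EXTENSIONS
    )


# Demo on a throwaway directory with two media files and one unrelated file.
demo_dir = Path(tempfile.mkdtemp())
for name in ("b.mp4", "a.MKV", "notes.txt"):
    (demo_dir / name).touch()

found = [p.name for p in collect_media_files(demo_dir)]
print(found)
```

Each returned path could then be passed through the existing single-file workflow in a loop.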

5. Translation fails with an error:

2024-01-29 17:49:00,706 - modelscope - INFO - initiate model from ./models_from_modelscope\damo\nlp_csanmt_translation_en2zh
2024-01-29 17:49:00,706 - modelscope - INFO - initiate model from location ./models_from_modelscope\damo\nlp_csanmt_translation_en2zh.
2024-01-29 17:49:00,707 - modelscope - INFO - initialize model from ./models_from_modelscope\damo\nlp_csanmt_translation_en2zh
{'hidden_size': 1024, 'filter_size': 4096, 'num_heads': 16, 'num_encoder_layers': 24, 'num_decoder_layers': 6, 'attention_dropout': 0, 'residual_dropout': 0, 'relu_dropout': 0, 'layer_preproc': 'layer_norm', 'layer_postproc': 'none', 'shared_embedding_and_softmax_weights': True, 'shared_source_target_embedding': True, 'initializer_scale': 0.1, 'position_info_type': 'absolute', 'max_relative_dis': 16, 'num_semantic_encoder_layers': 4, 'src_vocab_size': 50000, 'trg_vocab_size': 50000, 'seed': 1234, 'beam_size': 4, 'lp_rate': 0.6, 'max_decoded_trg_len': 100, 'device_map': None, 'device': 'cuda'}
2024-01-29 17:49:00,711 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2024-01-29 17:49:00,711 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'src_lang': 'en', 'tgt_lang': 'zh', 'src_bpe': {'file': 'bpe.en'}, 'model_dir': './models_from_modelscope\damo\nlp_csanmt_translation_en2zh'}. trying to build by task and model information.
2024-01-29 17:49:00,711 - modelscope - WARNING - No preprocessor key ('csanmt-translation', 'translation') found in PREPROCESSOR_MAP, skip building preprocessor.

Traceback (most recent call last):
  File "D:\ai2vedio\Modelscope_Faster_Whisper_Multi_Subtitle\venv\lib\site-packages\modelscope\utils\registry.py", line 212, in build_from_cfg
    return obj_cls(**args)
  File "D:\ai2vedio\Modelscope_Faster_Whisper_Multi_Subtitle\venv\lib\site-packages\modelscope\pipelines\nlp\translation_pipeline.py", line 54, in __init__
    self._src_vocab = dict([
  File "D:\ai2vedio\Modelscope_Faster_Whisper_Multi_Subtitle\venv\lib\site-packages\modelscope\pipelines\nlp\translation_pipeline.py", line 54, in <listcomp>
    self._src_vocab = dict([
UnicodeDecodeError: 'gbk' codec can't decode byte 0x9a in position 2053: illegal multibyte sequence

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\ai2vedio\Modelscope_Faster_Whisper_Multi_Subtitle\venv\lib\site-packages\gradio\routes.py", line 534, in predict
    output = await route_utils.call_process_api(
  File "D:\ai2vedio\Modelscope_Faster_Whisper_Multi_Subtitle\venv\lib\site-packages\gradio\route_utils.py", line 226, in call_process_api
    output = await app.get_blocks().process_api(
  File "D:\ai2vedio\Modelscope_Faster_Whisper_Multi_Subtitle\venv\lib\site-packages\gradio\blocks.py", line 1550, in process_api
    result = await self.call_function(
  File "D:\ai2vedio\Modelscope_Faster_Whisper_Multi_Subtitle\venv\lib\site-packages\gradio\blocks.py", line 1185, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "D:\ai2vedio\Modelscope_Faster_Whisper_Multi_Subtitle\venv\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "D:\ai2vedio\Modelscope_Faster_Whisper_Multi_Subtitle\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 2134, in run_sync_in_worker_thread
    return await future
  File "D:\ai2vedio\Modelscope_Faster_Whisper_Multi_Subtitle\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "D:\ai2vedio\Modelscope_Faster_Whisper_Multi_Subtitle\venv\lib\site-packages\gradio\utils.py", line 661, in wrapper
    response = f(*args, **kwargs)
  File "D:\ai2vedio\Modelscope_Faster_Whisper_Multi_Subtitle\v1\app.py", line 40, in do_trans_en2zh
    return make_tran()
  File "D:\ai2vedio\Modelscope_Faster_Whisper_Multi_Subtitle\v1\utils.py", line 45, in make_tran
    pipeline_ins = pipeline(task=Tasks.translation, model=model_dir_ins)
  File "D:\ai2vedio\Modelscope_Faster_Whisper_Multi_Subtitle\venv\lib\site-packages\modelscope\pipelines\builder.py", line 170, in pipeline
    return build_pipeline(cfg, task_name=task)
  File "D:\ai2vedio\Modelscope_Faster_Whisper_Multi_Subtitle\venv\lib\site-packages\modelscope\pipelines\builder.py", line 65, in build_pipeline
    return build_from_cfg(
  File "D:\ai2vedio\Modelscope_Faster_Whisper_Multi_Subtitle\venv\lib\site-packages\modelscope\utils\registry.py", line 215, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
TypeError: function takes exactly 5 arguments (1 given)

It looks like a character-encoding problem.
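For context, a `'gbk' codec can't decode` error usually means a UTF-8 file was opened in text mode without an explicit `encoding`, so Python fell back to the locale default (GBK on a Chinese-locale Windows install). The sketch below shows the general pattern of reading a one-token-per-line vocabulary file with an explicit encoding. `load_vocab` and the demo file are hypothetical and only illustrate the idea; this is not modelscope's actual code.

```python
import os
import tempfile


def load_vocab(path, encoding="utf-8"):
    """Map each token (one per line) to its line index.

    The explicit `encoding` is the point: without it, open() uses the
    locale's preferred encoding rather than UTF-8.
    """
    with open(path, encoding=encoding) as f:
        return {line.strip(): i for i, line in enumerate(f)}


# Demo with a throwaway UTF-8 file (hypothetical contents).
demo_dir = tempfile.mkdtemp()
vocab_path = os.path.join(demo_dir, "vocab.txt")
with open(vocab_path, "w", encoding="utf-8") as f:
    f.write("hello\n字幕\n翻译\n")

vocab = load_vocab(vocab_path)
print(vocab)
```

When the failing `open()` call lives in a third-party package, as the traceback suggests here, one workaround that may be worth trying is Python's UTF-8 mode (setting the environment variable `PYTHONUTF8=1` before launching, available since Python 3.7), which makes UTF-8 the default text encoding. Whether that resolves this particular report is not confirmed in the thread.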

v3ucn commented 8 months ago

OK. How do I reproduce the encoding problem?

hyhuc0079 commented 8 months ago

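With the git version, transcribing any English video and then translating it always raises this error. I assumed you were on Linux, where the default file encoding is different. Did you not get an error when you tried it?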

hyhuc0079 commented 8 months ago

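What puzzles me is that even after I manually re-encoded the srt you generated, it still fails. I also trimmed the subtitle file down to a single line to rule out special characters.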
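One possible reading of the traceback: the decode failure is raised in `translation_pipeline.py` while building `self._src_vocab`, which suggests the file being read is one of the model's own files, not the srt. That would explain why re-encoding the srt changed nothing. A small diagnostic like the one below can show which files fail to decode under a given encoding. The helper name and demo files are hypothetical, and `"ascii"` stands in for a non-UTF-8 default only so the demo is deterministic; in practice one would pass `locale.getpreferredencoding(False)`.

```python
import locale
import os
import tempfile


def find_undecodable(paths, encoding):
    """Return the subset of `paths` whose bytes do not decode under `encoding`."""
    bad = []
    for path in paths:
        with open(path, "rb") as f:
            data = f.read()
        try:
            data.decode(encoding)
        except UnicodeDecodeError:
            bad.append(path)
    return bad


# The encoding open() would use by default on this machine.
print(locale.getpreferredencoding(False))

# Demo: an ASCII-only subtitle file and a UTF-8 file with non-ASCII tokens.
check_dir = tempfile.mkdtemp()
srt_path = os.path.join(check_dir, "demo.srt")
tokens_path = os.path.join(check_dir, "tokens.txt")
with open(srt_path, "w", encoding="utf-8") as f:
    f.write("1\n00:00:00,000 --> 00:00:01,000\nhello\n")
with open(tokens_path, "w", encoding="utf-8") as f:
    f.write("字幕\n")

bad_files = find_undecodable([srt_path, tokens_path], "ascii")
print(bad_files)
```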

v3ucn commented 8 months ago

OK, thanks. I'll give it a try.

v3ucn commented 8 months ago

I've pushed new code.