macbert4csc Search Results

shibing624/pycorrector #438

Model or logic used by the Official demo

### Describe the Question Which model, or what logic, does the online Official demo use? Thanks.

tianyongliu updated 10 months ago

shibing624/pycorrector #374

How to convert a trained macbert model to an ONNX model

if ckpt_path and os.path.exists(ckpt_path): model.load_state_dict(torch.load(ckpt_path)['state_dict']) # First save the original transformer bert model tokenizer.save_pretrained(cfg.OUTPUT…

lazywangyuan updated 10 months ago

shibing624/pycorrector #362

Does pycorrector support incremental training?

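I'd like to ask whether pycorrector supports incremental training. I read the README and it doesn't seem to cover this. There are a lot of issues; I skimmed through them and didn't find a similar question, so I'm opening a new one. Our use case is error correction after ASR, which means new business data keeps being produced during use, so the model has to be updated continuously. The current problem is that with 280,000 samples, a single training run takes nearly 50 hours, …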

123109 updated 9 months ago

shibing624/pycorrector #416

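Hello, I followed the README and retrained macbert on the provided data to check whether the model can reach similar accuracy. At around 71% of the first epoch it runs out of GPU memory: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 672.00 MiB (GPU 0; 12.00 GiB total capacity; 9.67 GiB already allo…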

qni2 updated 1 year ago

shibing624/pycorrector #408

Error when running the training script: `validation_epoch_end` has been removed in v2.0.0…

### Describe the Question Running pycorrector/macbert/train.py raises an error. The pytorch-lightning version is 2.0.4. I don't want to downgrade pytorch-lightning; how can I solve this? Thanks! The full error message is as follows: ``` 2023-07-21 18:27:55.745 | INFO | __main__:main:…

vigorous2008 updated 1 year ago

migraphx-benchmark/AMDMIGraphX #162

Check top huggingface models

There are a bunch of models on huggingface that would be good to test to see whether they compile and are accurate. The most downloaded onnx models would be a good start: https://huggingface.co/models?library=onnx&so…

attila-dusnoki-htec updated 6 months ago

shibing624/pycorrector #341

How to continue training macbert from the author's trained model pytorch_model.bin

The model I trained from scratch performs poorly. How can I continue training from the already-trained model the author provides?

hongge778 updated 1 year ago

shibing624/pycorrector #295

Three small suggestions for rule-based correction with kenlm

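1. The source code only changes the first occurrence of a confusion-set entry or proper noun in a sentence, because sentence.find() returns only the index of the first occurrence. I hope it can be changed to handle every occurrence. 2. The source code only supports replacements of equal length; after a replacement of unequal length, later replacements become misaligned. The reason is that once an unequal-length replacement is made, the sentence has become the replaced sentence, so the candidate error indices found earlier by detect have shifted, and if later replacements still follow the earlier indices…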

wangdabee updated 1 year ago
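
The two points in the snippet above (only the first match is found, and unequal-length replacements shift later indices) can be illustrated with a minimal sketch. It is not pycorrector's code: `find_all` and `apply_replacements` are hypothetical helper names, and the confusion set is assumed to be a plain dict mapping wrong text to corrected text. The sketch collects every occurrence with repeated `str.find` calls, then applies the edits from right to left so that an edit never moves the indices of edits still pending to its left.

```python
def find_all(sentence: str, term: str) -> list[int]:
    """Return the start index of every non-overlapping occurrence of term."""
    if not term:
        return []  # an empty term would otherwise match forever
    positions = []
    start = sentence.find(term)
    while start != -1:
        positions.append(start)
        # Resume searching after the match just found.
        start = sentence.find(term, start + len(term))
    return positions


def apply_replacements(sentence: str, confusion: dict[str, str]) -> str:
    """Replace every occurrence of each key in confusion (hypothetical format)."""
    edits = []
    for wrong, right in confusion.items():
        for begin in find_all(sentence, wrong):
            edits.append((begin, begin + len(wrong), right))

    # Apply right to left: changing the tail of the string leaves the
    # indices of all edits further left untouched, even when len(right)
    # differs from len(wrong).
    edits.sort(key=lambda edit: edit[0], reverse=True)
    limit = len(sentence)
    for begin, end, right in edits:
        if end > limit:
            continue  # overlaps an edit already applied to its right; skip it
        sentence = sentence[:begin] + right + sentence[end:]
        limit = begin
    return sentence


if __name__ == "__main__":
    print(find_all("abcabcab", "ab"))
    print(apply_replacements("abcabcab", {"ab": "xyz"}))
```

Collecting all edits first and applying them in descending index order is one simple way to avoid recomputing offsets; keeping a running offset while going left to right would work as well.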

shibing624/pycorrector #358

Chinese text correction for ASR

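I've recently been working on Chinese text correction for ASR and have roughly 300,000 samples. I tried various open-source methods, none with good results, and would like to ask the author some questions. 1. The ASR text to be corrected is fairly colloquial: some of it contains repeated characters and many filler words such as 【额、呃、呢、啊、哎、哦】, and some sentences are incomplete 【possibly because of the recognition model, or because of the speaker】. I can't fix these myself, because that is how the speaker actually spoke. I tried macbert and T5 and neither worked well; the number of false corrections is far greater than the number of correct…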

wnntju updated 1 year ago

shibing624/pycorrector #350

Error during macbert pretraining

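When training on my own dataset I ran into the following problem: "ValueError: Expected input batch_size (7008) to match target batch_size (6976)." The format of my training dataset is shown below; as far as I can tell it does not differ from the example in the README, so I'd like to ask where I should start in fixing this. Thank you very much! ` { "id": "5985", …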

F-crystal updated 1 year ago

42 results for macbert4csc