hiDaDeng / hidadeng.github.io

Da Deng's personal blog. The blog domain is listed below; access may be a bit slow.
https://textdata.cn/

blog/cntext_tutorial/ #5

Open utterances-bot opened 2 years ago

utterances-bot commented 2 years ago

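cntext library | Python text analysis package update | 大邓和他的PYTHON

Dictionary expansion, sentiment analysis, and readability, with 9 built-in sentiment dictionaries covering Chinese and English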

https://hidadeng.github.io/blog/cntext_tutorial/

swunicx commented 2 years ago

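Professor Deng, something looks wrong here: text1 and text2 are exactly the same. I tried removing the comma from text2, and that does change the readability scores.

“Changing the punctuation in a sentence affects the result

text2 = '如何看待一网文作者被黑客大佬盗号改文,因万分惭愧而停更。'

ct.readability(text2, lang='chinese')

Run

{'readability1': 27.0, 'readability2': 0.16666666666666666, 'readability3': 13.583333333333334}”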

hiDaDeng commented 2 years ago

It is normal for the readability scores to differ depending on whether the punctuation is present.
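A minimal sketch of why punctuation can matter, using only the sentence quoted in this thread. It counts 28 characters with the comma and 27 without it. Those counts happen to match the readability1 values 28.0 and 27.0 reported here, but reading readability1 as a character count is an assumption drawn from these numbers, not something this thread confirms about cntext.

```python
# Example sentence from the thread, with its comma.
with_comma = '如何看待一网文作者被黑客大佬盗号改文,因万分惭愧而停更。'

# Remove the comma, whether it is the half-width or the full-width form.
without_comma = with_comma.replace(',', '').replace(',', '')

print(len(with_comma))     # 28
print(len(without_comma))  # 27
```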

swunicx commented 2 years ago

Er, what I mean is that text1 and text2 in your post are the same, yet the readability output differs. Please check whether I am just misreading it:

text1 = '如何看待一网文作者被黑客大佬盗号改文,因万分惭愧而停更。'

ct.readability(text1, lang='chinese') Run

{'readability1': 28.0, 'readability2': 0.15789473684210525, 'readability3': 14.078947368421053}

Changing the punctuation in a sentence affects the result

text2 = '如何看待一网文作者被黑客大佬盗号改文,因万分惭愧而停更。'

ct.readability(text2, lang='chinese')

Run

{'readability1': 27.0, 'readability2': 0.16666666666666666, 'readability3': 13.583333333333334}
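Two strings that look identical on screen can still differ by a single code point, for example a half-width versus a full-width comma. The sketch below uses only the standard library to locate the first mismatch; `first_difference` and `describe` are hypothetical helper names, and the `text2` value with a full-width comma is made up for illustration, not taken from the blog post.

```python
import unicodedata


def first_difference(a: str, b: str):
    """Return (index, char_a, char_b) for the first mismatch, or None if equal."""
    for i, (ca, cb) in enumerate(zip(a, b)):
        if ca != cb:
            return i, ca, cb
    if len(a) != len(b):
        # One string is a prefix of the other; the missing side is ''.
        i = min(len(a), len(b))
        return i, a[i:i + 1], b[i:i + 1]
    return None


def describe(ch: str) -> str:
    """Show the code point and Unicode name of a single character."""
    if not ch:
        return "<end of string>"
    return f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}"


text1 = '如何看待一网文作者被黑客大佬盗号改文,因万分惭愧而停更。'
# Hypothetical variant: same sentence with a full-width comma (U+FF0C).
text2 = text1.replace(',', '\uff0c')

diff = first_difference(text1, text2)
if diff is None:
    print("identical")
else:
    index, ca, cb = diff
    print(index, describe(ca), "vs", describe(cb))
```

If this prints "identical" for the two strings copied from a post, the strings really are the same and any difference in output must come from somewhere else, such as the library version.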

hiDaDeng commented 2 years ago

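I just copied the text1 example sentence to a new location and renamed the variable to text2, and the results came out the same. Could it be a version issue? I am using cntext version 1.7.0.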

import cntext

cntext.__version__

Run

1.7.0
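When comparing results across machines, the installed version can also be read without importing the package. This is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+); `installed_version` is a hypothetical helper name, and it returns None when the package is not installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the cntext version if it is installed, otherwise None.
print(installed_version("cntext"))
```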

Code 1

import cntext as ct

text1 = '如何看待一网文作者被黑客大佬盗号改文,因万分惭愧而停更。'

ct.readability(text1, lang='chinese')

Run

{'readability1': 28.0,
 'readability2': 0.15789473684210525,
 'readability3': 14.078947368421053}


Code 2

import cntext as ct

text2 = '如何看待一网文作者被黑客大佬盗号改文,因万分惭愧而停更。'

ct.readability(text2, lang='chinese')

Run

{'readability1': 28.0,
 'readability2': 0.15789473684210525,
 'readability3': 14.078947368421053}
hiDaDeng commented 2 years ago

https://github.com/hiDaDeng/hidadeng.github.io/issues/5#issuecomment-1143283029

Screenshot of the code from my previous reply running: https://hidadeng.github.io/blog/cntext_tutorial/img/readability.png

swunicx commented 2 years ago

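Exactly. When text1 is copied and pasted into text2, the readability results should be the same, for example readability3 should be 14.078947368421053 for both. But in your blog post (https://hidadeng.github.io/blog/cntext_tutorial/), the identical text1 and text2 show different readability values: readability3 is 14.078947368421053 for one and 13.583333333333334 for the other. Below is a screenshot of your blog post.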
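As a sanity check on the two outputs quoted in this thread, the sketch below tests whether readability3 equals the mean of readability1 and readability2. Both quoted results satisfy that relation. This is an observation about the reported numbers only; it is not taken from the cntext source or documentation, and `mean_of_first_two` is a hypothetical helper name.

```python
import math

# The two result dictionaries exactly as quoted in the thread.
reported = [
    {'readability1': 28.0, 'readability2': 0.15789473684210525,
     'readability3': 14.078947368421053},
    {'readability1': 27.0, 'readability2': 0.16666666666666666,
     'readability3': 13.583333333333334},
]


def mean_of_first_two(result: dict) -> float:
    """Average of readability1 and readability2."""
    return 0.5 * (result['readability1'] + result['readability2'])


for r in reported:
    # Compare with a tolerance rather than exact float equality.
    print(math.isclose(mean_of_first_two(r), r['readability3']))
```

Under that reading, a one-character change in the sentence (28 versus 27) is enough to move readability3 from about 14.08 to about 13.58, which is the size of the gap between the two outputs in the post.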

hiDaDeng commented 2 years ago

Following your issue, I reran everything yesterday, and text1 and text2 produce the same results. As for the blog post, I will fix it when I have time.
