Gerapy GerapyAutoExtractor issues

Gerapy / GerapyAutoExtractor

Auto Extractor Module

https://pypi.org/project/gerapy-auto-extractor/

Apache License 2.0

321 stars 79 forks

issues

Extract attachment

#27 mic1on closed 3 months ago
1
Detail-page detection could be based on main-content recognition

#26 oldsiks opened 4 months ago
0
Computing title similarity

#25 oldsiks opened 2 years ago
2
Bug of Gerapy Auto Extractor about similarity2

#24 wf4867612 opened 2 years ago
2
Optimize the text character-count algorithm to handle English paragraphs

#23 yjshi2015 closed 1 year ago
1
English paragraphs in a Chinese detail page reduce recognition accuracy

#22 yjshi2015 opened 2 years ago
0
numpy version issue

#21 Smawexi opened 2 years ago
1
numpy version issue

#20 Smawexi closed 2 years ago
0
Only one paragraph can be extracted from a detail page

#19 Germey opened 2 years ago
1
Bump lxml from 4.6.3 to 4.6.5

#18 dependabot[bot] closed 2 years ago
0
Bug of Gerapy Auto Extractor: problem during installation

#17 dota-player opened 3 years ago
3
Error raised: AttributeError: 'backports.zoneinfo.ZoneInfo' object has no attribute 'localize'

#16 gclsoft opened 3 years ago
0
Can pages like https://www.econ.sdu.edu.cn/zxzx/tzgg.htm, which include category links, be extracted automatically?

#15 ieliwb opened 3 years ago
1
Bump lxml from 4.6.2 to 4.6.3

#14 dependabot[bot] closed 3 years ago
0
Maintainers, please keep the project updated

#13 ieliwb opened 3 years ago
2
Bump lxml from 4.3.3 to 4.6.2

#12 dependabot[bot] closed 3 years ago
0
Bug in the function preprocess4content_extractor

#11 zhutuo opened 3 years ago
0
can't remove element

#10 zhutuo opened 3 years ago
0
Problem with the parsing result

#9 Germey opened 4 years ago
0
Suggestion: add an option to pass in an XPath to narrow the extraction scope

#8 JerryChenn07 opened 4 years ago
3
max() arg is an empty sequence

#7 perrornet closed 4 years ago
1
Extractor of author

#6 Germey opened 4 years ago
0
Suggestion for crawling paginated pages

#5 zheyuan2025 opened 4 years ago
0
Bug of Gerapy Auto Extractor: error when crawling forum posts

#4 bowu678 opened 4 years ago
1
How to distinguish whether the page is a list page or a detail page?

#3 shuguang101 closed 4 years ago
2
Wrong extract example

#2 Germey closed 4 years ago
2
Update README.md

#1 Insutanto closed 4 years ago
1