Gerapy
/
GerapyAutoExtractor
Auto Extractor Module
https://pypi.org/project/gerapy-auto-extractor/
Apache License 2.0
321
stars
79
forks
issues
Extract attachment
#27
mic1on
closed
3 months ago
1
Detail-page detection could be based on recognizing the main body content
#26
oldsiks
opened
4 months ago
0
Computing title similarity
#25
oldsiks
opened
2 years ago
2
Bug of Gerapy Auto Extractor about similarity2
#24
wf4867612
opened
2 years ago
2
Optimize the text character-count algorithm to handle English paragraphs
#23
yjshi2015
closed
1 year ago
1
English paragraphs in a Chinese detail page reduce recognition accuracy
#22
yjshi2015
opened
2 years ago
0
numpy version issue
#21
Smawexi
opened
2 years ago
1
numpy version issue
#20
Smawexi
closed
2 years ago
0
Only one paragraph is extracted from the detail page
#19
Germey
opened
2 years ago
1
Bump lxml from 4.6.3 to 4.6.5
#18
dependabot[bot]
closed
2 years ago
0
Bug of Gerapy Auto Extractor: problem during installation
#17
dota-player
opened
3 years ago
3
Error raised: AttributeError: 'backports.zoneinfo.ZoneInfo' object has no attribute 'localize'
#16
gclsoft
opened
3 years ago
0
Can pages like https://www.econ.sdu.edu.cn/zxzx/tzgg.htm, which include category links, be extracted automatically?
#15
ieliwb
opened
3 years ago
1
Bump lxml from 4.6.2 to 4.6.3
#14
dependabot[bot]
closed
3 years ago
0
Maintainer, please keep the updates coming
#13
ieliwb
opened
3 years ago
2
Bump lxml from 4.3.3 to 4.6.2
#12
dependabot[bot]
closed
3 years ago
0
Bug in the function preprocess4content_extractor
#11
zhutuo
opened
3 years ago
0
can't remove element
#10
zhutuo
opened
3 years ago
0
Problem with the parsing results
#9
Germey
opened
4 years ago
0
Suggestion: add an option to pass in an XPath to narrow the extraction scope
#8
JerryChenn07
opened
4 years ago
3
max() arg is an empty sequence
#7
perrornet
closed
4 years ago
1
Extractor of author
#6
Germey
opened
4 years ago
0
Suggestions for crawling paginated pages
#5
zheyuan2025
opened
4 years ago
0
Bug of Gerapy Auto Extractor: error when crawling forum posts
#4
bowu678
opened
4 years ago
1
How to distinguish whether the page is a list page or a detail page?
#3
shuguang101
closed
4 years ago
2
Wrong extract example
#2
Germey
closed
4 years ago
2
Update README.md
#1
Insutanto
closed
4 years ago
1