Open whqwill opened 5 years ago
Hi @whqwill, could you please describe what you need in more detail? Thanks :)
I mean how it selects the important parts as the 'main text' and, if possible, how it compares with other methods. @Ask149
Not exactly for this newspaper lib, but the slides at this link give a very useful overview of the problem: Boilerplate Detection using Shallow Text Features http://www.l3s.de/%7Ekohlschuetter/boilerplate/
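To give a rough feel for what "shallow text features" means, here is a minimal sketch. It is not newspaper's algorithm and not the classifier from the linked work, just a simplified illustration: each text block is scored by its word count and its link density (the share of words inside links), and blocks that are long and mostly link-free are kept. The `Block` class, the `is_main_text` helper, and both thresholds are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical container for one text block of a page."""
    text: str         # all visible text in the block
    anchor_text: str  # the part of the text that sits inside <a> tags


def word_count(s: str) -> int:
    """Count whitespace-separated tokens."""
    return len(s.split())


def link_density(block: Block) -> float:
    """Fraction of the block's words that are link text (0.0 for empty blocks)."""
    total = word_count(block.text)
    if total == 0:
        return 0.0
    return word_count(block.anchor_text) / total


def is_main_text(block: Block, min_words: int = 15,
                 max_link_density: float = 0.33) -> bool:
    """Keep blocks that are long enough and not dominated by links.

    Both thresholds are illustrative guesses, not tuned values.
    """
    return (word_count(block.text) >= min_words
            and link_density(block) <= max_link_density)


blocks = [
    # Navigation bar: short, every word is a link.
    Block("Home News Sports Weather Contact",
          "Home News Sports Weather Contact"),
    # Article paragraph: long, no links.
    Block("The city council voted on Tuesday to approve a new budget that "
          "increases funding for public transit and road repairs over the "
          "next two years.", ""),
    # Share widget: short, partly links.
    Block("Share this article on Twitter or Facebook", "Twitter Facebook"),
]

main_text = [b.text for b in blocks if is_main_text(b)]
print(main_text)
```

With these sample blocks only the article paragraph passes both checks. A real extractor would also have to split the HTML into blocks first and would typically look at neighbouring blocks as well, which this sketch leaves out.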
Oh, it is helpful for me. Thanks.
Haiqing
On Tue, Jan 22, 2019 at 11:28 AM, bact wrote the comment above (https://github.com/codelucas/newspaper/issues/665#issuecomment-456259207).
Is there any paper or algorithm description for the text extraction? I would like to understand the theory behind it. Thanks.