talexu / HTDAIS

Hot Topic Data Analysis and Identification System

Demo of cx-extractor #2

Closed talexu closed 10 years ago

talexu commented 10 years ago

Use cx-extractor to extract the main body text from HTML.

talexu commented 10 years ago

Understand the algorithm and optimize it: preserve line breaks and extract images.
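
For reference, here is a minimal Python sketch of the general idea as I understand it: strip the tags, measure how much text each small block of consecutive lines holds, and keep the densest contiguous region. It is not cx-extractor's actual code. The function name `extract_main_text`, the `block_width` and `threshold` parameters, the `[img: ...]` placeholder format, and the sample page are all assumptions made up for illustration. The sketch also shows one possible way to cover the two goals above: block-level closing tags become newlines so paragraph breaks survive, and `<img>` tags are turned into placeholders before the remaining tags are removed.

```python
import re


def extract_main_text(html, block_width=3, threshold=40):
    """Simplified line-block extraction (hypothetical helper, not cx-extractor's API)."""
    # Scripts, styles and comments never contain body text, so drop them first.
    html = re.sub(r"(?is)<(script|style)\b.*?</\1\s*>", "", html)
    html = re.sub(r"(?s)<!--.*?-->", "", html)
    # Keep each image as a text placeholder on its own line (assumed format).
    html = re.sub(r"""(?is)<img\b[^>]*?\bsrc\s*=\s*["']([^"']+)["'][^>]*>""",
                  r"\n[img: \1]\n", html)
    # Turn <br> and block-level closing tags into newlines to preserve line breaks.
    html = re.sub(r"(?i)<br\s*/?>|</(?:p|div|h[1-6]|li|tr)\s*>", "\n", html)
    # Remove every remaining tag, leaving plain text.
    text = re.sub(r"(?s)<[^>]*>", "", html)

    lines = [ln.strip() for ln in text.split("\n")]
    n = len(lines)
    # Block length at i = characters in lines i .. i + block_width - 1.
    block_len = [sum(len(ln) for ln in lines[i:i + block_width]) for i in range(n)]

    best_start, best_end, best_score = 0, 0, 0
    i = 0
    while i < n:
        if block_len[i] >= threshold:
            # A dense block starts a candidate region; it ends at the first empty block.
            j = i
            while j < n and block_len[j] > 0:
                j += 1
            score = sum(len(ln) for ln in lines[i:j])
            if score > best_score:
                best_start, best_end, best_score = i, j, score
            i = j
        else:
            i += 1

    # Join the non-empty lines of the best region, one paragraph per line.
    return "\n".join(ln for ln in lines[best_start:best_end] if ln)


# Made-up sample page: navigation, three paragraphs with an image, and a footer.
SAMPLE_HTML = """<html><head><title>T</title><style>body{}</style></head>
<body>
<div id="nav"><a href="/">Home</a></div>



<div id="content">
<p>First paragraph of the article, long enough to count as dense body text.</p>
<p>Second paragraph, also fairly long, continues the main article content here.</p>
<img src="pic.png" alt="x">
<p>Third paragraph closes the article with some more words for good measure.</p>
</div>



<div id="footer">Copyright</div>
</body></html>"""

result = extract_main_text(SAMPLE_HTML)
print(result)
```

On the sample page this keeps the three paragraphs and the image placeholder, one per line, and drops the navigation link and the footer. The threshold and block width would need tuning on real pages, and a regex-based tag stripper is only a stand-in for proper HTML parsing.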