zhegexiaohuozi / SeimiCrawler

A simple, agile, distributed Java crawler framework with SpringBoot support.
http://seimicrawler.org
Apache License 2.0
1.98k stars 681 forks

Doesn't scraping Toutiao this way work? #54

Closed. LyingDragons closed this issue 4 years ago

LyingDragons commented 4 years ago

image

zhegexiaohuozi commented 4 years ago

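Extracting the URLs is certainly not a problem, but you need to consider what the raw HTML you actually receive looks like. That is the important part.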

LyingDragons commented 4 years ago

Right, Toutiao's pages are rendered by JS.
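
One way to check the point raised in this thread is to inspect the raw, unrendered HTML before writing any extraction rules. The sketch below is only an illustration: the class `RawHtmlCheck`, the method `countLinks`, and both sample pages are hypothetical and are not part of the SeimiCrawler API. It counts the `<a href=...>` tags in an HTML string; a count of zero on a page that shows links in the browser suggests those links are injected by JavaScript after load.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RawHtmlCheck {
    // Matches opening <a ...> tags that carry an href attribute.
    private static final Pattern LINK =
            Pattern.compile("<a\\s[^>]*href\\s*=", Pattern.CASE_INSENSITIVE);

    /** Counts the <a href=...> tags present in the raw, unrendered HTML. */
    public static int countLinks(String rawHtml) {
        Matcher m = LINK.matcher(rawHtml);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Hypothetical server-rendered page: the links are already in the HTML.
        String staticPage = "<html><body><a href=\"/a/1\">one</a>"
                + "<a href=\"/a/2\">two</a></body></html>";
        // Hypothetical JS-rendered page: an empty mount point plus a script tag.
        String jsPage = "<html><body><div id=\"root\"></div>"
                + "<script src=\"/app.js\"></script></body></html>";

        System.out.println("static page links: " + countLinks(staticPage));
        System.out.println("js page links: " + countLinks(jsPage));
    }
}
```

In practice the string passed to `countLinks` would be the response body the crawler received, not the DOM shown in the browser's developer tools, since the latter already reflects script execution.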