speed / newcrawler

Free Web Scraping Tool with Java
http://www.newcrawler.com

How does embedding the page in an iframe get around the cross-origin problem? #89

Open stormstone opened 5 years ago

stormstone commented 5 years ago

When the page to be crawled is embedded through an iframe, how do you add attributes to the elements inside that page and control them?

speed commented 5 years ago

Through a proxy, so the page becomes same-origin.
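
To illustrate why a proxy helps: browsers treat two URLs as same-origin only when scheme, host, and port all match. The sketch below is a minimal check of that rule; the function name `isSameOrigin` and the `localhost:8090` addresses are assumptions for illustration, not taken from newcrawler itself.

```typescript
// Two URLs are same-origin when their scheme, host, and port all match.
// URL.origin serializes exactly those three parts.
function isSameOrigin(a: string, b: string): boolean {
  return new URL(a).origin === new URL(b).origin;
}

// Hypothetical addresses, for illustration only.
const crawlerUi = "http://localhost:8090/spider/index.html"; // crawler front end
const direct = "http://www.baidu.com/";                      // target loaded directly
const proxied = "http://localhost:8090/";                    // target served through a proxy

console.log(isSameOrigin(crawlerUi, direct));  // false
console.log(isSameOrigin(crawlerUi, proxied)); // true
```

Once the target page is served from the same origin as the crawler front end, script in the parent page is allowed to reach into the iframe's document.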

stormstone commented 5 years ago

> Through a proxy, so the page becomes same-origin.

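I'm not very familiar with this area. Are there any learning resources you could recommend? Any pointers would be greatly appreciated!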

stormstone commented 5 years ago

I found a way to do it with an nginx reverse proxy. Thanks for the pointer!

    server {
        listen       8090;
        server_name  localhost;

        # Proxy for the crawler front-end pages
        location ^~ /spider {
            rewrite ^/spider/(.*)$ /$1 break;
            root html/spider-front;
        }

        # Proxy for the website to be crawled
        location / {
            add_header 'Access-Control-Allow-Origin' $http_origin;
            add_header 'Access-Control-Allow-Credentials' 'true';
            proxy_pass http://www.baidu.com;
        }
    }
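
With both the front end and the target page served from the same origin, the parent page can work on the iframe's document directly. The following is a rough sketch of adding an attribute to elements inside the embedded page. The helper `tagElements`, the two interfaces, and the `data-crawler-id` attribute name are all hypothetical and are not taken from newcrawler's source.

```typescript
// Minimal structural types so the helper accepts a real DOM Document
// or a stand-in object. These names are illustrative only.
interface TaggableElement {
  setAttribute(name: string, value: string): void;
}
interface QueryableDocument {
  querySelectorAll(selector: string): ArrayLike<TaggableElement>;
}

// Give every element matching `selector` a sequential marker attribute
// and return how many elements were tagged.
function tagElements(
  doc: QueryableDocument,
  selector: string,
  attr: string
): number {
  const matches = Array.from(doc.querySelectorAll(selector));
  matches.forEach((el, index) => el.setAttribute(attr, String(index)));
  return matches.length;
}
```

In a browser, this would be called from the iframe's `load` handler, for example `tagElements(frame.contentDocument, "a", "data-crawler-id")` after checking that `frame.contentDocument` is not `null`. For a cross-origin frame `contentDocument` is `null`, which is why the page has to go through the proxy first.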