Open ravirahman opened 8 years ago
Thanks Ravi. Would love it if you or someone else could post more up-to-date scrapers. A lot of people rely on these, so it'd be nice if we could help more people out with up-to-date scrapers rather than keeping them all to ourselves.
Thanks!
It appears that BNCollege, and potentially others, use a JavaScript redirect to prevent scraping. Here is a workaround (on Android) using a WebView:
WebView view;
Then in, for example, onCreate:
view = new WebView(getApplicationContext());
view.getSettings().setJavaScriptEnabled(true);
view.getSettings().setUserAgentString("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2227.1 Safari/537.36");
view.getSettings().setLoadsImagesAutomatically(true);
CookieManager.getInstance().setAcceptCookie(true);
view.loadUrl("http://milton.bncollege.com/webapp/wcs/stores/servlet/TBWizardView?catalogId=10001&langId=-1&storeId=82238");
CookieManager.getInstance().setAcceptCookie(true);
CookieManager.getInstance().setAcceptThirdPartyCookies(view,true);
start();
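For anyone adapting this outside Android, the "poll until the URL changes" idea can be sketched in plain Java. This is only an illustration, not the code from this thread: `RedirectPoller`, `waitForRedirect`, and the `currentUrl` supplier are hypothetical names, and the supplier stands in for something like `view.getUrl()`. On Android, WebView methods have to be called on the UI thread, so a real supplier would need to hop there first.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper: polls a URL source until it differs from the initial URL.
final class RedirectPoller {

    // Returns the first URL that differs from initialUrl,
    // or null if timeoutMs elapses (or the wait is interrupted) first.
    static String waitForRedirect(Supplier<String> currentUrl, String initialUrl,
                                  long periodMs, long timeoutMs) {
        final CountDownLatch done = new CountDownLatch(1);
        final String[] result = new String[1];
        Timer timer = new Timer(true); // daemon thread, will not block JVM exit

        timer.scheduleAtFixedRate(new TimerTask() {
            @Override
            public void run() {
                String url = currentUrl.get();
                if (url != null && !url.equals(initialUrl)) {
                    result[0] = url;   // written before countDown(), so visible after await()
                    done.countDown();
                    cancel();          // stop further polling
                }
            }
        }, 0, periodMs);

        try {
            boolean finished = done.await(timeoutMs, TimeUnit.MILLISECONDS);
            return finished ? result[0] : null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return null;
        } finally {
            timer.cancel();
        }
    }
}
```

The timeout is there so a page that never redirects does not leave the caller waiting forever; the polling period and timeout values are left to the caller.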
Then a timer to check when the JavaScript redirect is complete: ` private Timer timer; private TimerTask timerTask = new TimerTask() {