xrma / crawler4j

Automatically exported from code.google.com/p/crawler4j

Wrong html downloaded #193

Closed: GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Set the domain allowed in shouldVisit to "http://www.moca.org".
2. In the controller, add the seed
"http://www.moca.org/museum/pc_browse_collection_by_artist.php?do=list".
3. Run the code.

What is the expected output? What do you see instead?
I expect the HTML stored in the Page object to match what I see when I
right-click the seed page in Chrome and choose View Source. Instead, the
Page object contains the HTML attached below.

What version of the product are you using?
3.3

Please provide any additional information below.

Original issue reported on code.google.com by ryandsou...@gmail.com on 9 Feb 2013 at 9:28

Attachments:

GoogleCodeExporter commented 9 years ago
This page detects that you're not a normal browser and redirects to
http://www.moca.org/org/moca/flash_check.php?do=list&pageBefore=%2Fmuseum%2Fpc_browse_collection_by_artist.php
The content of that redirect target is what you're seeing.

Original comment by ganjisaffar@gmail.com on 3 Mar 2013 at 3:46
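
The redirect URL quoted above carries the page it came from in its pageBefore query parameter, percent-encoded. A minimal sketch in plain Java (standard library only, no crawler4j calls; the class and method names are hypothetical) that decodes the parameter to confirm the redirect originates from the seed page:

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of crawler4j.
class RedirectTarget {

    // Splits the query string of a URL into decoded key/value pairs.
    static Map<String, String> queryParams(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        String rawQuery = URI.create(url).getRawQuery();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            String[] kv = pair.split("=", 2);
            String key = URLDecoder.decode(kv[0], StandardCharsets.UTF_8);
            String value = kv.length > 1
                    ? URLDecoder.decode(kv[1], StandardCharsets.UTF_8)
                    : "";
            params.put(key, value);
        }
        return params;
    }

    public static void main(String[] args) {
        // The redirect target reported in the comment above.
        String redirect = "http://www.moca.org/org/moca/flash_check.php"
                + "?do=list&pageBefore=%2Fmuseum%2Fpc_browse_collection_by_artist.php";
        // Prints the decoded path of the page that triggered the redirect.
        System.out.println(queryParams(redirect).get("pageBefore"));
    }
}
```

The decoded value is /museum/pc_browse_collection_by_artist.php, the path of the seed URL, which is consistent with the server answering the seed request by sending the crawler to flash_check.php.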