asepaprianto / crawler4j

Automatically exported from code.google.com/p/crawler4j

Not able to get javascript related files in web url list #217

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Crawl a website hosted on localhost:8080 whose pages import a .js file.
2. CSS and image files appear in the outgoing URLs, but the .js URL does not.
3. Also, a jsessionid is appended to the last element. I don't want this; 
can you please help?
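
For step 3, one possible workaround is to strip the session parameter from each URL before using it. The following is only a minimal sketch: `SessionIdStripper` is a hypothetical helper that is not part of crawler4j, and it assumes the session id appears as a `;jsessionid=...` path parameter, as servlet containers typically append it.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of crawler4j.
final class SessionIdStripper {
    // Matches ";jsessionid=<value>" up to the next ';', '?' or '#'.
    private static final Pattern JSESSIONID =
            Pattern.compile(";jsessionid=[^?#;]*", Pattern.CASE_INSENSITIVE);

    private SessionIdStripper() {}

    // Returns the URL with any jsessionid path parameter removed.
    static String strip(String url) {
        return JSESSIONID.matcher(url).replaceAll("");
    }

    public static void main(String[] args) {
        // The URL below is made up for illustration.
        System.out.println(strip("http://localhost:8080/app/page.jsp;jsessionid=ABC123?x=1"));
        // prints "http://localhost:8080/app/page.jsp?x=1"
    }
}
```

The query string and fragment are left untouched, so the cleaned URL still points at the same resource.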

What is the expected output? What do you see instead?
I want all the outgoing URLs in the page.
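
Until the .js links show up in the outgoing URLs, a rough fallback is to pull the `src` attributes of `<script>` tags out of the raw page HTML. This is a sketch under assumptions: `ScriptSrcExtractor` is a hypothetical class, the HTML is assumed to be available as a string (for example from the parse data inside `visit`, if your version exposes it), and a regular expression is only an approximation of real HTML parsing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of crawler4j.
final class ScriptSrcExtractor {
    // Captures the quoted value of the src attribute inside a <script ...> tag.
    private static final Pattern SCRIPT_SRC = Pattern.compile(
            "<script\\b[^>]*?\\bsrc\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE);

    private ScriptSrcExtractor() {}

    // Returns the src values in document order; inline scripts are skipped.
    static List<String> extract(String html) {
        List<String> urls = new ArrayList<>();
        Matcher m = SCRIPT_SRC.matcher(html);
        while (m.find()) {
            urls.add(m.group(2));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Sample markup made up for illustration.
        String html = "<script type=\"text/javascript\" src=\"/js/app.js\"></script>"
                + "<script src='lib.js'></script><script>var x;</script>";
        System.out.println(extract(html)); // prints "[/js/app.js, lib.js]"
    }
}
```

The returned values may be relative, so they would still need to be resolved against the page URL before being compared with the other outgoing URLs.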

What version of the product are you using?
crawler4j 3.5

Please provide any additional information below.

Performance dropped drastically when I deployed the crawler on one server and 
tried crawling another application deployed on a different server.

Original issue reported on code.google.com by nagbhush...@gmail.com on 30 Apr 2013 at 12:16

GoogleCodeExporter commented 9 years ago
Please supply a real-life example so I can test against it.

Original comment by avrah...@gmail.com on 23 Sep 2014 at 2:10