bejean / crawl-anywhere

Crawl-Anywhere - Web Crawler and document processing pipeline with Solr integration.
www.crawl-anywhere.com
Apache License 2.0
96 stars, 38 forks

tools_test_scripts.sh never shows any output of found links #79

Open bejean opened 9 years ago

bejean commented 9 years ago

When I test the link-finding action with tools_test_scripts.sh, I never see any output of found links, even when I hardcode them. It does not show any exceptions either, even after I intentionally break the code.

OkkeKlein commented 9 years ago

It appears that the broken XML prevented the script from finding links. I see found links now, but still never any exceptions, even when I break the code.

OkkeKlein commented 9 years ago

I would consider not getting any exceptions a problem. How else can you debug your script?

bejean commented 9 years ago

Can you provide me with a URL to a "broken" page?

OkkeKlein commented 9 years ago

The page has been fixed and is in the production environment. But for testing, you can take any web page and break the script somewhere. It should then throw an exception, but I'm not getting one.

bejean commented 9 years ago

"Break the script somewhere"? Which script?

OkkeKlein commented 9 years ago

The script that parses the URL.