mohankreddy / crawler4j

Automatically exported from code.google.com/p/crawler4j

how to access crawled data #92

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Hello!

I got my crawler working and I see my desired results, but how can I access 
the .jdb files? I couldn't find information on what utility is used to store 
the crawled URLs.

I can open them with Notepad(++), but that doesn't seem to be the right fit. Is 
there a database editor/viewer that can handle .jdb files, for instance for 
querying? Or are the .jdb files only intended to provide information for the 
crawl process itself? Do I have to retrieve (and store) the information I need 
myself?

Thanks a lot, I really appreciate your work, Yasser!

What version of the product are you using? On what operating system?
crawler4j-2.6.1, win7-64

Original issue reported on code.google.com by mrks...@gmail.com on 9 Nov 2011 at 11:09

GoogleCodeExporter commented 9 years ago
You shouldn't touch the .jdb files; they are used internally by the crawling 
process. The sample code shows how you can get the required information.

-Yasser

Original comment by ganjisaffar@gmail.com on 10 Nov 2011 at 6:58
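
For reference, the sample code Yasser mentions extracts data inside the crawler's visit() callback rather than by reading the .jdb files (which are crawler4j's internal frontier storage). Below is a minimal sketch of that pattern; the method names follow later crawler4j releases (3.x/4.x), so signatures in 2.6.1 may differ slightly, and the domain filter and output file path are only placeholders.

```java
import java.io.FileWriter;
import java.io.IOException;

import edu.uci.ics.crawler4j.crawler.Page;
import edu.uci.ics.crawler4j.crawler.WebCrawler;
import edu.uci.ics.crawler4j.parser.HtmlParseData;
import edu.uci.ics.crawler4j.url.WebURL;

public class MyCrawler extends WebCrawler {

    @Override
    public boolean shouldVisit(Page referringPage, WebURL url) {
        // Restrict the crawl to one site (adjust to your needs).
        return url.getURL().toLowerCase().startsWith("https://www.example.com/");
    }

    @Override
    public void visit(Page page) {
        String url = page.getWebURL().getURL();

        if (page.getParseData() instanceof HtmlParseData) {
            HtmlParseData htmlParseData = (HtmlParseData) page.getParseData();
            String text = htmlParseData.getText();   // visible text of the page
            String html = htmlParseData.getHtml();   // full HTML source

            // Store the data yourself -- here a simple append to a text file;
            // in practice you would write to your own database or index.
            try (FileWriter out = new FileWriter("crawl-output.txt", true)) {
                out.write(url + "\t" + text.length() + " chars of text\n");
            } catch (IOException e) {
                System.err.println("Could not write result for " + url + ": " + e.getMessage());
            }
        }
    }
}
```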

GoogleCodeExporter commented 9 years ago
Hi Yasser,

I have a requirement to crawl web pages, and an internet search told me 
crawler4j is the popular choice. As I am new to this, I am not very familiar 
with how it works, and I am unable to see my crawl results. I would appreciate 
it if you could shed some light on this. Thanks in advance.

Best,
Ag

Original comment by gopinath...@gmail.com on 8 Jun 2013 at 6:44
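
For later readers hitting the same problem: crawler4j does not write results anywhere on its own; the crawl output only goes where your visit() code puts it (see the sketch above). A minimal controller setup that drives such a crawler, following the crawler4j quick-start for the 4.x API (the 2.x constructors differ), might look like the sketch below; the storage folder, seed URL, and thread count are placeholders.

```java
import edu.uci.ics.crawler4j.crawler.CrawlConfig;
import edu.uci.ics.crawler4j.crawler.CrawlController;
import edu.uci.ics.crawler4j.fetcher.PageFetcher;
import edu.uci.ics.crawler4j.robotstxt.RobotstxtConfig;
import edu.uci.ics.crawler4j.robotstxt.RobotstxtServer;

public class Controller {
    public static void main(String[] args) throws Exception {
        CrawlConfig config = new CrawlConfig();
        // The .jdb files (crawler4j's internal frontier state) live in this folder.
        config.setCrawlStorageFolder("/data/crawl/root");

        PageFetcher pageFetcher = new PageFetcher(config);
        RobotstxtConfig robotstxtConfig = new RobotstxtConfig();
        RobotstxtServer robotstxtServer = new RobotstxtServer(robotstxtConfig, pageFetcher);

        CrawlController controller = new CrawlController(config, pageFetcher, robotstxtServer);
        controller.addSeed("https://www.example.com/");

        // Blocks until the crawl finishes; MyCrawler.visit() is where results are captured.
        controller.start(MyCrawler.class, 4 /* number of crawler threads */);
    }
}
```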