Open johnklee opened 5 years ago
C4J itself persists only one type of data: the crawled URLs, currently in the embedded Sleepycat DB. A user's extension of WebCrawler can persist the downloaded data and the visited URLs in any type of storage (or queue system), if the implementation is written to do so.
As the title says: I already have a Kafka producer/consumer framework in place. Will crawler4j support configuring Kafka settings so that it can feed the crawled results into an existing Kafka topic?
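For what it's worth, here is a rough sketch of the "do it in your own WebCrawler extension" approach described in the reply. It is not crawler4j code: `CrawlSink`, `InMemorySink`, `CrawlResultPublisher` and the topic name `crawled-pages` are all hypothetical names made up for illustration, and the in-memory sink only stands in for a real Kafka producer so the example runs without a broker or any dependency. The assumption is that your `WebCrawler` subclass would call `onPageVisited` from its `visit(Page)` override, and that a Kafka-backed sink would forward each record to `KafkaProducer.send`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical abstraction: any destination for crawled pages (Kafka, a DB, a file). */
interface CrawlSink {
    void publish(String topic, String key, String value);
}

/**
 * In-memory stand-in so the sketch runs without a broker.
 * A Kafka-backed version would (assumption) wrap a KafkaProducer<String, String>
 * and call producer.send(new ProducerRecord<>(topic, key, value)) here.
 */
final class InMemorySink implements CrawlSink {
    private final List<String[]> records = new ArrayList<>();

    @Override
    public synchronized void publish(String topic, String key, String value) {
        // Crawler threads run concurrently, so guard the shared list.
        records.add(new String[] {topic, key, value});
    }

    /** Returns a read-only snapshot of everything published so far. */
    synchronized List<String[]> records() {
        return Collections.unmodifiableList(new ArrayList<>(records));
    }
}

/**
 * Hypothetical helper. In a real crawler, a WebCrawler subclass would call
 * onPageVisited(...) from visit(Page page), passing page.getWebURL().getURL()
 * and, for HTML pages, the HtmlParseData's getHtml() result.
 */
final class CrawlResultPublisher {
    private final CrawlSink sink;
    private final String topic;

    CrawlResultPublisher(CrawlSink sink, String topic) {
        this.sink = sink;
        this.topic = topic;
    }

    void onPageVisited(String url, String html) {
        if (url == null || html == null) {
            return; // nothing useful to publish (e.g. a binary page)
        }
        // The URL is used as the record key, the page body as the value.
        sink.publish(topic, url, html);
    }
}

final class CrawlSinkDemo {
    public static void main(String[] args) {
        InMemorySink sink = new InMemorySink();
        CrawlResultPublisher publisher = new CrawlResultPublisher(sink, "crawled-pages");

        publisher.onPageVisited("https://example.com/", "<html>hello</html>");
        publisher.onPageVisited("https://example.com/binary", null); // skipped

        for (String[] r : sink.records()) {
            System.out.println(r[0] + " | " + r[1] + " | " + r[2].length() + " chars");
        }
    }
}
```

Keeping the sink behind an interface means the crawler class does not need to know about Kafka at all; broker addresses, serializers and the topic name would live in whatever builds the Kafka-backed sink, not in crawler4j's own configuration.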