scorelab / OpenXDR

Real-time Opensource Extended Detection And Response System
Apache License 2.0

Evaluate Succinct to use Spark as a document store #6

Closed (sameeravithana closed this 2 years ago)

sameeravithana commented 8 years ago

Succinct: http://succinct.cs.berkeley.edu/wp/wordpress/

[Data source] -> Spark Streaming -> {RDD} document store in Spark

sameeravithana commented 8 years ago

Any progress on this, Milindu? I'm expecting a wiki page with your findings.

agentmilindu commented 8 years ago

Hi @SamTube405, I finished my analysis and pushed the wiki page to the repo.

sameeravithana commented 8 years ago

Hi @agentmilindu, this project also leverages the succinct data structure [1] to compress RDDs, so getting compression and indexing as one package is no longer a surprise.

Also, since ours is a streaming scenario, keep in mind that Berkeley Succinct's pre-processing might take about 0.4 ms to compress 1 KB of data per core. We won't have an update scenario, so no worries there. Can you extend the pcap code to use a Succinct data store and provide on-the-fly insights (counts, averages)? It would be better to have performance metrics too.
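To make the numbers above concrete, here is a minimal sketch in plain Python. It is not Succinct or Spark code. It turns the quoted 0.4 ms per 1 KB per core into a throughput estimate (roughly 2500 KB/s on one core), and shows one way counts and averages could be kept incrementally as records arrive. The names `preprocessing_throughput_kb_per_s` and `RunningStats` are hypothetical, the sample packet sizes are made up, and linear scaling across cores is an assumption.

```python
from dataclasses import dataclass

# Figure quoted in the thread: about 0.4 ms to compress 1 KB on one core.
MS_PER_KB_PER_CORE = 0.4


def preprocessing_throughput_kb_per_s(cores: int = 1,
                                      ms_per_kb: float = MS_PER_KB_PER_CORE) -> float:
    """Estimated KB/s that can be pre-processed.

    Assumes (hypothetically) that the cost scales linearly with core count.
    """
    return cores * 1000.0 / ms_per_kb


@dataclass
class RunningStats:
    """Count and average maintained incrementally, one record at a time."""
    count: int = 0
    total: float = 0.0

    def update(self, value: float) -> None:
        # Only two numbers are stored, so no record needs to be revisited.
        self.count += 1
        self.total += value

    @property
    def average(self) -> float:
        return self.total / self.count if self.count else 0.0


if __name__ == "__main__":
    print("KB/s on one core:", preprocessing_throughput_kb_per_s())
    stats = RunningStats()
    for packet_len in (60, 1500, 40):  # made-up packet sizes in bytes
        stats.update(packet_len)
    print("packets:", stats.count, "average size:", stats.average)
```

Because there is no update scenario, append-only aggregates like these are enough; how they would hook into the pcap code or a Succinct-backed RDD is left open here.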

Nice to know that Succinct will be extended with graph features.

REF: [1] https://en.wikipedia.org/wiki/Succinct_data_structure

agentmilindu commented 8 years ago

@SamTube405 Yeah, let's extend the pcap code to work with succinct data-store! :)

Ammoniya commented 2 years ago

archive