teragrep / pth_10

Data Processing Language (DPL) translator for Apache Spark
GNU Affero General Public License v3.0

Create command to save a pattern for bloom search #235

Open elliVM opened 4 months ago

elliVM commented 4 months ago

The pattern is used by the archive to enable a bloom filter search if the filter contains a pattern for the keyword.
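A minimal sketch of one reading of this idea: tokens are extracted from the data with a regex pattern and added to a bloom filter, and a search keyword can use the filter only when the keyword itself fits that pattern. `ToyBloomFilter`, `build_filter`, `can_use_bloom`, and the IPv4-like pattern are hypothetical names and values for illustration, not the archive's actual implementation.

```python
import hashlib
import re


class ToyBloomFilter:
    """Tiny bloom filter for illustration only (not the archive's implementation)."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, token):
        # Derive num_hashes bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{token}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, token):
        for pos in self._positions(token):
            self.bits |= 1 << pos

    def might_contain(self, token):
        # True means "possibly present"; False means "definitely absent".
        return all((self.bits >> pos) & 1 for pos in self._positions(token))


def build_filter(text, pattern):
    """Add every substring of text that matches pattern to a new filter."""
    bloom = ToyBloomFilter()
    for token in re.findall(pattern, text):
        bloom.add(token)
    return bloom


def can_use_bloom(keyword, pattern):
    """Assumption: the filter helps only if the whole keyword fits the pattern."""
    return re.fullmatch(pattern, keyword) is not None


# Hypothetical pattern: IPv4-like tokens.
IP_PATTERN = r"\d{1,3}(?:\.\d{1,3}){3}"
log_filter = build_filter("login from 192.168.1.10 ok, peer 10.0.0.7", IP_PATTERN)
```

Under these assumptions, a search for `192.168.1.10` could consult `log_filter`, while a search for `error` would have to skip the filter because no such token was ever added under this pattern.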

elliVM commented 4 months ago

Created a prototype version where the pattern is given in the config and assigned to a filter.

elliVM commented 3 months ago

Changing to multiple patterns per filter.

elliVM commented 3 months ago

Patterns are now saved into the bloomdb.filtertype table.

elliVM commented 3 months ago

Testing in QA with the new pth_06 version.

elliVM commented 3 months ago

planned improvements:

elliVM commented 2 months ago

Updating to the newest version of pth_06 with pattern acceleration.

elliVM commented 1 month ago

Testing in QA

elliVM commented 1 month ago

Fixed some SQL syntax errors and moved filter creation to run before the Spark batch job. Changed CREATE TABLE to include an IF NOT EXISTS clause.
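As a rough illustration of why the IF NOT EXISTS clause matters when table creation runs before every batch job, the sketch below uses an in-memory SQLite database as a stand-in. The column names other than `pattern` and the helper `ensure_filtertype` are assumptions; the issue only names the bloomdb.filtertype table and its pattern field.

```python
import sqlite3

# Hypothetical schema: only the table name and the pattern field come from
# the issue; the other columns are assumptions for illustration.
CREATE_FILTERTYPE = """
CREATE TABLE IF NOT EXISTS filtertype (
    id INTEGER PRIMARY KEY,
    expected_elements INTEGER NOT NULL,
    target_fpp REAL NOT NULL,
    pattern TEXT NOT NULL
)
"""


def ensure_filtertype(conn):
    """Create the table if it is missing; safe to call before every job."""
    conn.execute(CREATE_FILTERTYPE)
    conn.commit()


conn = sqlite3.connect(":memory:")
ensure_filtertype(conn)
conn.execute(
    "INSERT INTO filtertype (expected_elements, target_fpp, pattern) VALUES (?, ?, ?)",
    (1000, 0.01, r"\d+"),
)
ensure_filtertype(conn)  # second call does not fail and keeps existing rows
row_count = conn.execute("SELECT COUNT(*) FROM filtertype").fetchone()[0]
```

Without the clause, the second `ensure_filtertype` call would raise an error because the table already exists.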

elliVM commented 1 month ago

The filtertype pattern field had whitespace (ASCII 032, the space character) on both ends, which caused errors when selecting the pattern while saving bloom filters. Will trim it away.

Fixed with trimming
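A minimal sketch of the trimming idea, assuming the comparison happens on the application side. `normalize_pattern`, `find_filtertype_id`, and the sample rows are hypothetical and not taken from the actual fix.

```python
def normalize_pattern(raw):
    """Strip ASCII 32 (space) from both ends; interior spaces are kept."""
    return raw.strip(" ")


def find_filtertype_id(rows, pattern):
    """Return the id of the row whose trimmed pattern equals pattern, else None."""
    wanted = normalize_pattern(pattern)
    for row_id, raw_pattern in rows:
        if normalize_pattern(raw_pattern) == wanted:
            return row_id
    return None


# Hypothetical rows as (id, pattern); the first pattern is padded with spaces.
rows = [(1, r" \d+ "), (2, r"[a-f0-9]{8}")]
```

With the padded value stored as-is, an exact comparison against `\d+` would find nothing; after trimming, the lookup resolves to row 1.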

elliVM commented 1 month ago

Internal PR

elliVM commented 1 month ago

Some fixes to the PR.

elliVM commented 1 week ago

Fixes to testing and some minor changes; now working in QA.