terascope / standard-assets

WIP: experimental standard asset library for Teraslice
MIT License

Finalize merging common_processors into Standard-assets #924

Open ciorg opened 1 month ago

ciorg commented 1 month ago

We've been slowly moving the processors in common_processors (an internal asset) to standard-assets. Only a few remain that would still be useful to move:

* update
* date_guard
* count_unique
* copy_id
* filter_by_required_fields
* add_short_id
* dropdoc
* json_parse

jsnoble commented 1 month ago

additional processors:

ciorg commented 1 month ago

After today's meeting I was reminded that dropdoc should be renamed to sample or something along those lines. Right now it's easy to get confused, because increasing the dropdoc value gives you less data. If it were sample, the percentage value would be the percent of records kept and passed through.
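
To make the naming difference concrete, here is a minimal sketch of the two semantics side by side. It is not the actual processor code: the function names, the `percentage` parameter, and the injectable `random` argument are all assumptions made for illustration.

```typescript
// Hypothetical sketch: these helpers are NOT the real Teraslice processors.
// `percentage` and `random` are assumed parameter names for illustration only.

function assertPercentage(percentage: number): void {
    if (!Number.isFinite(percentage) || percentage < 0 || percentage > 100) {
        throw new RangeError(`percentage must be between 0 and 100, got ${percentage}`);
    }
}

/** dropdoc-style: `percentage` is the share of records to DROP. */
function dropdoc<T>(
    records: T[],
    percentage: number,
    random: () => number = Math.random
): T[] {
    assertPercentage(percentage);
    // A higher percentage means fewer records survive.
    return records.filter(() => random() * 100 >= percentage);
}

/** sample-style: `percentage` is the share of records to KEEP. */
function sample<T>(
    records: T[],
    percentage: number,
    random: () => number = Math.random
): T[] {
    assertPercentage(percentage);
    // A higher percentage means more records survive.
    return records.filter(() => random() * 100 < percentage);
}
```

Under these assumptions, `sample(records, 25)` and `dropdoc(records, 75)` keep about the same share of records on average, but only the first reads the way people expect.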

Also, update is a poor name for the processor. It updates some fields and uses caching to reduce the strain of having ES update those fields, so it should have a name that better represents what it does.
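
For anyone not familiar with it, the general idea of caching to avoid redundant updates can be sketched roughly as below. This is an illustration of the technique only, not how the update processor is actually written: `UpdateCache`, `shouldSend`, and the fingerprinting approach are all hypothetical.

```typescript
// Hypothetical sketch of caching to skip redundant field updates.
// `UpdateCache` and `shouldSend` are illustrative names, not the real processor API.

type FieldValues = Record<string, string | number | boolean | null>;

class UpdateCache {
    // Maps a document id to a fingerprint of the last field values sent for it.
    private readonly lastSent = new Map<string, string>();

    /** Returns true only when `fields` differs from what was last sent for `id`. */
    shouldSend(id: string, fields: FieldValues): boolean {
        // Sort the entries by key so that key order does not change the fingerprint.
        const entries = Object.entries(fields).sort(([a], [b]) => a.localeCompare(b));
        const fingerprint = JSON.stringify(entries);

        if (this.lastSent.get(id) === fingerprint) {
            return false; // same values as last time, so the update can be skipped
        }
        this.lastSent.set(id, fingerprint);
        return true;
    }
}
```

A processor built this way would only forward a record when `shouldSend` returns true. Note that this sketch never evicts entries, so a real implementation would need some bound on the cache size. A name that hints at this behavior would describe the processor better than update does.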