Masking is currently implemented by transforming the result set post-query where the masked field appears in the output data, or by preventing access to the data if used in the predicate
Hive instead pushes down an expression to the database engine which allows predicates to still be used. For example a masking operation might convert a fine-grained date (YYYYMMDD) into simply a year, which could still then be used in range-based predicates, like showing data between 2000-2010. Or a mask function might generate a hash, which could then be used for cross-checking without exposing the actual data value.
This likely requires changes to gaian and needs further research, but better mirrors the approach taken more generally .
Masking is currently implemented by transforming the result set post-query where the masked field appears in the output data, or by preventing access to the data if used in the predicate
Hive instead pushes down an expression to the database engine which allows predicates to still be used. For example a masking operation might convert a fine-grained date (YYYYMMDD) into simply a year, which could still then be used in range-based predicates, like showing data between 2000-2010. Or a mask function might generate a hash, which could then be used for cross-checking without exposing the actual data value.
This likely requires changes to gaian and needs further research, but better mirrors the approach taken more generally .