opensearch-project / opensearch-spark

Spark Accelerator framework; it enables secondary indices to remote data stores.
Apache License 2.0

[FEATURE] Add `isEmpty` PPL command #512

Open YANG-DB opened 1 month ago

YANG-DB commented 1 month ago

Is your feature request related to a problem? Add an `isEmpty` command for the PPL Spark-based execution driver.

Since Spark SQL doesn't offer such an expression, a possible option would be:

Do you have any additional context?

kt-eliatra commented 3 weeks ago

@YANG-DB hi, just to confirm: should isempty, as well as ispresent and isblank, all be implemented as commands, e.g. source=table | isempty(column)? Or rather as functions, e.g. source=table | fields col_a, col_b | where isempty(col_c)?

YANG-DB commented 3 weeks ago

IMO functions - @LantaoJin what are your thoughts?

lukasz-soszynski-eliatra commented 1 week ago

@YANG-DB I have a question about an edge case involving ternary logic. What should be the outcome of invoking eval r = isempty(null)? According to the definition below

SELECT CASE
         WHEN length(trim(column_name)) = 0 THEN true
         ELSE false
       END AS is_empty

isempty(null) evaluates to false. Is this expected?
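To make the edge case concrete, here is a minimal sketch that runs the CASE expression above against a few inputs. It uses Python's built-in sqlite3 module as a stand-in for Spark SQL, which is an assumption made only so the example is self-contained; the helper name `is_empty` is hypothetical, and 1/0 stand in for true/false. The null behaviour comes from standard SQL three-valued logic: length(trim(NULL)) is NULL, NULL = 0 is NULL, and a WHEN condition that is NULL is not satisfied, so the ELSE branch is taken.

```python
import sqlite3

# The CASE expression quoted in the thread, with 1/0 standing in for true/false.
# The column reference is replaced by a bound parameter so single values can be tested.
IS_EMPTY_SQL = """
SELECT CASE
         WHEN length(trim(?)) = 0 THEN 1
         ELSE 0
       END AS is_empty
"""


def is_empty(value):
    """Evaluate the CASE expression for one value (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    try:
        (result,) = conn.execute(IS_EMPTY_SQL, (value,)).fetchone()
        return bool(result)
    finally:
        conn.close()


# None is bound as SQL NULL; the other values are ordinary strings.
for value in (None, "", "   ", "abc"):
    print(repr(value), "->", is_empty(value))
```

If null should instead count as empty, the condition would need an explicit IS NULL check, because the comparison alone can never be satisfied for a null input.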

lukasz-soszynski-eliatra commented 1 week ago

The initial implementation is ready for review. This version still contains the unresolved problem described in the comment above. @YANG-DB, please let me know what you think.

salyh commented 4 days ago

Can we close this, since https://github.com/opensearch-project/opensearch-spark/pull/676 is merged?