Closed alex-spies closed 2 days ago
Pinging @elastic/es-analytical-engine (Team:Analytics)
This reproduces very nicely when the query does not discard the field that was joined on; the following works as a reproducer in the csv tests:
FROM employees
| SORT emp_no
| LIMIT 3
| EVAL language_code = languages
| LOOKUP JOIN languages_lookup ON language_code
| KEEP emp_no, language_code, language_name
;
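To make the expected columns concrete, here is a minimal sketch of what a single, centralized output computation could look like. It is not the actual Elasticsearch code: `JoinOutputSketch` and `output` are hypothetical names, fields are modelled as plain strings rather than attributes, and the rule it applies (keep the left-hand fields, append the lookup index's non-key fields, and let lookup fields shadow same-named left fields) is an assumption about the intended semantics.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; not the actual Elasticsearch Join class. */
final class JoinOutputSketch {
    private JoinOutputSketch() {}

    /**
     * Computes the output field names of a lookup join in one place.
     * Assumed rule: left fields first (minus any shadowed by the lookup side),
     * then the lookup side's fields except the join keys.
     */
    static List<String> output(List<String> left, List<String> right, List<String> joinKeys) {
        Set<String> keys = new HashSet<>(joinKeys);

        // Fields contributed by the lookup index: everything except the join keys.
        List<String> added = new ArrayList<>();
        for (String field : right) {
            if (!keys.contains(field)) {
                added.add(field);
            }
        }

        // Left fields survive unless a lookup field with the same name shadows them.
        List<String> out = new ArrayList<>();
        for (String field : left) {
            if (!added.contains(field)) {
                out.add(field);
            }
        }
        out.addAll(added);
        return out;
    }

    public static void main(String[] args) {
        // Simplified stand-in for the reproducer: only a few employees fields are listed.
        List<String> left = List.of("emp_no", "languages", "language_code");
        List<String> right = List.of("language_code", "language_name");
        // prints [emp_no, languages, language_code, language_name]
        System.out.println(output(left, right, List.of("language_code")));
    }
}
```

With one function like this, the join key `language_code` appears exactly once in the output, and the analyzer would only need to call it instead of keeping its own copy of the rule.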
The output computation for LOOKUP JOIN is still under construction, so this is not unexpected; cf. meta issue https://github.com/elastic/elasticsearch/issues/116208. However, we currently compute the expected output of a LOOKUP JOIN in a specific method in the Analyzer, which conflicts with the computation inside the Join class. This leads to inconsistencies. I believe there really should be only one place where the computation occurs, ideally centralized in the Join class so it doesn't spill into the analyzer.
Reproducer: