daphne-eu / daphne

DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines
Apache License 2.0

[BUGFIX] Inner join running out of memory #636

Open DamianDinoiu opened 1 year ago

DamianDinoiu commented 1 year ago

Root cause: The inner join called the get() function on the frames very frequently, which caused the program to run out of memory once the final result exceeded 10^6 rows.

Solution: Decouple the logic so that get() is called only once for the intersection and once for each column while composing the result.
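
A minimal sketch of the two-phase pattern described above, written in Python for brevity rather than in DAPHNE's C++. The `Frame` class, its `get()` and `names()` methods, and `inner_join` are hypothetical stand-ins and not DAPHNE's actual API. The sketch assumes that every `get()` call materializes a new column object, and that the two frames share no column names other than the join key.

```python
class Frame:
    """Hypothetical column-oriented frame; not DAPHNE's real Frame class."""

    def __init__(self, columns):
        self._columns = columns  # dict: column name -> list of values
        self.get_calls = 0       # counts get() calls for illustration

    def get(self, name):
        # Assumption: each call materializes a fresh column (here, a copy).
        self.get_calls += 1
        return list(self._columns[name])

    def names(self):
        return list(self._columns)


def inner_join(lhs, rhs, lhs_on, rhs_on):
    # Phase 1: fetch each key column once and compute the matching row pairs.
    lhs_keys = lhs.get(lhs_on)
    rhs_keys = rhs.get(rhs_on)
    index = {}
    for j, key in enumerate(rhs_keys):
        index.setdefault(key, []).append(j)
    pairs = [(i, j) for i, key in enumerate(lhs_keys) for j in index.get(key, [])]

    # Phase 2: fetch each output column once, then gather the matched rows.
    result = {}
    for name in lhs.names():
        col = lhs.get(name)
        result[name] = [col[i] for i, _ in pairs]
    for name in rhs.names():
        if name == rhs_on:
            continue  # the key column is already taken from the left frame
        col = rhs.get(name)
        result[name] = [col[j] for _, j in pairs]
    return result
```

In this sketch the number of `get()` calls depends only on the number of columns, not on the number of result rows, which is the property the fix aims for.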