Kyligence / spark

Customized Spark for KAP use; check out the kyspark branch
Apache License 2.0
4 stars, 51 forks

Fix missing block when a DataFrame is reused #13

Open hn5092 opened 5 years ago

hn5092 commented 5 years ago

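While building tpch50 with small-scale data, we found that when an executor uses too much memory and is killed by YARN, the job fails because a broadcast cannot be found.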

hn5092 commented 5 years ago

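Root Cause: The build reuses DataFrames. The Spark plan generates some broadcasts, and our earlier QPS optimization deletes those broadcasts, so an error is reported when the plan is used again. One open question remains: why is this triggered only after an executor has died?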
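To make the failure mode concrete, here is a minimal toy model in plain Python (no Spark involved). Every class and function name below is hypothetical and is not Spark's or KAP's API. It also assumes, without confirmation from this thread, that the cleanup removes only the driver-side copy of the broadcast while executors keep their locally cached copy; under that assumption the reused plan keeps working until a replacement executor has to fetch the block again. Treat it as one possible explanation of the open question, not a verified one.

```python
class BroadcastMissing(Exception):
    """Raised when a broadcast block cannot be found anywhere."""


class Driver:
    """Toy stand-in for the driver-side block store (hypothetical, not Spark's API)."""

    def __init__(self):
        self.blocks = {}
        self.next_id = 0

    def create_broadcast(self, value):
        bid = self.next_id
        self.next_id += 1
        self.blocks[bid] = value
        return bid

    def remove_broadcast(self, bid):
        # Models the QPS optimization that eagerly deletes broadcasts.
        # Assumption: only the driver-side copy is removed.
        self.blocks.pop(bid, None)


class Executor:
    """Toy executor that caches a broadcast block after the first fetch."""

    def __init__(self, driver):
        self.driver = driver
        self.local = {}

    def read(self, bid):
        if bid not in self.local:
            if bid not in self.driver.blocks:
                raise BroadcastMissing(f"broadcast_{bid} not found")
            self.local[bid] = self.driver.blocks[bid]
        return self.local[bid]


class ReusedPlan:
    """A plan that creates its broadcast once and then keeps reusing the id."""

    def __init__(self, driver, table):
        self.driver = driver
        self.table = table
        self.bid = None

    def run(self, executor, key):
        if self.bid is None:
            self.bid = self.driver.create_broadcast(self.table)
        return executor.read(self.bid)[key]


def demo():
    driver = Driver()
    plan = ReusedPlan(driver, {"a": 1})
    old = Executor(driver)
    first = plan.run(old, "a")          # broadcast created, cached on the executor
    driver.remove_broadcast(plan.bid)   # eager cleanup of the broadcast
    second = plan.run(old, "a")         # still served from the executor's local cache
    fresh = Executor(driver)            # replacement after the old executor is killed
    try:
        plan.run(fresh, "a")
        failed = False
    except BroadcastMissing:
        failed = True
    return first, second, failed
```

In this model the second run on the original executor succeeds and only the run on the fresh executor fails, which matches the symptom of the error appearing only after YARN kills an executor. Whether the real cleanup behaves this way still needs to be checked against the actual code.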

Fix Design evidence: Run a large build with a small amount of memory so that the executor is killed by YARN; the rerun after the kill now passes.

Dev test evidence, Test cover by UT/IT ? Cannot be covered by IT.

QA needed ? (Y/N) N

Test suggestions (other feature affected) to QA

hn5092 commented 5 years ago

KE issue: https://github.com/Kyligence/KAP/issues/9792