The AST generated for aggregate functions contains extra reduction calls. Their output is not assigned to any variable, but ParallelAccelerator does not remove them. I think remove_no_deps needs some thought to handle this case.
This can hurt performance, since these reductions eventually become MPI_Allreduce calls.
tables_cat.jl in HPAT/test is a good simple example: an extra sum() is generated, lowered to reduce(), and eventually turned into a parfor.