Getting to a workable solution required fixing #51 and #52, making TBranches much lazier objects, and moving around when TFiles are opened, etc.
All the code edits (not that many in the end) are here, based on laurelin 0.3.0: https://github.com/spark-root/laurelin/compare/master...lgray:topic_scaleout_and_laziness?expand=1
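To illustrate what "lazier" means here, a minimal Python sketch (laurelin itself is Java; `LazyBranch` and `fake_loader` are made-up names for illustration, not laurelin classes). The idea is that constructing the branch object does no I/O, and the expensive read happens only on first access:

```python
from functools import cached_property


class LazyBranch:
    """Hypothetical stand-in for a lazily loaded branch; not laurelin's TBranch."""

    def __init__(self, name, load_baskets):
        self.name = name
        # Callable that performs the expensive read (e.g. opening the file).
        self._load_baskets = load_baskets

    @cached_property
    def baskets(self):
        # Runs on first access only; the result is cached on the instance.
        return self._load_baskets()


calls = []


def fake_loader():
    # Stands in for opening a file and reading basket metadata.
    calls.append("open")
    return [(0, 100), (100, 250)]


branch = LazyBranch("pt", fake_loader)
opened_before_access = len(calls)  # constructing the branch reads nothing
first = branch.baskets             # triggers the one and only load
second = branch.baskets            # served from the cache
```

Branches that are never touched never trigger a read, which is roughly the effect the changes are after.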
I think there's still more to gain, but this yielded a nice 2x improvement when processing a 24 GB flat ntuple in our analysis, and it reduces "thread-joins" in the higher-level Spark processing workflow. Performance still needs to be compared against ROOT on the Vandy cluster to establish some sort of baseline. I'm pretty sure more than 2x is possible.
Please note this (my) code is horrible and not at all well optimized, but is meant as an attempt to get things in the right places/shapes.
~There's one more exception I need to follow up, somehow it's finding non-monotonic basket entries but I thought I got that threaded through OK.~ This last exception has been fixed: I wasn't properly dealing with the empty baskets you see in some files.
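As a rough illustration of the empty-basket case (a Python sketch, not the actual fix in the branch): it assumes each basket is described by the entry number it starts at, so an empty basket shows up as a repeated offset. `usable_baskets` and its arguments are hypothetical names.

```python
def usable_baskets(basket_entry, total_entries):
    """Return (first_entry, stop_entry) pairs for non-empty baskets.

    basket_entry[i] is assumed to be the first entry number of basket i;
    stop_entry is exclusive.
    """
    bounds = list(basket_entry) + [total_entries]
    spans = []
    for start, stop in zip(bounds, bounds[1:]):
        if stop < start:
            # Offsets going backwards is a real error, not an empty basket.
            raise ValueError(f"non-monotonic basket entries: {start} > {stop}")
        if stop == start:
            continue  # empty basket: holds no entries, so skip it
        spans.append((start, stop))
    return spans


# The third basket starts where the second one does, so the second is empty.
spans = usable_baskets([0, 100, 100, 250], total_entries=400)
```

Skipping the zero-length span keeps the remaining offsets strictly increasing, while offsets that actually go backwards still raise.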
I'll post some notes on laurelin master not working tomorrow or Monday. That one seems to be truncated arrays or something.