TopEFT / topeft

15 stars 24 forks source link

Start move to dask histEFT #422

Open btovar opened 2 months ago

btovar commented 2 months ago

@bryates This is as far as I can confidently go. I tried to update the code in corrections.py, but it involves knowing the dimensions of the arrays with regards to physics, so I didn't want to introduce errors there.

The rough roadmap you want to follow is to replace numpy with dask.array, and awkard with dask_awkward. Also, the only time you should call dask.compute is in run_analysis.

My first roadblock in corrections.py is to create a random matrix with similar dimensions as a dask_awkward array. (Looking at the code, I think you don't need to generate the full matrix?) The issue is that the original code used to flatten/unflatten arrays, and such operations don't really make sense in dask when the arrays are not known. I think that probably someone that understand the physics can change that computation to use dak.mask, etc.

bryates commented 2 months ago

@bryates This is as far as I can confidently go. I tried to update the code in corrections.py, but it involves knowing the dimensions of the arrays with regards to physics, so I didn't want to introduce errors there.

The rough roadmap you want to follow is to replace numpy with dask.array, and awkard with dask_awkward. Also, the only time you should call dask.compute is in run_analysis.

My first roadblock in corrections.py is to create a random matrix with similar dimensions as a dask_awkward array. (Looking at the code, I think you don't need to generate the full matrix?) The issue is that the original code used to flatten/unflatten arrays, and such operations don't really make sense in dask when the arrays are not known. I think that probably someone that understand the physics can change that computation to use dak.mask, etc.

Thanks @btovar! I agree with you that the randomness in the Rochester can be done better. We can take a look at other things as well.

btovar commented 2 months ago

The futures test was not removed, it was renamed to test_dask. There is not a "futures" executor in the new coffea.

bryates commented 2 months ago

The futures test was not removed, it was renamed to test_dask. There is not a "futures" executor in the new coffea.

Yes, that's what I meant to put, thanks!