Interested 👍.
We (@AlexanderHeidelbach and I) recently took over the maintenance and development of b2luigi for the Belle II collaboration. We would therefore be very interested in exchanging ideas and experiences on this topic, and we are looking for overlaps or opportunities for collaboration.
I'm interested in this topic, but with a particular focus on enabling greater flexibility and reusability (in the context of physics analyses) by addressing specific shortcomings in the underlying structures, in particular luigi's handling of parameters and dependencies.
That means I'm not particularly focused on the integration aspect, but rather on the idea that whatever can/will be integrated should be iterated on and improved before it is too late, to avoid unnecessary baggage.
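To make the kind of shortcoming I have in mind concrete, here is a minimal sketch (task and parameter names are purely illustrative, not from any real analysis) of how plain luigi requires each downstream task to re-declare and manually forward the parameters of its dependencies, which scales poorly across a large analysis graph:

```python
import luigi


class Selection(luigi.Task):
    # upstream parameters ...
    year = luigi.Parameter()
    channel = luigi.Parameter()

    def output(self):
        return luigi.LocalTarget(f"selection_{self.year}_{self.channel}.txt")

    def run(self):
        with self.output().open("w") as f:
            f.write("selected events\n")


class Plot(luigi.Task):
    # ... must be re-declared on every downstream task ...
    year = luigi.Parameter()
    channel = luigi.Parameter()

    def requires(self):
        # ... solely so they can be forwarded by hand at each level of the graph
        return Selection(year=self.year, channel=self.channel)

    def output(self):
        return luigi.LocalTarget(f"plot_{self.year}_{self.channel}.txt")

    def run(self):
        with self.input().open() as fin, self.output().open("w") as fout:
            fout.write("plot of: " + fin.read())
```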
Dear topic starter,
it would be great if you could expand on what you mean by this topic.
I am giving a talk on an architectural framework for data analysis in Python, so I'm generally interested in this theme.
It would be good to see some other examples of workflow tools and comparisons between them (e.g. there are proposed discussions about Dask, but there are also workload managers like Slurm).
Dear @ynikitenko, a typical HEP analysis (e.g. at the LHC) comprises a vast number of steps with non-trivial dependencies between them. Here, one can use workflow tools, e.g. https://github.com/spotify/luigi, to describe and execute these steps and their dependencies. This is not directly related to the heavy batch processing that is typically done with e.g. Dask / HTCondor / Slurm, as that processing represents only a subset of the steps of a whole analysis.
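For concreteness, here is a minimal luigi sketch of two such analysis steps with a dependency between them; the task names, parameter values, and file paths are purely illustrative:

```python
import luigi


class SelectEvents(luigi.Task):
    """Hypothetical first analysis step (names are illustrative only)."""
    dataset = luigi.Parameter()

    def output(self):
        return luigi.LocalTarget(f"selected_{self.dataset}.txt")

    def run(self):
        with self.output().open("w") as f:
            f.write(f"events selected from {self.dataset}\n")


class MakeHistograms(luigi.Task):
    """Hypothetical second step that depends on the selection."""
    dataset = luigi.Parameter()

    def requires(self):
        # luigi resolves this dependency and runs SelectEvents first
        return SelectEvents(dataset=self.dataset)

    def output(self):
        return luigi.LocalTarget(f"histograms_{self.dataset}.txt")

    def run(self):
        with self.input().open() as fin, self.output().open("w") as fout:
            fout.write("histograms built from: " + fin.read())


if __name__ == "__main__":
    # builds the dependency graph and executes any outstanding tasks
    luigi.build([MakeHistograms(dataset="data_2018")], local_scheduler=True)
```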
Dear @pfackeldey, thank you for the nice example. Would you be so kind as to add it to the starting message for easier navigation?
https://github.com/HSF/PyHEP.dev-workshops/issues/31#issuecomment-2269133847: