In the current version of the dataset the candidates are not pT-ordered. Since only a fixed number of candidates is kept (16 in the current config), events with a larger multiplicity may have a higher-pT candidate left out. For regression and DM classification we use only signal samples, and since signal samples have rather small particle multiplicity, this happens <1% of the time. The impact on the DM and energy regression training is therefore small, but the effect will be more noticeable in the tauID training, as qq jets have a larger particle multiplicity.
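The effect described above can be illustrated with a minimal sketch, which is not the dataset's actual code: it sorts the candidates of one jet by pT before truncating to the 16-candidate cap. The function name `select_leading_candidates`, the array layout, and the toy values are all assumptions made for illustration.

```python
import numpy as np

MAX_CANDS = 16  # candidate cap quoted in the text ("16 in current config")


def select_leading_candidates(pt, features, max_cands=MAX_CANDS):
    """Keep the `max_cands` highest-pT candidates of a single jet.

    Hypothetical helper: `pt` has shape (n,), `features` has shape
    (n, n_features), with one row per candidate.
    """
    pt = np.asarray(pt, dtype=float)
    features = np.asarray(features)
    # Sort indices by descending pT, then truncate to the cap.
    order = np.argsort(-pt, kind="stable")[:max_cands]
    return pt[order], features[order]


# Toy jet with 20 candidates; the hardest one sits at index 18,
# i.e. beyond the first 16 entries.
pt = np.linspace(1.0, 20.0, 20)
pt[18] = 100.0
features = np.arange(20).reshape(20, 1)  # stand-in for per-candidate features

# Truncating without ordering drops the 100 GeV candidate.
naive_pt = pt[:MAX_CANDS]

# Ordering first keeps it as the leading entry.
sorted_pt, sorted_features = select_leading_candidates(pt, features)
```

In this toy case the unordered truncation loses the hardest candidate, while ordering by pT first retains it. Whether the ordering is best applied at dataset production or at loading time is left open here.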