I agree with the intuition of the argument, but it is not necessarily true that each iteration will split on a previously unused variable. E.g. if variable X(1) is used for f(1) in the first iteration, X(1) might not be used for f(2), yet it could be used again for f(3). In the end, there can be multiple f(b)'s that use the same X(j).
The final summation statement is true if f(j) is the sum of all the f-hats that split on variable X(j).
I.e. each f(j) = sum over b of [I(Xb = X(j)) * f-hat(Xb)], where Xb is the variable split on in iteration b and the sum runs over all boosting iterations, the number of which can be (much) greater than p.
Agreed. In addition, I think X(1) can even be used for consecutive splits: after X(1) is used for f(1), it can be used again for the f(2) split right after.
Excuse my poor math notation here
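For what it's worth, here is a rough sketch of the grouping idea in plain Python. It is not anyone's reference implementation: the toy data, the helper names (fit_stump, stump_predict, boost, f_j) and the shrinkage factor lam are all assumptions made up for illustration (set lam = 1 to match the formula as written in the comment). It boosts one-split stumps for B iterations with B greater than p, records which variable each stump splits on, and checks that summing the stumps grouped by variable gives the same prediction as summing them in iteration order.

```python
import random

def fit_stump(X, r):
    """Least-squares stump (one split) fitted to residuals r.
    Returns (variable index, cut point, left mean, right mean)."""
    n, p = len(X), len(X[0])
    best = None
    for j in range(p):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            cut = (lo + hi) / 2
            left = [r[i] for i in range(n) if X[i][j] < cut]
            right = [r[i] for i in range(n) if X[i][j] >= cut]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, cut, ml, mr)
    return best[1:]

def stump_predict(stump, row):
    j, cut, ml, mr = stump
    return ml if row[j] < cut else mr

def boost(X, y, B, lam):
    """Fit B stumps in sequence, each one to the current residuals."""
    r, stumps = list(y), []
    for _ in range(B):
        s = fit_stump(X, r)
        stumps.append(s)
        r = [ri - lam * stump_predict(s, row) for ri, row in zip(r, X)]
    return stumps

def f_j(stumps, j, row, lam):
    """f(j): sum of only those shrunken stumps that split on variable j."""
    return sum(lam * stump_predict(s, row) for s in stumps if s[0] == j)

# Hypothetical toy data: p = 3 variables, the third one is pure noise.
random.seed(0)
p, B, lam = 3, 30, 0.1
X = [[random.uniform(-1, 1) for _ in range(p)] for _ in range(60)]
y = [2 * a + b * b + random.gauss(0, 0.1) for a, b, _ in X]

stumps = boost(X, y, B, lam)
split_vars = [s[0] for s in stumps]  # variable used in each iteration

# Prediction summed in iteration order vs. grouped by variable.
full = [sum(lam * stump_predict(s, row) for s in stumps) for row in X]
additive = [sum(f_j(stumps, j, row, lam) for j in range(p)) for row in X]
max_gap = max(abs(a - b) for a, b in zip(full, additive))

print(split_vars)
print(max_gap)
```

Printing split_vars shows which variable each stump used, so you can check a given run for repeats and for back-to-back repeats. Since B is greater than p here, at least one variable has to show up more than once.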