Do the meta classifiers (learners) learn on the predicted probabilities of the base learners or the class labels predicted by the base learners?
If the prediction is based purely on class labels, then there are only 2^N distinct feature vectors for the meta-learner, where N is the number of base learners (assuming binary classification). Since there are many training samples, a single feature vector for the meta-learner may map to multiple output classes. So I believe training the meta-learner on the predicted probabilities of the base learners is more intuitive. Which of these two does StackNet use?
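To make the counting argument concrete, here is a minimal sketch (assuming N binary base classifiers; the variable names are illustrative, not from StackNet):

```python
from itertools import product

# With N base learners that emit hard binary labels, the meta-learner's
# input space collapses to the set of all 0/1 vectors of length N,
# i.e. at most 2**N distinct feature vectors.
N = 3
label_vectors = set(product([0, 1], repeat=N))
print(len(label_vectors))  # 2**3 = 8 possible meta-feature vectors

# With predicted probabilities, each meta feature is a real number in
# [0, 1], so two training samples that would collide under hard labels
# can still be distinguished by the meta-learner.
sample_a = (0.91, 0.12, 0.55)  # both round to hard labels (1, 0, 1)
sample_b = (0.62, 0.48, 0.97)
print(tuple(round(p) for p in sample_a) == tuple(round(p) for p in sample_b))
print(sample_a == sample_b)
```

Under hard labels the two samples above are indistinguishable to the meta-learner, while the probability vectors keep them apart, which is the intuition behind preferring probabilities.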