Question regarding architecture

GRIGORR commented 4 years ago

Hi. A small question about the architecture. Do I understand correctly that at each iteration the candidate networks are trained separately, each of them is then added to the AdaNet ensemble, the weights on its logits are learned, and then the best one is chosen? Thanks in advance.

le-dawg commented 4 years ago

I understand it that way too.

cweill commented 4 years ago

@GRIGORR That is correct: all the candidate subnetworks (and their associated ensemble) are trained in parallel in the same TensorFlow graph. At the end of each iteration, the best subnetwork is chosen based on its performance within the ensemble.
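To make the selection step concrete, here is a toy sketch in plain Python. It is not the adanet API: every name in it (`ensemble_loss`, `run_iteration`, the toy subnetworks) is hypothetical, the subnetworks are fixed functions rather than trained models, and uniform averaging stands in for the learned mixture weights. It only illustrates the idea that each candidate is scored as part of the previous ensemble and the best one is kept.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for a trained subnetwork: maps an input to a logit.
Subnetwork = Callable[[float], float]
Example = Tuple[float, float]  # (input, target)


def ensemble_predict(members: Sequence[Subnetwork], x: float) -> float:
    """Combine member outputs. Assumption: a uniform average replaces the
    learned mixture weights of a real ensemble."""
    return sum(f(x) for f in members) / len(members)


def ensemble_loss(members: Sequence[Subnetwork], data: Sequence[Example]) -> float:
    """Mean squared error of the ensemble on the data."""
    return sum((ensemble_predict(members, x) - y) ** 2 for x, y in data) / len(data)


def run_iteration(
    previous: List[Subnetwork],
    candidates: Sequence[Subnetwork],
    data: Sequence[Example],
) -> Tuple[List[Subnetwork], int, float]:
    """Score each candidate inside (previous ensemble + that candidate)
    and keep the candidate whose ensemble has the lowest loss."""
    scored = [
        (ensemble_loss(previous + [candidate], data), index)
        for index, candidate in enumerate(candidates)
    ]
    best_loss, best_index = min(scored)
    return previous + [candidates[best_index]], best_index, best_loss


# Toy data following y = 2x.
def identity(x: float) -> float:
    return x


def zero(x: float) -> float:
    return 0.0


def triple(x: float) -> float:
    return 3.0 * x


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
previous_ensemble = [identity]
candidates = [zero, triple, identity]

new_ensemble, best_index, best_loss = run_iteration(previous_ensemble, candidates, data)
print(best_index, best_loss, len(new_ensemble))
```

In this toy setup, averaging `identity` with `triple` reproduces y = 2x exactly, so that candidate is the one kept and the ensemble grows by one member.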

le-dawg commented 4 years ago

> @GRIGORR That is correct: all the candidate subnetworks (and their associated ensemble) are trained in parallel in the same TensorFlow graph. At the end of each iteration, the best subnetwork is chosen based on its performance within the ensemble.

From what I gather from the 0.8.0 docs, it sounds like each AdaNet iteration actually selects one complete Ensemble and discards the others. Could it be said that each Ensemble in the candidate ensemble set differs from all the other candidate Ensembles in the subnetwork that was added to it in the current iteration?

tensorflow / adanet

Question regarding architecture #147