TorchEnsemble-Community / Ensemble-Pytorch

A unified ensemble framework for PyTorch to improve the performance and robustness of your deep learning model.
https://ensemble-pytorch.readthedocs.io
BSD 3-Clause "New" or "Revised" License

How to stack embedding and pass the gradients? #108

Open arita37 opened 2 years ago

arita37 commented 2 years ago

I have two neural nets, N1 and N2, and want to stack their output embedding layers.

How can I do this?

xuyxu commented 2 years ago

Suppose the outputs of N1 and N2 on a sample x are N1(x) and N2(x), respectively. What you want is to concatenate their outputs (i.e., [N1(x), N2(x)]) and pass the result to downstream layers, right?

arita37 commented 2 years ago

Exactly: either concatenation or the mean.

We need gradients to flow through the whole thing so it can be trained end to end.

xuyxu commented 2 years ago

Hi, if you are going to take the mean of the outputs from all base estimators, the fusion ensemble is exactly what you want.

As for concatenation, it seems somewhat odd, since all base estimators in the ensemble are doing the same thing, which makes concatenating their outputs rather pointless. Is there any paper or technical report demonstrating the effectiveness of concatenating the outputs of base estimators?
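
For the mean case, a rough usage sketch of the fusion ensemble might look like this (the base network, synthetic data, and hyperparameters below are placeholders, not from this thread):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from torchensemble import FusionClassifier

# Hypothetical base network; any nn.Module mapping a feature batch to logits works.
class BaseNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))

    def forward(self, x):
        return self.net(x)

# Placeholder data loader standing in for the real training data.
train_loader = DataLoader(
    TensorDataset(torch.randn(100, 20), torch.randint(0, 2, (100,))), batch_size=16
)

# Averages the outputs of n_estimators copies of BaseNet and trains them jointly.
model = FusionClassifier(estimator=BaseNet, n_estimators=5, cuda=False)
model.set_optimizer("Adam", lr=1e-3)
model.fit(train_loader, epochs=10)
```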

arita37 commented 2 years ago

N1, N2, ..., Nx are different NN models.

We aggregate them by concatenating their embedding outputs:

BigX = [X1, …, Xn], which is fed into another NN (i.e., a merging network).

This is used extensively (e.g., in Siamese networks…); see the sketch below.
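
A minimal end-to-end sketch of that idea (the network definitions, embedding sizes, and data here are illustrative assumptions, not code from this repo):

```python
import torch
import torch.nn as nn

class StackedEmbeddingNet(nn.Module):
    """Concatenate the embeddings of several base nets and merge them."""

    def __init__(self, base_nets, embed_dims, n_classes):
        super().__init__()
        self.base_nets = nn.ModuleList(base_nets)   # N1, N2, ..., Nx
        self.merge = nn.Sequential(                 # the "merging" NN
            nn.Linear(sum(embed_dims), 128),
            nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, x):
        # BigX = [X1, ..., Xn]; gradients flow back into every base net.
        big_x = torch.cat([net(x) for net in self.base_nets], dim=1)
        return self.merge(big_x)

# Hypothetical heterogeneous base nets with different embedding sizes.
n1 = nn.Sequential(nn.Linear(20, 32), nn.ReLU())
n2 = nn.Sequential(nn.Linear(20, 64), nn.ReLU())
model = StackedEmbeddingNet([n1, n2], embed_dims=[32, 64], n_classes=2)

x, y = torch.randn(8, 20), torch.randint(0, 2, (8,))
loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()  # end to end: base nets and merging head all receive gradients
```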

arita37 commented 2 years ago

We are NOT dealing with the outputs!

The final outputs are not very useful for end-to-end training…

We are dealing with the last embedding (the penultimate layer's features).
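
For example, if a base net is an off-the-shelf classifier, one common way to expose its last embedding is to swap out its final layer (a sketch using a torchvision model as a stand-in; the actual networks in this thread are unspecified):

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

# Replace the classification head with an identity so forward() returns the
# 512-dimensional penultimate embedding instead of class logits.
backbone = resnet18()
backbone.fc = nn.Identity()

embedding = backbone(torch.randn(1, 3, 224, 224))  # shape: [1, 512]
```

Each such backbone could then be plugged into the merging sketch above as one of the base nets.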

xuyxu commented 2 years ago

Thanks for your kind explanation. Heterogeneous ensembles are not supported yet, since we have not come up with a succinct way of setting different optimizers for different base estimators 😢.

arita37 commented 2 years ago

Sure.

For a first version, maybe we can use the same optimizer and scheduler for the whole ensemble model.

The goal is to have a one-liner for easy end-to-end ensembling; a sketch of the shared-optimizer idea follows.
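
A shared optimizer is straightforward once the base nets and the merging head live inside one nn.Module, as in the StackedEmbeddingNet sketch above (again an assumption about how it might be wired, not this library's API):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# `model` is the StackedEmbeddingNet from the sketch above (an assumption).
# One optimizer and scheduler cover every parameter: base nets plus merging head.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

# Placeholder data loader standing in for the real training data.
train_loader = DataLoader(
    TensorDataset(torch.randn(100, 20), torch.randint(0, 2, (100,))), batch_size=16
)

for epoch in range(20):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(x), y)
        loss.backward()   # gradients reach every base net and the merging head
        optimizer.step()
    scheduler.step()
```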

xuyxu commented 2 years ago

I'm kind of busy these days, so a PR would be very much appreciated ;-)