FLAIR-THU / VFLAIR

THU-AIR's general, extensible, and lightweight Vertical Federated Learning framework

FedBCD_p implementation #14

Status: Open. Opened by xavierxav 2 weeks ago.

xavierxav commented 2 weeks ago

Hello, thank you for your work on the VFLAIR platform.

It seems to me that the current implementation of FedBCD is incorrect: the same gradients are applied at each local iteration by the passive parties (lines 193-212 in MainTaskVFL.py).

ZixuanGu commented 2 weeks ago

Thank you for your question!

In the first FedBCD iteration, we go through lines 195-202 to run a normal VFL forward and backward pass.

In the following Q-1 FedBCD iterations, we go through lines 203-212, where each party updates its own model without information exchange. Only the active party can recompute its gradients (line 210); the passive parties keep reusing the same gradients transferred from the active party. However, these are not the final gradients used for the model update. See the [local_backward()] function in ./src/party/party.py: [self.local_gradient] in [class Party()] is unchanged, but because [self.local_pred] is updated in MainTaskVFL.py line 206, the final gradient used for the model update, [self.weights_grad_a], is updated as well.
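If I follow the explanation, the mechanism can be sketched like this (a minimal NumPy sketch of the idea, not the actual VFLAIR code; the two-layer toy model and the names `stale_grad`, `local_forward`, `weight_grads` are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy passive-party local model: pred = W2 @ tanh(W1 @ x).
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))
x = rng.normal(size=3)

def local_forward(W1, W2, x):
    h = np.tanh(W1 @ x)
    return h, W2 @ h

def weight_grads(W1, W2, x, stale_grad):
    # Chain rule: dL/dW = (dL/dpred) @ (dpred/dW), with dL/dpred held
    # fixed at the stale value received from the active party.
    h, _ = local_forward(W1, W2, x)
    gW2 = np.outer(stale_grad, h)
    gh = W2.T @ stale_grad
    gW1 = np.outer(gh * (1 - h**2), x)
    return gW1, gW2

stale_grad = rng.normal(size=2)  # dL/dpred, received once and then reused

lr = 0.1
g1_first, g2_first = weight_grads(W1, W2, x, stale_grad)
W1 -= lr * g1_first
W2 -= lr * g2_first

# Second local iteration: stale_grad is unchanged, but the local Jacobian
# dpred/dW changed with the weights, so the weight gradients still evolve.
g1_second, g2_second = weight_grads(W1, W2, x, stale_grad)
print(np.allclose(g1_first, g1_second))
```

Note that for a single linear layer (pred = W @ x) the weight gradient would be outer(stale_grad, x), which does not depend on W at all, so the local iterations would then apply literally identical updates.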

xavierxav commented 2 weeks ago

Hello, thank you for your answer.

I understand that the gradient is affected by the local_pred (lines 348-353 for VFL without attack or defense). However, from my understanding, an implementation faithful to the original FedBCD paper (FedBCD: A Communication-Efficient Collaborative Learning Framework for Distributed Features) would also update self.local_gradient based on the stale intermediate results and the local derivative of the loss function.

This is why FedBCD is restricted to "models such as linear and logistic regression, and support vector machines", where the local derivatives of the loss function are easily calculable.
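For such models, each party can recompute its own loss derivative locally from the stale sum of the other parties' contributions. A sketch for vertical logistic regression in the paper's setting (illustrative names, not VFLAIR code; `u_other_stale` stands for the other parties' logit contribution exchanged once per communication round):

```python
import numpy as np

rng = np.random.default_rng(1)
x_k = rng.normal(size=5)     # party k's features for one sample
theta_k = np.zeros(5)        # party k's local weights
y = 1.0                      # label in {-1, +1}
u_other_stale = 0.3          # other parties' logit contribution, exchanged once

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr, Q = 0.1, 5
for _ in range(Q):
    # The local logit contribution is refreshed each local iteration,
    # while u_other_stale stays fixed between communication rounds.
    z = theta_k @ x_k + u_other_stale
    # Local derivative of the logistic loss log(1 + exp(-y*z)) w.r.t. theta_k:
    grad_k = -y * sigmoid(-y * z) * x_k
    theta_k -= lr * grad_k
```

Here the gradient genuinely changes at every local iteration because z is recomputed from the updated theta_k, even though u_other_stale is stale; this is what, to my reading, the current implementation does not reproduce for the passive parties' self.local_gradient.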

[Screenshot attachment: Annotation 2024-06-20 125728]