Does PIPs process each of the S=8 frames independently, without sharing any information between frames?

aharley / pips

Particle Video Revisited

MIT License


Does PIPs process each of the S=8 frames independently, without sharing any information between frames? #21

Closed by ZHAOZHIHAO 1 year ago

ZHAOZHIHAO commented 1 year ago

Hi,

I have read the paper and the code. It seems that PIPs processes the S=8 frames independently, without sharing any information between them. Is that correct?

In Figure 2, the three steps "Initialize positions and appearance features", "Measure local similarity", and "Update positions and features" never seem to blend information across different frames.

Best,

aharley commented 1 year ago

The key you might be missing is the MLP-Mixer. Half of its parameters apply within-frame operations, and the other half apply cross-frame operations.
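
To make the within-frame versus cross-frame distinction concrete, here is a minimal sketch in plain Python. It is not the code from this repository: it replaces each of the Mixer's two MLPs with a single linear layer plus a residual connection, omits normalization and nonlinearities, and uses a hypothetical channel count `C = 4`. All function names are made up for illustration. It only shows that mixing over the frame axis lets a change in frame 0 reach frame 7, while mixing over the channel axis does not.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def rand_matrix(rows, cols, rng):
    """Random matrix with entries drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def channel_mix(x, w_channel):
    """Within-frame step (simplified): x is S x C, w_channel is C x C.
    Each frame's feature row is transformed using only that row."""
    return add(x, matmul(x, w_channel))

def token_mix(x, w_token):
    """Cross-frame step (simplified): w_token is S x S.
    Each output frame is a weighted sum over all S input frames."""
    return add(x, matmul(w_token, x))

def mixer_block(x, w_token, w_channel):
    """One simplified Mixer block: cross-frame mixing, then within-frame mixing."""
    return channel_mix(token_mix(x, w_token), w_channel)

rng = random.Random(0)
S, C = 8, 4  # S = 8 frames as in the issue; C = 4 is an assumed channel count
x = rand_matrix(S, C, rng)
w_token = rand_matrix(S, S, rng)
w_channel = rand_matrix(C, C, rng)

# Perturb a single feature of frame 0 and watch the last frame's output.
x_perturbed = [row[:] for row in x]
x_perturbed[0][0] += 1.0

# Channel mixing alone: the last frame never sees frame 0.
channel_only_changed = channel_mix(x, w_channel)[-1] != channel_mix(x_perturbed, w_channel)[-1]
# Token mixing alone: the last frame is a weighted sum that includes frame 0.
token_only_changed = token_mix(x, w_token)[-1] != token_mix(x_perturbed, w_token)[-1]

print("channel mixing alone changes last frame:", channel_only_changed)
print("token mixing alone changes last frame:", token_only_changed)
```

In this sketch, `w_token` plays the role of the cross-frame parameters and `w_channel` the within-frame ones; the real model's layer sizes and ordering may differ.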

ZHAOZHIHAO commented 1 year ago

Thanks for the clarification!