Closed: ande8331 closed this issue 3 years ago
Hi! Thanks for your interest in our work! Could you please explain a little more what you mean by: "After ROI Align is applied, RGB/Flow seems irrelevant as the model appears to be combined out of the 4f layer, but I can't figure out how to work around that either"? If I understand correctly, your problem is how to obtain intermediate features from the Mixed 4f layer, on which RoIAlign should be applied. You could build two different torch.nn.Module()s: one containing the layers up to Mixed 4f, and the other containing the remaining layers. You could then take the output of the first Module, apply RoIAlign to it, and forward the result through the second Module (if necessary). Looking forward to your clarification!
Matteo
Thanks for the quick response Matteo! What you said is what I'm trying to do, but I'm newer to the ML world, so I'm slowly fumbling my way through this... I cloned the I3D git project; I run my data through the 4f layer, save it off, and apply RoIAlign elsewhere. Then I'm trying to reload the data and push it through the rest of the I3D layers, which is where I'm having trouble: the I3D loaders are built around taking two different data inputs and checkpoint files (flow and RGB), but now I'm giving it just one data input and I'm not sure which checkpoint I should tell it to load.
I get the feeling you may have gone about this a different way from your comment?
Could you please give me the link (if any) to the public code you are using for I3D? Yes, I proceeded in a slightly different way: I did not store intermediate features (on which RoIAlign should be computed); instead, I modified the I3D implementation by adding a RoIAlign layer between Mixed 4f and Mixed 5a, and only saved the final features before the classification layer. If the implementation you are using needs both RGB and flow (data and checkpoints), it shouldn't be hard to modify it to RGB only (or flow only, depending on your needs).
Matteo
So these are the files that I've been trying to modify:
https://github.com/deepmind/kinetics-i3d/blob/master/evaluate_sample.py
https://github.com/deepmind/kinetics-i3d/blob/master/i3d.py
Changing it to RGB only shouldn't be an issue. I had been assuming you were using both; it didn't occur to me that there might be more than one implementation of I3D... :)
Oh sorry, I assumed you were using PyTorch! I used the following PyTorch implementation for I3D: https://github.com/piergiaj/pytorch-i3d
I used only RGB input for my models, not flow. Maybe you could do the same if you are just getting started :) Adding a RoIAlign layer after the Mixed 4f layer, with a few small modifications, should do the trick.
Hi, I found this paper very interesting, and I wanted to try it on a different dataset to see how it performs on a problem I'm researching. In the process of loading a different dataset through it, I've been struggling with the I3D preprocessing for at least a few days. I've read through the comments in issue #5 quite a bit and understand the overall idea, but the actual code for how to alter I3D to get the data out, apply RoIAlign, and put the data back through is not really obvious to me; it feels like I'm messing with a black box (specifically, after RoIAlign is applied, RGB/flow seems irrelevant as the model appears to be combined out of the 4f layer, but I can't figure out how to work around that either).
Any chance you still have the modified I3D files around that you could share?