salesforce / paprika

Code for CVPR 2023 paper "Procedure-Aware Pretraining for Instructional Video Understanding"
Apache License 2.0

Could you please provide more explanation on how to pretrain PKG? #6

Open huoxingdawang opened 11 months ago

huoxingdawang commented 11 months ago

Hi, thank you for your very interesting work and for sharing your code! I was wondering if you could provide more information on how to build the PKG. I saw that you explained how to generate the PKG in the appendix of your paper, but I can't find the relevant code in this repo. In the paper, you mention that generating the PKG requires the S3D model. As far as I know, S3D is a very large model that requires GPUs, but the README also says that "The preprocessing stage does not require GPUs." This really confused me. Could you provide more information?

hongluzhou commented 11 months ago

Thank you for your interest! Please refer to https://github.com/salesforce/paprika/blob/cbefd714f3368733b1dc4dc3f2ee1e2ba69f57ed/datasets/build_knowledge/build_knowledge.py#L4 for the code to build the PKG. Specifically:

The subsequent functions prefixed with 'pseudolabel*' pertain to extracting different types of pseudo labels based on the constructed PKG.

To save computation time and avoid using GPUs during the PKG construction process, we utilize the S3D model to extract features in advance and save these features on the disk. Instructions for feature extraction using S3D can be found here: https://github.com/salesforce/paprika#feature-extraction.
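To illustrate the general idea of extracting features once and reusing them from disk, here is a minimal sketch. It is not the code from this repo: `fake_extract_features`, `cache_features`, `load_features`, the `.pkl` file layout, and the feature shape are all hypothetical names and choices made for illustration, and the dummy extractor only stands in for the S3D forward pass so that the example runs without a model or a GPU.

```python
import pickle
import tempfile
from pathlib import Path


def fake_extract_features(video_id, num_segments=4, dim=8):
    """Hypothetical stand-in for the GPU-side S3D forward pass."""
    # Deterministic dummy values so the sketch runs without a model or GPU.
    return [[float((len(video_id) + s + d) % 7) for d in range(dim)]
            for s in range(num_segments)]


def cache_features(video_ids, cache_dir):
    """Stage 1 (run once, ideally on a GPU machine): extract and save features."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    for vid in video_ids:
        out_path = cache_dir / f"{vid}.pkl"
        if out_path.exists():
            continue  # features already on disk, nothing to recompute
        with open(out_path, "wb") as f:
            pickle.dump(fake_extract_features(vid), f)


def load_features(video_id, cache_dir):
    """Stage 2 (CPU only): read cached features; no model is loaded here."""
    with open(Path(cache_dir) / f"{video_id}.pkl", "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache_features(["video_a", "video_b"], tmp)
        feats = load_features("video_a", tmp)
        print(len(feats), len(feats[0]))
```

With this split, only the first stage ever touches the model, so a later step such as graph construction can read the saved features on a machine without GPUs. A real pipeline would likely store NumPy arrays rather than pickled lists; `pickle` is used here only to keep the sketch dependency-free.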