facebookresearch / Ego4d

Ego4d dataset repository. Download the dataset, visualize, extract features & example usage of the dataset
https://ego4d-data.org/docs/
MIT License

associating videos with 3D scans? #210

Open Linusnie opened 1 year ago

Linusnie commented 1 year ago

Hi! I'm looking into using Ego4D for pose/depth estimation, and I've managed to download the 3D scans using ego4d --output_directory="~/ego4d_data" --dataset=3d_scans. How do I find the IDs of the videos that were recorded in those locations?

zdwww commented 1 year ago

I have the exact same issue. None of the UIDs that have scan files can be associated with any video :(

miguelmartin75 commented 1 year ago

Hi all,

You can read <download-dir>/v2/3d_scans/manifest.csv, which gives the fb_physical_setting_id for each scan file. Using <download-dir>/ego4d.json, you can then map these IDs to physical setting names and, through those names, to the videos. It does require a couple of steps, so I've provided some code:

import json
import os
from collections import defaultdict

import pandas as pd

data_dir = "./ego4d_data"  # TODO: CHANGEME

# Load the dataset-level metadata and the 3D scan manifest.
with open(os.path.join(data_dir, "ego4d.json")) as f:
    metadata = json.load(f)
scan_metadata = pd.read_csv(os.path.join(data_dir, "v2/3d_scans/manifest.csv"))

# Physical setting ID -> physical setting name.
phys_id_to_name = {}
for x in metadata["physical_settings"]:
    phys_id_to_name[x["fb_physical_setting_id"]] = x["name"]

# Physical setting name -> videos recorded in that setting.
# Videos without a physical setting are grouped under the key None.
videos_by_physical_name = defaultdict(list)
for x in metadata["videos"]:
    videos_by_physical_name[x["physical_setting_name"]].append(x)

print("Number of videos per physical scenario")
for pname, vs in videos_by_physical_name.items():
    print(pname, len(vs))
print()

# Scan archive name -> videos that share the scan's physical setting.
videos_per_scan = {}
for _, x in scan_metadata.iterrows():
    bn = os.path.basename(x.s3_path)
    phys_id = x.fb_physical_setting_id
    assert phys_id in phys_id_to_name, f"{phys_id}"
    phys_name = phys_id_to_name[phys_id]
    videos_per_scan[bn] = videos_by_physical_name[phys_name]

print("Number of Videos Per 3D Scan:")
for bn, vids in videos_per_scan.items():
    print(bn, len(vids))

I understand this is not ideal, and I apologize for the delays on this issue and for the lack of documentation. The mapping appears to work for me without any issues. Please let me know if there are any issues.

Here is the output of the script:

Number of videos per physical scenario
None 9214
milktea 7
Bike mechanic 85
Crafter 29
card 6
trail 7
Scooter mechanic 67
home 81
groupmeeting 9
Baker 20
restaurant 12
Mechanic 21
hotpot 5
tasession 6
photo 9
boardgame 4
Carpenter 13
gym 4
barbershop 5
forest 5
school 2

Number of Videos Per 3D Scan:
unict_Scooter_mechanic_31.tar 67
unict_Baker_32.tar 20
unict_Carpenter_33.tar 13
unict_Bike_mechanic_34.tar 85
nus_barbershop_35.tar 5
nus_boardgame_36.tar 4
nus_card_37.tar 6
nus_hotpot_38.tar 5
nus_restaurant_39.tar 12
nus_groupmeeting_40.tar 9
nus_gym_41.tar 4
nus_milktea_42.tar 7
nus_photo_43.tar 9
nus_tasession_44.tar 6
unict_Crafter_83.tar 29
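
Since the original question asks for the video IDs rather than counts, the same mapping can be sketched so that it returns a list of video UIDs per scan. This is only a rough sketch: the function name video_uids_per_scan and the toy data are made up for illustration, and it assumes each entry in metadata["videos"] carries a "video_uid" key, which is worth checking against your copy of ego4d.json.

```python
from collections import defaultdict


def video_uids_per_scan(metadata, manifest_rows):
    """Map each scan archive name to the UIDs of videos in the same physical setting.

    metadata: dict shaped like ego4d.json ("physical_settings" and "videos" lists).
    manifest_rows: iterable of dicts with "s3_path" and "fb_physical_setting_id".
    """
    # Physical setting ID -> physical setting name.
    phys_id_to_name = {
        s["fb_physical_setting_id"]: s["name"] for s in metadata["physical_settings"]
    }

    # Physical setting name -> UIDs of the videos recorded there.
    uids_by_setting = defaultdict(list)
    for video in metadata["videos"]:
        name = video["physical_setting_name"]
        if name is not None:  # skip videos that have no physical setting
            uids_by_setting[name].append(video["video_uid"])  # assumed key name

    # Scan archive name -> video UIDs; unknown settings yield an empty list.
    result = {}
    for row in manifest_rows:
        scan_name = row["s3_path"].rsplit("/", 1)[-1]
        setting_name = phys_id_to_name.get(row["fb_physical_setting_id"])
        result[scan_name] = list(uids_by_setting.get(setting_name, []))
    return result


# Toy data for illustration only; these IDs, UIDs and paths are not real.
toy_metadata = {
    "physical_settings": [
        {"fb_physical_setting_id": 41, "name": "gym"},
        {"fb_physical_setting_id": 32, "name": "Baker"},
    ],
    "videos": [
        {"video_uid": "vid-a", "physical_setting_name": "gym"},
        {"video_uid": "vid-b", "physical_setting_name": None},
        {"video_uid": "vid-c", "physical_setting_name": "gym"},
    ],
}
toy_manifest = [
    {"s3_path": "s3://example-bucket/nus_gym_41.tar", "fb_physical_setting_id": 41},
    {"s3_path": "s3://example-bucket/unict_Baker_32.tar", "fb_physical_setting_id": 32},
]

toy_result = video_uids_per_scan(toy_metadata, toy_manifest)
for scan_name, uids in toy_result.items():
    print(scan_name, uids)
```

With the real files, metadata would come from json.load on ego4d.json, and manifest_rows could be produced with pd.read_csv(...).to_dict("records") on manifest.csv.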

@ebyrne is there any easier way to associate the files?