
EgoSchema Dataset Download Repository

EgoSchema: A Diagnostic Benchmark for Very Long-form Video Language Understanding

Karttikeya Mangalam, Raiymbek Akshulakov, Jitendra Malik

Berkeley AI Research, UC Berkeley

:globe_with_meridians: Webpage | :book: Paper | :movie_camera: Teaser Video | :microphone: 4-min Podcast | :speaking_head: Overview Talk Video | :bar_chart: Statistics Dashboard | :crossed_swords: Kaggle


EgoSchema Video

Click for the YouTube teaser video

:dizzy: Dataset Highlights

:exclamation: EgoSchema requires 10x to 100x longer temporal reasoning than almost all other video datasets**.

:exclamation: The largest open-source video-language models with 7B+ parameters achieve a QA accuracy of <33% (random choice is 20%). Humans achieve ~76%.

:exclamation: Even closed-source models with 100B+ parameters trained at web scale achieve <40% accuracy, highlighting the massive latent gap in model capabilities for long-term video understanding.

**Please see the paper for precise operationalizations.

:star: Downloading the Dataset

Option A (Download via Kaggle):

Currently the recommended way to download the dataset.

  1. Visit the Kaggle Public API guide, then install and authenticate the Kaggle CLI.
  2. Visit the EgoSchema competition page, then read and accept the rules in order to download data or make submissions.
  3. Download the videos folder by running the following command: kaggle competitions download -c egoschema-public

Option B (Download via Wasabi):

Fast. Supports Resuming over spotty internet.

  1. Download uid_to_url.json file from EgoSchema Google Drive

This file will be updated weekly as the links expire every 7 days. You will receive a warning if your file becomes outdated.

  2. Run the following:
conda create -n egoschema_download python=3.8 
conda activate egoschema_download
conda install tqdm simplejson requests
pip install moviepy
mkdir videos
python download.py

Note 1: This will retrieve the dataset and store it in the videos directory. Video names correspond to the q_uid key in the questions.json file. If any files fail to download, rerun the download.py script. If problems persist, links to the necessary files will be provided via Google Drive.

Note 2: If the above is too slow, run python download_multiproc.py --p <number of processes> instead for multi-process downloading. Be aware that this method may hit the rate limit when the Wasabi servers are under heavy load. In that case, please revert to download.py.
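
Since video names correspond to the q_uid key in questions.json (Note 1), one way to decide whether a rerun is needed is to list the questions whose video is absent. The following is a minimal sketch, not part of the repository: the helper name `find_missing_videos` is hypothetical, and it assumes that questions.json is a JSON list of objects with a `q_uid` field and that each video is saved as `<q_uid>.mp4`.

```python
import json
from pathlib import Path


def find_missing_videos(questions_path, videos_dir, ext=".mp4"):
    """Return the sorted q_uids that have no matching file in videos_dir."""
    # Assumption: questions.json is a JSON list of objects with a "q_uid" key.
    with open(questions_path, encoding="utf-8") as f:
        questions = json.load(f)
    wanted = {q["q_uid"] for q in questions}
    # Assumption: each video is stored as <q_uid><ext>; ".mp4" is a guess.
    present = {p.stem for p in Path(videos_dir).glob(f"*{ext}")}
    return sorted(wanted - present)
```

Calling `find_missing_videos("questions.json", "videos")` after a download would return an empty list if, under these assumptions, every question has its video.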

Option C (Direct download of the zip from Google Drive):

Simpler. Requires stable internet connection.

  1. Directly download the zipped file from the EgoSchema Google Drive.

Benchmarking on EgoSchema

1. Benchmarking New models:

While we release all the videos and questions from EgoSchema, we release the correct answers for only 500 of the questions. These are provided in the subset_answers.json file and are intended for offline experimentation and performance tracking.
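
For offline tracking on the released subset, accuracy can be computed locally. The sketch below is illustrative only: the function names are hypothetical, and it assumes that subset_answers.json maps each question uid to an integer answer index and that your predictions use the same mapping.

```python
import json


def load_json(path):
    """Read a JSON file into a Python object."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def subset_accuracy(predictions, answers):
    """Fraction of questions in `answers` that `predictions` gets right."""
    # Assumption: both arguments are dicts of the form {q_uid: answer index}.
    # A question missing from `predictions` is counted as wrong.
    correct = sum(1 for uid, truth in answers.items() if predictions.get(uid) == truth)
    return correct / len(answers)
```

A typical call would be `subset_accuracy(load_json("my_predictions.json"), load_json("subset_answers.json"))`, where my_predictions.json is a placeholder name for your own output file.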

:loudspeaker: EgoSchema is intended as a 0-shot evaluation benchmark, hence the complete correct-answer file will not be made public. To evaluate on the entire benchmark, please submit your estimated correct answers as follows:

Option A: Public Kaggle leaderboard. The primary means of submitting results. Submissions go through the Kaggle CLI command `kaggle competitions submit`, which takes the following arguments:

Required arguments:

- `-f FILE_NAME`, `--file FILE_NAME`: File for upload (full path)
- `-m MESSAGE`, `--message MESSAGE`: Message describing this submission

Optional arguments:

- `-h`, `--help`: Show this help message and exit
- `competition`: Competition URL suffix (use `kaggle competitions list` to show options). If empty, the default competition will be used (use `kaggle config set competition`)
- `-q`, `--quiet`: Suppress printing information about the upload/download progress


 **Option B (using our provided wrapper):** 
No leaderboard, just a submission validation.

- **Step 1**: Prepare a JSON file that contains a dictionary structured as `{<question uid>: <correct answer>}`, where `correct_answer` is your model's estimate, given as an `int` from 0 to 4.
- **Step 2**: Run `python validate.py --f <path_to_json_file>` to send the request to the EgoSchema server.
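
Before sending anything, it can help to check that the file matches the format in Step 1. The following is a minimal sketch under that description only. `check_submission` and `write_submission` are hypothetical helpers, not part of `validate.py`, and the server may apply additional checks.

```python
import json


def check_submission(predictions, expected_uids=None):
    """Return a list of format problems; an empty list means none were found."""
    if not isinstance(predictions, dict):
        return ["top-level JSON value must be a dictionary"]
    problems = []
    for uid, answer in predictions.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(answer, bool) or not isinstance(answer, int) or not 0 <= answer <= 4:
            problems.append(f"{uid}: answer must be an integer from 0 to 4, got {answer!r}")
    if expected_uids is not None:
        missing = set(expected_uids) - set(predictions)
        if missing:
            problems.append(f"{len(missing)} question uid(s) have no prediction")
    return problems


def write_submission(predictions, path):
    """Write predictions as JSON, refusing to write a malformed dictionary."""
    problems = check_submission(predictions)
    if problems:
        raise ValueError("; ".join(problems))
    with open(path, "w", encoding="utf-8") as f:
        json.dump(predictions, f)
```

Passing the question uids from questions.json as `expected_uids` additionally flags questions that were left unanswered.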

**Option C (directly using CURL):** 
No leaderboard, just a submission validation.

- `curl -X POST -H "Content-Type: application/json" -d @<path_to_json_file> https://validation-server.onrender.com/api/upload/`
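
For scripted evaluation, roughly the same request can be issued from Python's standard library. This is a sketch mirroring the curl command above, not the repository's own client. The function names and the 60-second timeout are assumptions, and the response is returned as raw text because its exact structure is not specified here.

```python
import urllib.request

# URL copied from the curl command above.
VALIDATION_URL = "https://validation-server.onrender.com/api/upload/"


def build_validation_request(json_path, url=VALIDATION_URL):
    """Build a POST request carrying the JSON file as its body, without sending it."""
    with open(json_path, "rb") as f:
        body = f.read()
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(json_path, timeout=60):
    """Send the request and return the server's response body as text."""
    request = build_validation_request(json_path)
    # Assumption: the server replies with UTF-8 text.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Splitting request construction from sending keeps the first part testable offline; only `submit` touches the network.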

**Returned Payload** will contain the Multiple-Choice Question-Answer accuracy in the following text format: 

- MCQ Accuracy for All of 5031 EgoSchema Questions
- MCQ Accuracy for Publicly Released 500 EgoSchema Answers



:fireworks: *Coming Soon* : A public leaderboard of submitted model rankings on EgoSchema.  

### 2. Reproducing model results from the paper:
Please see `benchmarking/` for a detailed description of each model.