Clarification needed on discrepancy between Figure 3 of the paper and the actual dataset clip durations.

Thank you for sharing the code and data! If I understand Figure 3 (Section 3.2) correctly, it shows more than 50k clips with a duration longer than 180 seconds. However, when I checked 'miradata_v1_330k.csv', I found only about 35k clips exceeding 180 seconds. I'm confused by this discrepancy. Am I misunderstanding Figure 3?

import pandas as pd

# Load the clip metadata
df = pd.read_csv('miradata_v1_330k.csv', encoding='utf-8')
print(len(df))
# prints 330313

# Keep only clips longer than 180 seconds
filtered_df = df[df['seconds'] > 180]
print(len(filtered_df))
# prints 35548
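
In case it helps with comparing against Figure 3 bin by bin, here is a minimal sketch of how the clips could be counted per duration range. The bin edges below are my own assumption and are not taken from the paper, and `count_by_duration_bin` is just a hypothetical helper name. It uses only the standard library and accepts any iterable of durations, such as `df['seconds']`.

```python
from bisect import bisect_left
from collections import Counter

# Hypothetical bin edges in seconds (an assumption, not from the paper).
BIN_EDGES = [30, 60, 120, 180]


def bin_label(i, edges):
    """Return a readable label for bin index i."""
    if i == 0:
        return f"<= {edges[0]}"
    if i == len(edges):
        return f"> {edges[-1]}"
    return f"({edges[i - 1]}, {edges[i]}]"


def count_by_duration_bin(seconds, edges=BIN_EDGES):
    """Count durations per bin; each bin excludes its lower edge and includes its upper edge."""
    # bisect_left puts a value equal to an edge into the bin that ends at that
    # edge, so the last bin holds values strictly greater than the last edge.
    counts = Counter(bisect_left(edges, s) for s in seconds)
    return {bin_label(i, edges): counts.get(i, 0) for i in range(len(edges) + 1)}


if __name__ == "__main__":
    sample = [10, 30, 31, 180, 180.5, 500]
    for label, n in count_by_duration_bin(sample).items():
        print(label, n)
```

Calling `count_by_duration_bin(df['seconds'])` on the CSV should give a "> 180" count equal to the filtered count above, since that bin uses the same strict comparison.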
