Closed msra-jqxu closed 11 months ago
Hi @msra-jqxu,
Thank you for your interest in our work. Some of the video files were corrupted in our case, which could be why the CLIP feature files are missing and the reason for the mismatch. You can try skipping these videos, as we did in our experiments.
The filtering script we use is attached below for your reference. Note that it takes an additional command line argument (i.e. --clip_feature_path). Let me know if it solves the issue or if you have any further questions. Thank you.
import os
import json
import argparse


def parse_args():
    parser = argparse.ArgumentParser(description="Training")
    parser.add_argument("--input_json_file", required=True,
                        help="Path to input json file (i.e. VideoInstruct_Dataset.json)")
    parser.add_argument("--output_json_file", required=True,
                        help="Path to output json file (i.e. VideoInstruct_Dataset_Train.json)")
    parser.add_argument("--clip_feature_path", required=False, default="",
                        help="Path to generated CLIP feature paths to filter any missing video ids (optional).")
    return parser.parse_args()


def main():
    args = parse_args()
    input_json_file = args.input_json_file
    output_json_file = args.output_json_file
    clip_feature_path = args.clip_feature_path

    # Collect feature file names without extension, so annotations whose
    # videos have no CLIP features can be filtered out.
    clip_features_files_without_extension = []
    if clip_feature_path:
        for file in os.listdir(clip_feature_path):
            clip_features_files_without_extension.append(file.split('.')[0])

    with open(input_json_file, 'r') as f:
        input_json_contents = json.load(f)

    output_json_contents = []
    for i, content in enumerate(input_json_contents):
        valid = False
        if not clip_feature_path:
            valid = True
        elif content['video_id'] in clip_features_files_without_extension:
            valid = True
        if valid:
            output_content = {'id': content['video_id'],
                              'video': f"{content['video_id']}.pkl",
                              'conversations': []}
            # This is critical: alternate the <video> token between the
            # end and the start of the question across samples.
            if i % 2 == 0:
                output_content['conversations'].append(
                    {'from': 'human', 'value': f"{content['q']}\n<video>"})
            else:
                output_content['conversations'].append(
                    {'from': 'human', 'value': f"<video>\n{content['q']}"})
            output_content['conversations'].append(
                {'from': 'gpt', 'value': content['a']})
            output_json_contents.append(output_content)

    print(f"Total annotations retained: {len(output_json_contents)}")
    with open(output_json_file, 'w') as f:
        json.dump(output_json_contents, f)


if __name__ == "__main__":
    main()
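The core of the script is the filtering step: an annotation is kept only if its video_id matches the stem of some file in the CLIP feature directory. A minimal, self-contained sketch of that idea (the feature file names and annotation entries below are made up for illustration):

```python
import os
import tempfile

# Hypothetical feature directory containing two .pkl feature files.
feature_dir = tempfile.mkdtemp()
for name in ("v_abc123.pkl", "v_def456.pkl"):
    open(os.path.join(feature_dir, name), "w").close()

# Hypothetical annotations; the second video has no feature file.
annotations = [
    {"video_id": "v_abc123", "q": "What happens?", "a": "A demo answer."},
    {"video_id": "v_missing", "q": "And here?", "a": "Another answer."},
]

# Build the set of available video ids from the file names (stem only),
# then keep only annotations whose video_id is in that set.
available = {os.path.splitext(f)[0] for f in os.listdir(feature_dir)}
kept = [c for c in annotations if c["video_id"] in available]
print([c["video_id"] for c in kept])  # → ['v_abc123']
```

Using a set rather than a list for the lookup makes the membership test O(1) per annotation, which matters when the dataset has many thousands of entries.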
Hi, @mmaaz60 , The filtering script really works for me and now I can train the model successfully! Thanks very much!
By the way, I specify the parameter --model_name_or_path
Hi @msra-jqxu,
This is normal. Thank you.
Thanks again! I will close this issue as completed.
Hi, I found that "v_6Ke30NtYOC0.pkl" appears as training data in the file "video_chatgpt_training.json" (obtained from "scripts/convert_instruction_json_to_training_format.py"), but it is not among the downloaded pre-computed spatiotemporal CLIP features (link). How can I fix this problem?
Thanks!