Closed skymanaditya1 closed 2 years ago
Hi! This is not expected and is very strange. Could you tell me what your dataset structure looks like? Are you running from a zip archive or a directory? And could you tell me which git hash you are currently using? (Today, I've updated it to e2f9580.)
Ah, ok, I think I understand the issue. It looks like all your videos are too short (fewer than 128 frames), and that's why the dataset class discards them when computing fvd2048_128f, which is FVD computed on 128-frame-long videos. What you should do is not use the fvd2048_128f and fvd2048_128f_subsample8f metrics. You can do this by removing them from the list here.
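For anyone who wants to check this before launching a run, here is a rough sketch (not part of the repository) that counts the frames in each video folder and reports which FVD metrics have at least one long enough video. It assumes the dataset is a directory with one sub-folder of frame images per video. The helper names and the `METRIC_MIN_FRAMES` table are hypothetical, and the 128-frame requirement for fvd2048_128f_subsample8f is an assumption based on its name and the advice above.

```python
from pathlib import Path

# Assumption: frames are stored as ordinary image files.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}

# Assumed minimum clip length per metric (hypothetical table; the
# 128-frame value for the subsample8f variant is inferred, not confirmed).
METRIC_MIN_FRAMES = {
    "fvd2048_16f": 16,
    "fvd2048_128f": 128,
    "fvd2048_128f_subsample8f": 128,
}


def count_frames_per_video(root):
    """Map each video sub-directory name to its number of frame images."""
    counts = {}
    for video_dir in sorted(Path(root).iterdir()):
        if video_dir.is_dir():
            counts[video_dir.name] = sum(
                1 for f in video_dir.iterdir() if f.suffix.lower() in IMAGE_EXTS
            )
    return counts


def usable_metrics(frame_counts, metric_min_frames=METRIC_MIN_FRAMES):
    """Keep only the metrics for which at least one video is long enough."""
    longest = max(frame_counts.values(), default=0)
    return [name for name, n in metric_min_frames.items() if longest >= n]
```

Calling `usable_metrics(count_frames_per_video("/path/to/dataset"))` then gives the metric names that are safe to keep in the list; anything missing from the result would hit the "No videos found" error.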
Oh yes, that was so silly of me! I had been suspecting that I didn't have enough frames in my videos (which I wrote on the other thread as well). Thank you for the quick reply and resolution. Marking the issue closed.
I get the following IOError -- 'No videos found in the specified archive' when the metric evaluation step is called. The 0th tick runs just fine, after which the error is thrown.
It's coming from the result_dict = metric_main.calc_metric( line in training_loop.py. From what I have seen, it is failing to initialize the VideoFramesFolderDataset properly, which throws the error 'No videos found in the specified archive'. Here is the stack trace --
Evaluating metrics for how2sign_faces_styleganv_resized_stylegan-v_random3_max32_how2sign_exp_styleganv_resized-e2f9580 ...
{"results": {"fvd2048_16f": 5903.660982385764}, "metric": "fvd2048_16f", "total_time": 51.93783378601074, "total_time_str": "52s", "num_gpus": 4, "snapshot_pkl": "network-snapshot-000000.pkl", "timestamp": 1652628736.9207375}
Traceback (most recent call last):
File "/ssd_scratch/cvit/ravi/stylegan-v/experiments/how2sign_faces_styleganv_resized_stylegan-v_random3_max32_how2sign_exp_styleganv_resized-e2f9580/src/train.py", line 451, in <module>
main() # pylint: disable=no-value-for-parameter
File "/ssd_scratch/cvit/ravi/stylegan-v/experiments/how2sign_faces_styleganv_resized_stylegan-v_random3_max32_how2sign_exp_styleganv_resized-e2f9580/src/train.py", line 446, in main
torch.multiprocessing.spawn(fn=subprocess_fn, args=(args, temp_dir), nprocs=args.num_gpus)
File "/ssd_scratch/cvit/ravi/stylegan-v/env/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 199, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/ssd_scratch/cvit/ravi/stylegan-v/env/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 157, in start_processes
while not context.join():
File "/ssd_scratch/cvit/ravi/stylegan-v/env/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 118, in join
raise Exception(msg)
Exception:
-- Process 2 terminated with the following error:
Traceback (most recent call last):
File "/ssd_scratch/cvit/ravi/stylegan-v/env/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "/ssd_scratch/cvit/ravi/stylegan-v/experiments/how2sign_faces_styleganv_resized_stylegan-v_random3_max32_how2sign_exp_styleganv_resized-e2f9580/src/train.py", line 375, in subprocess_fn
training_loop.training_loop(rank=rank, **args)
File "/ssd_scratch/cvit/ravi/stylegan-v/experiments/how2sign_faces_styleganv_resized_stylegan-v_random3_max32_how2sign_exp_styleganv_resized-e2f9580/src/training/training_loop.py", line 508, in training_loop
result_dict = metric_main.calc_metric(
File "/home2/ravi_mishra/nps/stylegan-v/src/metrics/metric_main.py", line 49, in calc_metric
all_runs_results = [_metric_dict[metric](opts) for _ in range(num_runs)]
File "/home2/ravi_mishra/nps/stylegan-v/src/metrics/metric_main.py", line 49, in <listcomp>
all_runs_results = [_metric_dict[metric](opts) for _ in range(num_runs)]
File "/home2/ravi_mishra/nps/stylegan-v/src/metrics/metric_main.py", line 123, in fvd2048_128f
fvd = frechet_video_distance.compute_fvd(opts, max_real=2048, num_gen=2048, num_frames=128)
File "/home2/ravi_mishra/nps/stylegan-v/src/metrics/frechet_video_distance.py", line 31, in compute_fvd
mu_real, sigma_real = metric_utils.compute_feature_stats_for_dataset(
File "/ssd_scratch/cvit/ravi/stylegan-v/env/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
return func(*args, **kwargs)
File "/home2/ravi_mishra/nps/stylegan-v/src/metrics/metric_utils.py", line 195, in compute_feature_stats_for_dataset
dataset = dnnlib.util.construct_class_by_name(**dataset_kwargs)
File "/home2/ravi_mishra/nps/stylegan-v/src/dnnlib/util.py", line 292, in construct_class_by_name
return call_func_by_name(*args, func_name=class_name, **kwargs)
File "/home2/ravi_mishra/nps/stylegan-v/src/dnnlib/util.py", line 287, in call_func_by_name
return func_obj(*args, **kwargs)
File "/ssd_scratch/cvit/ravi/stylegan-v/experiments/how2sign_faces_styleganv_resized_stylegan-v_random3_max32_how2sign_exp_styleganv_resized-e2f9580/src/training/dataset.py", line 329, in __init__
raise IOError('No videos found in the specified archive')
OSError: No videos found in the specified archive
For now, I have commented out the offending lines in training_loop.py and it has started to train. I commented out these lines --
Is this expected or am I doing something wrong?