Open CleyLyChen opened 9 months ago
Hi,
gt_mask is used for training the separation branch. There are two common ways to set gt_masks: (1) a binary mask, where each entry indicates whether a time-frequency bin is dominated by one of the sounds in the mixture, e.g. gt_masks = (spec_clean > 0.5 * spec).float(). The values should not all be one unless spec_clean and spec are the same, or the added sound is much quieter than the original one. You can also try different loss terms for penalizing, such as a binary loss or an L1/L2 loss; (2) a ratio mask, which is also commonly used, of the form gt_masks = spec_clean / spec. The same loss terms apply.
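The two mask definitions above can be sketched as follows. This is only an illustration in NumPy, not the repository's code: the function names, the eps guard, the max_ratio clipping, and the random "spectrograms" are assumptions made for the example.

```python
import numpy as np

def binary_mask(spec_clean, spec_mix):
    # 1.0 where the clean source holds more than half of the mixture magnitude.
    return (spec_clean > 0.5 * spec_mix).astype(np.float32)

def ratio_mask(spec_clean, spec_mix, eps=1e-8, max_ratio=5.0):
    # eps and max_ratio are assumptions added here to avoid division by zero
    # and unbounded targets; the thread only gives spec_clean / spec.
    return np.clip(spec_clean / (spec_mix + eps), 0.0, max_ratio)

# Synthetic magnitude "spectrograms" (frequency x time), strictly positive.
rng = np.random.default_rng(0)
spec_a = rng.random((4, 6)).astype(np.float32) + 0.1
spec_b = rng.random((4, 6)).astype(np.float32) + 0.1
# Simplification: treat the mixture magnitude as the sum of the source magnitudes.
spec_mix = spec_a + spec_b

mask_bin_a = binary_mask(spec_a, spec_mix)
mask_bin_b = binary_mask(spec_b, spec_mix)
mask_ratio_a = ratio_mask(spec_a, spec_mix)

# If no second sound is mixed in, spec_clean equals spec and the mask is all ones.
mask_degenerate = binary_mask(spec_mix, spec_mix)
print(mask_bin_a.mean(), mask_degenerate.mean())
```

If the mean of your own gt_masks is close to 1.0 for every batch, that suggests the mixing step is missing or the added sound is too quiet.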
Please check whether you mixed in another sound at the beginning. If you don't need the separation branch, consider removing the loss term.
Hi, I removed the separation branch and its loss term and trained the model on Epic-Kitchens-100, but CiOU@0.2 is only 15.2, CiOU@0.3 is 7, and CiOU@0.4 is 1.8. I don't know why. Can you share more details?
I uploaded a data filtering script to help remove silent video clips from the training set. Because the training set is built from the action recognition benchmark, it contains many silent videos, which have a negative impact on training. We tried a simple way to remove some of them (please refer to the code), but there may be better ways to explore.
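For readers without the script at hand, one simple way to flag silent clips is an RMS energy threshold. The sketch below is not the uploaded script: the function names, the -50 dB threshold, and the synthetic waveforms are all assumptions chosen for illustration.

```python
import numpy as np

def rms_db(waveform, eps=1e-10):
    # Root-mean-square level of a waveform in decibels (relative to amplitude 1.0).
    samples = np.asarray(waveform, dtype=np.float64)
    rms = np.sqrt(np.mean(samples ** 2))
    return 20.0 * np.log10(rms + eps)

def filter_silent(clips, threshold_db=-50.0):
    # Keep the names of clips whose RMS level is above the (assumed) threshold.
    return [name for name, wav in clips.items() if rms_db(wav) > threshold_db]

# Synthetic one-second clips at 16 kHz standing in for real audio tracks.
t = np.arange(16000) / 16000.0
tone = np.sin(2 * np.pi * 440.0 * t)
clips = {
    "loud_tone": 0.1 * tone,     # about -23 dB
    "silent": np.zeros_like(t),  # digital silence
    "faint_hum": 1e-4 * tone,    # about -83 dB
}

kept = filter_silent(clips)
print(kept)
```

The threshold would need tuning on real data; a value that is too aggressive may also drop clips with quiet but meaningful sounds.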
Early stopping might also be a useful trick, as the model is initialized from a pre-trained vision network. You can check results from early epochs/steps to decide whether training is on the right track.
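A minimal sketch of patience-based early stopping on a validation metric is shown below. The function name, the patience value, and the scores are made-up placeholders for illustration, not results from this model.

```python
def select_checkpoint(val_scores, patience=2, min_delta=0.0):
    # Scan per-epoch validation scores (higher is better) and stop once the
    # score has not improved for `patience` consecutive evaluations.
    best_score = float("-inf")
    best_epoch = 0
    stopped_at = 0
    stale = 0
    for epoch, score in enumerate(val_scores, start=1):
        stopped_at = epoch
        if score > best_score + min_delta:
            best_score, best_epoch, stale = score, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_epoch, best_score, stopped_at

# Placeholder validation scores, one per epoch (not real measurements).
scores = [10.0, 14.0, 15.0, 13.0, 12.0, 16.0]
best_epoch, best_score, stopped_at = select_checkpoint(scores, patience=2)
print(best_epoch, best_score, stopped_at)
```

In a real training loop the same logic would run after each evaluation, saving a checkpoint whenever the score improves.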
Hi, I still have some questions about training.
Epoch: [1][0/409], Time: 9.45, Data: 5.52, lr_sound: 0.001, lr_frame: 0.0001, loss: 4.1758
Epoch: [1][20/409], Time: 3.75, Data: 0.32, lr_sound: 0.001, lr_frame: 0.0001, loss: 3.4631
Epoch: [1][40/409], Time: 2.91, Data: 0.19, lr_sound: 0.001, lr_frame: 0.0001, loss: 2.5754
Epoch: [1][60/409], Time: 2.83, Data: 0.15, lr_sound: 0.001, lr_frame: 0.0001, loss: 2.0339
Epoch: [1][80/409], Time: 3.04, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.8445
Epoch: [1][100/409], Time: 3.16, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.6860
Epoch: [1][0/588], Time: 11.46, Data: 7.43, lr_sound: 0.001, lr_frame: 0.0001, loss: 4.1810
Epoch: [1][20/588], Time: 3.89, Data: 0.41, lr_sound: 0.001, lr_frame: 0.0001, loss: 3.4935
Epoch: [1][40/588], Time: 3.69, Data: 0.24, lr_sound: 0.001, lr_frame: 0.0001, loss: 2.5676
Epoch: [1][60/588], Time: 3.63, Data: 0.18, lr_sound: 0.001, lr_frame: 0.0001, loss: 2.2132
Epoch: [1][80/588], Time: 3.60, Data: 0.15, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.7622
Epoch: [1][100/588], Time: 3.57, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.6586
Epoch: [1][120/588], Time: 3.56, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.4217
Epoch: [1][140/588], Time: 3.55, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.4287
Epoch: [1][160/588], Time: 3.55, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.3390
Epoch: [1][180/588], Time: 3.55, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.1895
Epoch: [1][200/588], Time: 3.54, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.1199
Epoch: [1][220/588], Time: 3.53, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.8974
Epoch: [1][240/588], Time: 3.53, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.0010
Epoch: [1][260/588], Time: 3.52, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.0835
Epoch: [1][280/588], Time: 3.52, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.1133
Epoch: [1][300/588], Time: 3.53, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.9576
Epoch: [1][320/588], Time: 3.52, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.8214
Epoch: [1][340/588], Time: 3.52, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.9479
Epoch: [1][360/588], Time: 3.52, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.8073
Epoch: [1][380/588], Time: 3.52, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7141
Epoch: [1][400/588], Time: 3.52, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.8135
Epoch: [1][420/588], Time: 3.52, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7217
Epoch: [1][440/588], Time: 3.52, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6688
Epoch: [1][460/588], Time: 3.52, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6411
Epoch: [1][480/588], Time: 3.52, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6243
Epoch: [1][500/588], Time: 3.52, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6810
Epoch: [1][520/588], Time: 3.52, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5658
Epoch: [1][540/588], Time: 3.52, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7880
Epoch: [1][560/588], Time: 3.52, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5949
Epoch: [1][580/588], Time: 3.51, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7155
Epoch: [1][0/588], Time: 11.39, Data: 5.37, lr_sound: 0.001, lr_frame: 0.0001, loss: 4.1810
Epoch: [1][20/588], Time: 4.09, Data: 0.31, lr_sound: 0.001, lr_frame: 0.0001, loss: 3.4920
Epoch: [1][40/588], Time: 3.97, Data: 0.19, lr_sound: 0.001, lr_frame: 0.0001, loss: 2.5649
Epoch: [1][60/588], Time: 3.88, Data: 0.15, lr_sound: 0.001, lr_frame: 0.0001, loss: 2.1807
Epoch: [1][80/588], Time: 3.83, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.7942
Epoch: [1][100/588], Time: 3.79, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.5984
Epoch: [1][120/588], Time: 3.81, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.5884
Epoch: [1][140/588], Time: 3.79, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.4185
Epoch: [1][160/588], Time: 3.78, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.2574
Epoch: [1][180/588], Time: 3.77, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.3440
Epoch: [1][200/588], Time: 3.77, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.2705
Epoch: [1][220/588], Time: 3.76, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.0430
Epoch: [1][240/588], Time: 3.75, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.1693
Epoch: [1][260/588], Time: 3.75, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.9548
Epoch: [1][280/588], Time: 3.76, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.8652
Epoch: [1][300/588], Time: 3.75, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.9799
Epoch: [1][320/588], Time: 3.75, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.9064
Epoch: [1][340/588], Time: 3.76, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.9438
Epoch: [1][360/588], Time: 3.76, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.8222
Epoch: [1][380/588], Time: 3.76, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7039
Epoch: [1][400/588], Time: 3.77, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6734
Epoch: [1][420/588], Time: 3.77, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7620
Epoch: [1][440/588], Time: 3.77, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6265
Epoch: [1][460/588], Time: 3.77, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7539
Epoch: [1][480/588], Time: 3.77, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6029
Epoch: [1][500/588], Time: 3.77, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6748
Epoch: [1][520/588], Time: 3.77, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5724
Epoch: [1][540/588], Time: 3.78, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5763
Epoch: [1][560/588], Time: 3.77, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5508
Epoch: [1][580/588], Time: 3.77, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6539
Epoch: [2][0/588], Time: 10.45, Data: 6.30, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5689
Epoch: [2][20/588], Time: 4.00, Data: 0.36, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5990
Epoch: [2][40/588], Time: 3.88, Data: 0.21, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5450
Epoch: [2][60/588], Time: 3.83, Data: 0.16, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5504
Epoch: [2][80/588], Time: 3.79, Data: 0.14, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5666
Epoch: [2][100/588], Time: 3.77, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6176
Epoch: [2][120/588], Time: 3.76, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6018
Epoch: [2][140/588], Time: 3.75, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5471
Epoch: [2][160/588], Time: 3.74, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5686
Epoch: [2][180/588], Time: 3.74, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6198
Epoch: [2][200/588], Time: 3.74, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.0275
Epoch: [2][220/588], Time: 3.74, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7355
Epoch: [2][240/588], Time: 3.74, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7617
Epoch: [2][260/588], Time: 3.73, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.9432
Epoch: [2][280/588], Time: 3.73, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6631
Epoch: [2][300/588], Time: 3.73, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6630
Epoch: [2][320/588], Time: 3.73, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7054
Epoch: [2][340/588], Time: 3.72, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5094
Epoch: [2][360/588], Time: 3.73, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6511
Epoch: [2][380/588], Time: 3.73, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7991
Epoch: [2][400/588], Time: 3.73, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5536
Epoch: [2][420/588], Time: 3.73, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4725
Epoch: [2][440/588], Time: 3.73, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5614
Epoch: [2][460/588], Time: 3.73, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5916
Epoch: [2][480/588], Time: 3.73, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4753
Epoch: [2][500/588], Time: 3.73, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5111
Epoch: [2][520/588], Time: 3.74, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5404
Epoch: [2][540/588], Time: 3.74, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5138
Epoch: [2][560/588], Time: 3.74, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5061
Epoch: [2][580/588], Time: 3.74, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4305
Epoch: [3][0/588], Time: 12.32, Data: 7.20, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5470
Epoch: [3][20/588], Time: 4.15, Data: 0.40, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5444
Epoch: [3][40/588], Time: 3.92, Data: 0.24, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4655
Epoch: [3][60/588], Time: 3.83, Data: 0.18, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6225
Epoch: [3][80/588], Time: 3.36, Data: 0.15, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4974
Epoch: [3][100/588], Time: 3.08, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7689
Epoch: [3][120/588], Time: 3.17, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5072
Epoch: [3][140/588], Time: 3.25, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4875
Epoch: [3][160/588], Time: 3.30, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5146
Epoch: [3][180/588], Time: 3.34, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5556
Epoch: [3][200/588], Time: 3.36, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4513
Epoch: [3][220/588], Time: 3.39, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4138
Epoch: [3][240/588], Time: 3.42, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6241
Epoch: [3][260/588], Time: 3.43, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5607
Epoch: [3][280/588], Time: 3.46, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6141
Epoch: [3][300/588], Time: 3.47, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5482
Epoch: [3][320/588], Time: 3.48, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4780
Epoch: [3][340/588], Time: 3.50, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4840
Epoch: [3][360/588], Time: 3.51, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4358
Epoch: [3][380/588], Time: 3.52, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5462
Epoch: [3][400/588], Time: 3.52, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5667
Epoch: [3][420/588], Time: 3.53, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6208
Epoch: [3][440/588], Time: 3.53, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5835
Epoch: [3][460/588], Time: 3.54, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4604
Epoch: [3][480/588], Time: 3.55, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4659
Epoch: [3][500/588], Time: 3.55, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4573
Epoch: [3][520/588], Time: 3.55, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6221
Epoch: [3][540/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4710
Epoch: [3][560/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4532
Epoch: [3][580/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4506
Epoch: [4][0/588], Time: 10.55, Data: 6.33, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4392
Epoch: [4][20/588], Time: 3.96, Data: 0.36, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4195
Epoch: [4][40/588], Time: 3.80, Data: 0.22, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4902
Epoch: [1][0/588], Time: 13.00, Data: 7.71, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4899
Epoch: [1][20/588], Time: 4.06, Data: 0.43, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5651
Epoch: [1][40/588], Time: 3.80, Data: 0.25, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5583
Epoch: [1][60/588], Time: 3.71, Data: 0.19, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4681
Epoch: [1][80/588], Time: 3.68, Data: 0.16, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4563
Epoch: [1][100/588], Time: 3.67, Data: 0.15, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4984
Epoch: [1][120/588], Time: 3.65, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5317
Epoch: [1][140/588], Time: 3.64, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5651
Epoch: [1][160/588], Time: 3.62, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4764
Epoch: [1][180/588], Time: 3.62, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6429
Epoch: [1][200/588], Time: 3.61, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5813
Epoch: [1][220/588], Time: 3.61, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4645
Epoch: [1][240/588], Time: 3.61, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5310
Epoch: [1][260/588], Time: 3.60, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4372
Epoch: [1][280/588], Time: 3.60, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4687
Epoch: [1][300/588], Time: 3.60, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4925
Epoch: [1][320/588], Time: 3.59, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4976
Epoch: [1][340/588], Time: 3.59, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4829
Epoch: [1][360/588], Time: 3.59, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5029
Epoch: [1][380/588], Time: 3.59, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4449
Epoch: [1][400/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5103
Epoch: [1][420/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5083
Epoch: [1][440/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5028
Epoch: [1][460/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4942
Epoch: [1][480/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5026
Epoch: [1][500/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4861
Epoch: [1][520/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3890
Epoch: [1][540/588], Time: 3.61, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6259
Epoch: [1][560/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4007
Epoch: [1][580/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4087
Epoch: [2][0/588], Time: 10.37, Data: 6.14, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3901
Epoch: [2][20/588], Time: 3.90, Data: 0.35, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5464
Epoch: [2][40/588], Time: 3.75, Data: 0.21, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4392
Epoch: [2][60/588], Time: 3.70, Data: 0.16, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4178
Epoch: [2][80/588], Time: 3.68, Data: 0.14, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4418
Epoch: [2][100/588], Time: 3.65, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4693
Epoch: [2][120/588], Time: 3.63, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4793
Epoch: [2][140/588], Time: 3.63, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5529
Epoch: [2][160/588], Time: 3.62, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4476
Epoch: [2][180/588], Time: 3.62, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5183
Epoch: [2][200/588], Time: 3.61, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3991
Epoch: [2][220/588], Time: 3.61, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5588
Epoch: [2][240/588], Time: 3.61, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5231
Epoch: [2][260/588], Time: 3.61, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4619
Epoch: [2][280/588], Time: 3.61, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4306
Epoch: [2][300/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4238
Epoch: [2][320/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4297
Epoch: [2][340/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4767
Epoch: [2][360/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3697
Epoch: [2][380/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6044
Epoch: [2][400/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4399
Epoch: [2][420/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4914
Epoch: [2][440/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4851
Epoch: [2][460/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5587
Epoch: [2][480/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5352
Epoch: [2][500/588], Time: 3.52, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4812
Epoch: [2][520/588], Time: 3.48, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5102
Epoch: [2][540/588], Time: 3.48, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4389
Epoch: [2][560/588], Time: 3.49, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5080
Epoch: [2][580/588], Time: 3.49, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5662
Epoch: [3][0/588], Time: 11.09, Data: 5.81, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.9355
Epoch: [3][20/588], Time: 4.02, Data: 0.34, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5677
Epoch: [3][40/588], Time: 3.82, Data: 0.20, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7611
Epoch: [3][60/588], Time: 3.73, Data: 0.16, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7395
Epoch: [3][80/588], Time: 3.70, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7692
Epoch: [3][100/588], Time: 3.68, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5468
Epoch: [3][120/588], Time: 3.66, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5703
Epoch: [3][140/588], Time: 3.66, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7253
Epoch: [3][160/588], Time: 3.64, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6027
Epoch: [3][180/588], Time: 3.64, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5449
Epoch: [3][200/588], Time: 3.63, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6291
Epoch: [3][220/588], Time: 3.62, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4519
Epoch: [3][240/588], Time: 3.62, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6092
Epoch: [3][260/588], Time: 3.62, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5218
Epoch: [3][280/588], Time: 3.62, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4292
Epoch: [3][300/588], Time: 3.61, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5298
Epoch: [3][320/588], Time: 3.61, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4253
Epoch: [3][340/588], Time: 3.61, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5528
Epoch: [3][360/588], Time: 3.61, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5641
Epoch: [3][380/588], Time: 3.61, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5031
Epoch: [3][400/588], Time: 3.61, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4240
Epoch: [3][420/588], Time: 3.61, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6612
Epoch: [3][440/588], Time: 3.61, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5816
Epoch: [3][460/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5122
Epoch: [3][480/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4329
Epoch: [3][500/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5064
Epoch: [3][520/588], Time: 3.60, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5358
Epoch: [3][540/588], Time: 3.60, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5695
Epoch: [3][560/588], Time: 3.60, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4869
Epoch: [3][580/588], Time: 3.60, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3943
Epoch: [4][0/588], Time: 11.69, Data: 7.50, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5344
Epoch: [4][20/588], Time: 4.11, Data: 0.47, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4860
Epoch: [4][40/588], Time: 3.85, Data: 0.27, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4284
Epoch: [4][60/588], Time: 3.77, Data: 0.20, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4747
Epoch: [4][80/588], Time: 3.72, Data: 0.17, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6278
Epoch: [4][100/588], Time: 3.69, Data: 0.15, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4910
Epoch: [4][120/588], Time: 3.69, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4600
Epoch: [4][140/588], Time: 3.70, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.9107
Epoch: [4][160/588], Time: 3.68, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6278
Epoch: [4][180/588], Time: 3.67, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6042
Epoch: [4][200/588], Time: 3.66, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5486
Epoch: [4][220/588], Time: 3.66, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4943
Epoch: [4][240/588], Time: 3.65, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4674
Epoch: [4][260/588], Time: 3.64, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4376
Epoch: [4][280/588], Time: 3.64, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5288
Epoch: [4][300/588], Time: 3.63, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5231
Epoch: [4][320/588], Time: 3.63, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.8455
Epoch: [4][340/588], Time: 3.62, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.9007
Epoch: [4][360/588], Time: 3.61, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 1.3321
Epoch: [4][380/588], Time: 3.61, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7563
Epoch: [4][400/588], Time: 3.61, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7966
Epoch: [4][420/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5455
Epoch: [4][440/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7893
Epoch: [4][460/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.8344
Epoch: [4][480/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6783
Epoch: [4][500/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6623
Epoch: [4][520/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5764
Epoch: [4][540/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4946
Epoch: [4][560/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7310
Epoch: [4][580/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5261
Epoch: [5][0/588], Time: 12.67, Data: 8.52, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5352
Epoch: [5][20/588], Time: 3.98, Data: 0.47, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5590
Epoch: [5][40/588], Time: 3.83, Data: 0.27, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4754
Epoch: [5][60/588], Time: 3.77, Data: 0.20, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7409
Epoch: [5][80/588], Time: 3.72, Data: 0.17, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6370
Epoch: [5][100/588], Time: 3.68, Data: 0.15, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7033
Epoch: [5][120/588], Time: 3.65, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7528
Epoch: [5][140/588], Time: 3.64, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5645
Epoch: [5][160/588], Time: 3.63, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5184
Epoch: [5][180/588], Time: 3.62, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4757
Epoch: [5][200/588], Time: 3.61, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4939
Epoch: [5][220/588], Time: 3.61, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5752
Epoch: [5][240/588], Time: 3.60, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6898
Epoch: [5][260/588], Time: 3.60, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7349
Epoch: [5][280/588], Time: 3.59, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5455
Epoch: [5][300/588], Time: 3.58, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3914
Epoch: [5][320/588], Time: 3.58, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4838
Epoch: [5][340/588], Time: 3.58, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5296
Epoch: [5][360/588], Time: 3.58, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4718
Epoch: [5][380/588], Time: 3.56, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4364
Epoch: [5][400/588], Time: 3.50, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4866
Epoch: [5][420/588], Time: 3.45, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4586
Epoch: [5][440/588], Time: 3.46, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4475
Epoch: [5][460/588], Time: 3.46, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4107
Epoch: [5][480/588], Time: 3.47, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4221
Epoch: [5][500/588], Time: 3.47, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4310
Epoch: [5][520/588], Time: 3.48, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4239
Epoch: [5][540/588], Time: 3.48, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4278
Epoch: [5][560/588], Time: 3.48, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3951
Epoch: [5][580/588], Time: 3.48, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4610
Epoch: [6][0/588], Time: 11.59, Data: 6.73, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5101
Epoch: [6][20/588], Time: 3.92, Data: 0.38, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3706
Epoch: [6][40/588], Time: 3.75, Data: 0.23, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4121
Epoch: [6][60/588], Time: 3.68, Data: 0.17, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4006
Epoch: [6][80/588], Time: 3.65, Data: 0.15, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4331
Epoch: [6][100/588], Time: 3.63, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3715
Epoch: [6][120/588], Time: 3.67, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4309
Epoch: [6][140/588], Time: 3.65, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4765
Epoch: [6][160/588], Time: 3.64, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4571
Epoch: [6][180/588], Time: 3.63, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4732
Epoch: [6][200/588], Time: 3.63, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4529
Epoch: [6][220/588], Time: 3.62, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4397
Epoch: [6][240/588], Time: 3.62, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4313
Epoch: [6][260/588], Time: 3.62, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4684
Epoch: [6][280/588], Time: 3.61, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4346
Epoch: [6][300/588], Time: 3.61, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4327
Epoch: [6][320/588], Time: 3.61, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3990
Epoch: [6][340/588], Time: 3.61, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4235
Epoch: [6][360/588], Time: 3.62, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6646
Epoch: [6][380/588], Time: 3.63, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4443
Epoch: [6][400/588], Time: 3.65, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4655
Epoch: [6][420/588], Time: 3.66, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4172
Epoch: [6][440/588], Time: 3.66, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4796
Epoch: [6][460/588], Time: 3.67, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4421
Epoch: [6][480/588], Time: 3.67, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4276
Epoch: [6][500/588], Time: 3.69, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5559
Epoch: [6][520/588], Time: 3.69, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4342
Epoch: [6][540/588], Time: 3.69, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4225
Epoch: [6][560/588], Time: 3.70, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4076
Epoch: [6][580/588], Time: 3.70, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4301
Epoch: [7][0/588], Time: 11.20, Data: 6.86, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4106
Epoch: [7][20/588], Time: 3.93, Data: 0.39, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3630
Epoch: [7][40/588], Time: 3.75, Data: 0.23, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5152
Epoch: [7][60/588], Time: 3.69, Data: 0.17, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4418
Epoch: [7][80/588], Time: 3.66, Data: 0.15, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4756
Epoch: [7][100/588], Time: 3.64, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4041
Epoch: [7][120/588], Time: 3.63, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4156
Epoch: [7][140/588], Time: 3.62, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5060
Epoch: [7][160/588], Time: 3.62, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4319
Epoch: [7][180/588], Time: 3.61, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3917
Epoch: [7][200/588], Time: 3.61, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5006
Epoch: [7][220/588], Time: 3.60, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4999
Epoch: [7][240/588], Time: 3.60, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4131
Epoch: [7][260/588], Time: 3.59, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4138
Epoch: [7][280/588], Time: 3.59, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4626
Epoch: [7][300/588], Time: 3.58, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4632
Epoch: [7][320/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4477
Epoch: [7][340/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4730
Epoch: [7][360/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4873
Epoch: [7][380/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5394
Epoch: [7][400/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4647
Epoch: [7][420/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6635
Epoch: [7][440/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6654
Epoch: [7][460/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4503
Epoch: [7][480/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3930
Epoch: [7][500/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4378
Epoch: [7][520/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4890
Epoch: [7][540/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4594
Epoch: [7][560/588], Time: 3.57, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3856
Epoch: [7][580/588], Time: 3.57, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3852
Epoch: [8][0/588], Time: 11.22, Data: 6.87, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3853
Epoch: [8][20/588], Time: 3.87, Data: 0.39, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4320
Epoch: [8][40/588], Time: 3.73, Data: 0.23, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3722
Epoch: [8][60/588], Time: 3.67, Data: 0.17, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4052
Epoch: [8][80/588], Time: 3.65, Data: 0.15, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4073
Epoch: [8][100/588], Time: 3.63, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5277
Epoch: [8][120/588], Time: 3.62, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5635
Epoch: [8][140/588], Time: 3.61, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4964
Epoch: [8][160/588], Time: 3.60, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4690
Epoch: [8][180/588], Time: 3.59, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4031
Epoch: [8][200/588], Time: 3.59, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5540
Epoch: [8][220/588], Time: 3.58, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4569
Epoch: [8][240/588], Time: 3.59, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4940
Epoch: [8][260/588], Time: 3.59, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4009
Epoch: [8][280/588], Time: 3.47, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4326
Epoch: [8][300/588], Time: 3.37, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4601
Epoch: [8][320/588], Time: 3.39, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.8279
Epoch: [8][340/588], Time: 3.40, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4314
Epoch: [8][360/588], Time: 3.41, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3750
Epoch: [8][380/588], Time: 3.42, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3666
Epoch: [8][400/588], Time: 3.43, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4125
Epoch: [8][420/588], Time: 3.44, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5684
Epoch: [8][440/588], Time: 3.45, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3692
Epoch: [8][460/588], Time: 3.46, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3926
Epoch: [8][480/588], Time: 3.46, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3957
Epoch: [8][500/588], Time: 3.47, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4023
Epoch: [8][520/588], Time: 3.47, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3889
Epoch: [8][540/588], Time: 3.48, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3953
Epoch: [8][560/588], Time: 3.49, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3985
Epoch: [8][580/588], Time: 3.49, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3741
Epoch: [9][0/588], Time: 10.07, Data: 5.99, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4050
Epoch: [9][20/588], Time: 3.91, Data: 0.34, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4040
Epoch: [9][40/588], Time: 3.73, Data: 0.21, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5323
Epoch: [9][60/588], Time: 3.68, Data: 0.16, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3872
Epoch: [9][80/588], Time: 3.65, Data: 0.14, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4716
Epoch: [9][100/588], Time: 3.63, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4690
Epoch: [9][120/588], Time: 3.62, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4394
Epoch: [9][140/588], Time: 3.61, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4139
Epoch: [9][160/588], Time: 3.60, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6878
Epoch: [9][180/588], Time: 3.60, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3903
Epoch: [9][200/588], Time: 3.59, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4059
Epoch: [9][220/588], Time: 3.59, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5559
Epoch: [9][240/588], Time: 3.59, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4897
Epoch: [9][260/588], Time: 3.59, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4441
Epoch: [9][280/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4418
Epoch: [9][300/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3837
Epoch: [9][320/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4273
Epoch: [9][340/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4181
Epoch: [9][360/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3575
Epoch: [9][380/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4261
Epoch: [9][400/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3854
Epoch: [9][420/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3686
Epoch: [9][440/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4349
Epoch: [9][460/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4734
Epoch: [9][480/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4291
Epoch: [9][500/588], Time: 3.59, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4492
Epoch: [9][520/588], Time: 3.59, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4654
Epoch: [9][540/588], Time: 3.60, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5038
Epoch: [9][560/588], Time: 3.60, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7294
Epoch: [9][580/588], Time: 3.60, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5328
Epoch: [10][0/588], Time: 11.46, Data: 6.93, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5151
Epoch: [10][20/588], Time: 3.90, Data: 0.39, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4711
Epoch: [10][40/588], Time: 3.73, Data: 0.23, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4196
Epoch: [10][60/588], Time: 3.65, Data: 0.18, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4256
Epoch: [10][80/588], Time: 3.63, Data: 0.15, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3736
Epoch: [10][100/588], Time: 3.60, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4482
Epoch: [10][120/588], Time: 3.59, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4045
Epoch: [10][140/588], Time: 3.58, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4529
Epoch: [10][160/588], Time: 3.58, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4945
Epoch: [10][180/588], Time: 3.57, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5045
Epoch: [10][200/588], Time: 3.56, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4887
Epoch: [10][220/588], Time: 3.56, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4134
Epoch: [10][240/588], Time: 3.56, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3941
Epoch: [10][260/588], Time: 3.56, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4449
Epoch: [10][280/588], Time: 3.56, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4079
Epoch: [10][300/588], Time: 3.56, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4288
Epoch: [10][320/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3969
Epoch: [10][340/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4863
Epoch: [10][360/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4354
Epoch: [10][380/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3979
Epoch: [10][400/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3692
Epoch: [10][420/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3650
Epoch: [10][440/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4141
Epoch: [10][460/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4216
Epoch: [10][480/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4018
Epoch: [10][500/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4596
Epoch: [10][520/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3915
Epoch: [10][540/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4634
Epoch: [10][560/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4291
Epoch: [10][580/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4550
Epoch: [11][0/588], Time: 11.03, Data: 6.77, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4259
Epoch: [11][20/588], Time: 3.93, Data: 0.38, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4890
Epoch: [11][40/588], Time: 3.73, Data: 0.23, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4168
Epoch: [11][60/588], Time: 3.66, Data: 0.17, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4504
Epoch: [11][80/588], Time: 3.63, Data: 0.15, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.8193
Epoch: [11][100/588], Time: 3.62, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6823
Epoch: [11][120/588], Time: 3.60, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7105
Epoch: [11][140/588], Time: 3.59, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7664
Epoch: [11][160/588], Time: 3.58, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7094
Epoch: [11][180/588], Time: 3.43, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5034
Epoch: [11][200/588], Time: 3.29, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4624
Epoch: [11][220/588], Time: 3.31, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4042
Epoch: [11][240/588], Time: 3.34, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4923
Epoch: [11][260/588], Time: 3.37, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4804
Epoch: [11][280/588], Time: 3.39, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5170
Epoch: [11][300/588], Time: 3.40, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5953
Epoch: [11][320/588], Time: 3.40, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4636
Epoch: [11][340/588], Time: 3.41, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4575
Epoch: [11][360/588], Time: 3.42, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4535
Epoch: [11][380/588], Time: 3.42, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4343
Epoch: [11][400/588], Time: 3.43, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4597
Epoch: [11][420/588], Time: 3.43, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5107
Epoch: [11][440/588], Time: 3.44, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5243
Epoch: [11][460/588], Time: 3.45, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5246
Epoch: [11][480/588], Time: 3.45, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4542
Epoch: [11][500/588], Time: 3.45, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4658
Epoch: [11][520/588], Time: 3.46, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4083
Epoch: [11][540/588], Time: 3.46, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4304
Epoch: [11][560/588], Time: 3.46, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4246
Epoch: [11][580/588], Time: 3.46, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4761
Epoch: [12][0/588], Time: 12.45, Data: 8.15, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4787
Epoch: [12][20/588], Time: 3.99, Data: 0.45, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5639
Epoch: [12][40/588], Time: 3.78, Data: 0.26, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5103
Epoch: [12][60/588], Time: 3.72, Data: 0.20, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4421
Epoch: [12][80/588], Time: 3.67, Data: 0.16, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5372
Epoch: [12][100/588], Time: 3.65, Data: 0.14, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4111
Epoch: [12][120/588], Time: 3.64, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3956
Epoch: [12][140/588], Time: 3.63, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4257
Epoch: [12][160/588], Time: 3.62, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5525
Epoch: [12][180/588], Time: 3.62, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4195
Epoch: [12][200/588], Time: 3.62, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4871
Epoch: [12][220/588], Time: 3.61, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4442
Epoch: [12][240/588], Time: 3.60, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4616
Epoch: [12][260/588], Time: 3.60, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3710
Epoch: [12][280/588], Time: 3.60, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3933
Epoch: [12][300/588], Time: 3.60, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5779
Epoch: [12][320/588], Time: 3.60, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4477
Epoch: [12][340/588], Time: 3.60, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4365
Epoch: [12][360/588], Time: 3.61, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4012
Epoch: [12][380/588], Time: 3.62, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4233
Epoch: [12][400/588], Time: 3.63, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4949
Epoch: [12][420/588], Time: 3.63, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4560
Epoch: [12][440/588], Time: 3.64, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4273
Epoch: [12][460/588], Time: 3.64, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3979
Epoch: [12][480/588], Time: 3.65, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5277
Epoch: [12][500/588], Time: 3.65, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3372
Epoch: [12][520/588], Time: 3.66, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4502
Epoch: [12][540/588], Time: 3.66, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5195
Epoch: [12][560/588], Time: 3.66, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4047
Epoch: [12][580/588], Time: 3.67, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4309
Epoch: [13][0/588], Time: 12.70, Data: 8.76, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4487
Epoch: [13][20/588], Time: 4.00, Data: 0.48, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5673
Epoch: [13][40/588], Time: 3.78, Data: 0.27, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3961
Epoch: [13][60/588], Time: 3.71, Data: 0.20, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4121
Epoch: [13][80/588], Time: 3.67, Data: 0.17, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3774
Epoch: [13][100/588], Time: 3.65, Data: 0.15, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4860
Epoch: [13][120/588], Time: 3.63, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3864
Epoch: [13][140/588], Time: 3.62, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3846
Epoch: [13][160/588], Time: 3.62, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3744
Epoch: [13][180/588], Time: 3.61, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5601
Epoch: [13][200/588], Time: 3.61, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3918
Epoch: [13][220/588], Time: 3.61, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3789
Epoch: [13][240/588], Time: 3.62, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4206
Epoch: [13][260/588], Time: 3.62, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3457
Epoch: [13][280/588], Time: 3.63, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3727
Epoch: [13][300/588], Time: 3.64, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3871
Epoch: [13][320/588], Time: 3.65, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3638
Epoch: [13][340/588], Time: 3.66, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6451
Epoch: [13][360/588], Time: 3.66, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4421
Epoch: [13][380/588], Time: 3.67, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4236
Epoch: [13][400/588], Time: 3.68, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4055
Epoch: [13][420/588], Time: 3.68, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4085
Epoch: [13][440/588], Time: 3.69, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4245
Epoch: [13][460/588], Time: 3.69, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4899
Epoch: [13][480/588], Time: 3.70, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6591
Epoch: [13][500/588], Time: 3.71, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.8647
Epoch: [13][520/588], Time: 3.71, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5140
Epoch: [13][540/588], Time: 3.71, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4103
Epoch: [13][560/588], Time: 3.72, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4341
Epoch: [13][580/588], Time: 3.72, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4560
Epoch: [14][0/588], Time: 10.88, Data: 6.25, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4225
Epoch: [14][20/588], Time: 3.86, Data: 0.36, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4947
Epoch: [14][40/588], Time: 3.16, Data: 0.21, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3784
Epoch: [14][60/588], Time: 2.84, Data: 0.16, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3654
Epoch: [14][80/588], Time: 2.95, Data: 0.14, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4092
Epoch: [14][100/588], Time: 3.07, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4060
Epoch: [14][120/588], Time: 3.15, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4331
Epoch: [14][140/588], Time: 3.21, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3924
Epoch: [14][160/588], Time: 3.26, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3616
Epoch: [14][180/588], Time: 3.28, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3454
Epoch: [14][200/588], Time: 3.32, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4320
Epoch: [14][220/588], Time: 3.34, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4455
Epoch: [14][240/588], Time: 3.36, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4234
Epoch: [14][260/588], Time: 3.37, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5656
Epoch: [14][280/588], Time: 3.39, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4093
Epoch: [14][300/588], Time: 3.40, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4176
Epoch: [14][320/588], Time: 3.41, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4920
Epoch: [14][340/588], Time: 3.42, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4483
Epoch: [14][360/588], Time: 3.43, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4472
Epoch: [14][380/588], Time: 3.44, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4007
Epoch: [14][400/588], Time: 3.44, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4546
Epoch: [14][420/588], Time: 3.45, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3748
Epoch: [14][440/588], Time: 3.45, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4257
Epoch: [14][460/588], Time: 3.46, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3861
Epoch: [14][480/588], Time: 3.46, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4747
Epoch: [14][500/588], Time: 3.46, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4566
Epoch: [14][520/588], Time: 3.47, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4999
Epoch: [14][540/588], Time: 3.47, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4948
Epoch: [14][560/588], Time: 3.47, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4841
Epoch: [14][580/588], Time: 3.47, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6920
Epoch: [15][0/588], Time: 10.02, Data: 5.42, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4328
Epoch: [15][20/588], Time: 3.91, Data: 0.32, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4610
Epoch: [15][40/588], Time: 3.74, Data: 0.19, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4271
Epoch: [15][60/588], Time: 3.66, Data: 0.15, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5275
Epoch: [15][80/588], Time: 3.63, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5233
Epoch: [15][100/588], Time: 3.62, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5339
Epoch: [15][120/588], Time: 3.60, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4525
Epoch: [15][140/588], Time: 3.60, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3801
Epoch: [15][160/588], Time: 3.60, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4345
Epoch: [15][180/588], Time: 3.59, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4424
Epoch: [15][200/588], Time: 3.59, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3943
Epoch: [15][220/588], Time: 3.59, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4788
Epoch: [15][240/588], Time: 3.58, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4361
Epoch: [15][260/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4261
Epoch: [15][280/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3900
Epoch: [15][300/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3889
Epoch: [15][320/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4862
Epoch: [15][340/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4467
Epoch: [15][360/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3880
Epoch: [15][380/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4057
Epoch: [15][400/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3854
Epoch: [15][420/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4939
Epoch: [15][440/588], Time: 3.57, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4149
Epoch: [15][460/588], Time: 3.57, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4335
Epoch: [15][480/588], Time: 3.57, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4548
Epoch: [15][500/588], Time: 3.57, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4816
Epoch: [15][520/588], Time: 3.57, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4577
Epoch: [15][540/588], Time: 3.57, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3465
Epoch: [15][560/588], Time: 3.57, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3807
Epoch: [15][580/588], Time: 3.57, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4127
Epoch: [16][0/588], Time: 10.62, Data: 6.32, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5604
Epoch: [16][20/588], Time: 3.92, Data: 0.36, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4247
Epoch: [16][40/588], Time: 3.74, Data: 0.22, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3837
Epoch: [16][60/588], Time: 3.71, Data: 0.17, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3732
Epoch: [16][80/588], Time: 3.70, Data: 0.14, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4259
Epoch: [16][100/588], Time: 3.68, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4514
Epoch: [16][120/588], Time: 3.66, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4274
Epoch: [16][140/588], Time: 3.64, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4113
Epoch: [16][160/588], Time: 3.63, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3748
Epoch: [16][180/588], Time: 3.62, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4117
Epoch: [16][200/588], Time: 3.61, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3695
Epoch: [16][220/588], Time: 3.61, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4227
Epoch: [16][240/588], Time: 3.60, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6275
Epoch: [16][260/588], Time: 3.60, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5642
Epoch: [16][280/588], Time: 3.60, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4409
Epoch: [16][300/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4046
Epoch: [16][320/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3976
Epoch: [16][340/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4038
Epoch: [16][360/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3935
Epoch: [16][380/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4028
Epoch: [16][400/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4029
Epoch: [16][420/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4154
Epoch: [16][440/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4854
Epoch: [16][460/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5318
Epoch: [16][480/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4268
Epoch: [16][500/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.8587
Epoch: [16][520/588], Time: 3.57, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5265
Epoch: [16][540/588], Time: 3.57, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6019
Epoch: [16][560/588], Time: 3.57, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5523
Epoch: [16][580/588], Time: 3.57, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4825
Epoch: [17][0/588], Time: 11.03, Data: 6.49, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4139
Epoch: [17][20/588], Time: 3.89, Data: 0.37, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4928
Epoch: [17][40/588], Time: 3.70, Data: 0.22, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7734
Epoch: [17][60/588], Time: 3.64, Data: 0.17, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5619
Epoch: [17][80/588], Time: 3.64, Data: 0.14, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5103
Epoch: [17][100/588], Time: 3.63, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5089
Epoch: [17][120/588], Time: 3.62, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5789
Epoch: [17][140/588], Time: 3.61, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5893
Epoch: [17][160/588], Time: 3.60, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4950
Epoch: [17][180/588], Time: 3.60, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6452
Epoch: [17][200/588], Time: 3.59, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6228
Epoch: [17][220/588], Time: 3.59, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4331
Epoch: [17][240/588], Time: 3.58, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6075
Epoch: [17][260/588], Time: 3.58, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4189
Epoch: [17][280/588], Time: 3.58, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4356
Epoch: [17][300/588], Time: 3.58, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4342
Epoch: [17][320/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5713
Epoch: [17][340/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5539
Epoch: [17][360/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4789
Epoch: [17][380/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4685
Epoch: [17][400/588], Time: 3.61, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6079
Epoch: [17][420/588], Time: 3.62, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5526
Epoch: [17][440/588], Time: 3.62, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5495
Epoch: [17][460/588], Time: 3.63, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4334
Epoch: [17][480/588], Time: 3.63, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4313
Epoch: [17][500/588], Time: 3.63, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4872
Epoch: [17][520/588], Time: 3.63, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4112
Epoch: [17][540/588], Time: 3.63, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5037
Epoch: [17][560/588], Time: 3.64, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4349
Epoch: [17][580/588], Time: 3.64, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4941
Epoch: [18][0/588], Time: 11.15, Data: 6.80, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4646
Epoch: [18][20/588], Time: 3.89, Data: 0.38, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3777
Epoch: [18][40/588], Time: 3.72, Data: 0.23, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4777
Epoch: [18][60/588], Time: 3.66, Data: 0.17, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5297
Epoch: [18][80/588], Time: 3.63, Data: 0.15, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3274
Epoch: [18][100/588], Time: 3.61, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4226
Epoch: [18][120/588], Time: 3.60, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4254
Epoch: [18][140/588], Time: 3.60, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4224
Epoch: [18][160/588], Time: 3.59, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4967
Epoch: [18][180/588], Time: 3.58, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3812
Epoch: [18][200/588], Time: 3.58, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6078
Epoch: [18][220/588], Time: 3.57, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4773
Epoch: [18][240/588], Time: 3.57, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4831
Epoch: [18][260/588], Time: 3.57, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7905
Epoch: [18][280/588], Time: 3.57, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7383
Epoch: [18][300/588], Time: 3.57, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3883
Epoch: [18][320/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4754
Epoch: [18][340/588], Time: 3.59, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5128
Epoch: [18][360/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4743
Epoch: [18][380/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4361
Epoch: [18][400/588], Time: 3.60, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6699
Epoch: [18][420/588], Time: 3.61, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4068
Epoch: [18][440/588], Time: 3.61, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5126
Epoch: [18][460/588], Time: 3.61, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4245
Epoch: [18][480/588], Time: 3.61, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3979
Epoch: [18][500/588], Time: 3.61, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3890
Epoch: [18][520/588], Time: 3.61, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4154
Epoch: [18][540/588], Time: 3.62, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4733
Epoch: [18][560/588], Time: 3.62, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4525
Epoch: [18][580/588], Time: 3.62, Data: 0.07, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4519
Epoch: [19][0/588], Time: 12.19, Data: 8.20, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3925
Epoch: [19][20/588], Time: 3.99, Data: 0.45, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4685
Epoch: [19][40/588], Time: 3.78, Data: 0.26, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4351
Epoch: [19][60/588], Time: 3.69, Data: 0.20, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3416
Epoch: [19][80/588], Time: 3.66, Data: 0.16, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5108
Epoch: [19][100/588], Time: 3.63, Data: 0.14, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4229
Epoch: [19][120/588], Time: 3.62, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3877
Epoch: [19][140/588], Time: 3.61, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3598
Epoch: [19][160/588], Time: 3.60, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4814
Epoch: [19][180/588], Time: 3.59, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4619
Epoch: [19][200/588], Time: 3.58, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3868
Epoch: [19][220/588], Time: 3.57, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4044
Epoch: [19][240/588], Time: 3.57, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3833
Epoch: [19][260/588], Time: 3.57, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3783
Epoch: [19][280/588], Time: 3.56, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3659
Epoch: [19][300/588], Time: 3.56, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4625
Epoch: [19][320/588], Time: 3.56, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.7082
Epoch: [19][340/588], Time: 3.56, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3851
Epoch: [19][360/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3693
Epoch: [19][380/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4211
Epoch: [19][400/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4087
Epoch: [19][420/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4090
Epoch: [19][440/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3891
Epoch: [19][460/588], Time: 3.54, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3603
Epoch: [19][480/588], Time: 3.49, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4058
Epoch: [19][500/588], Time: 3.44, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6536
Epoch: [19][520/588], Time: 3.45, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3765
Epoch: [19][540/588], Time: 3.45, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4769
Epoch: [19][560/588], Time: 3.45, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5043
Epoch: [19][580/588], Time: 3.45, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6017
Epoch: [20][0/588], Time: 12.26, Data: 8.02, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4340
Epoch: [20][20/588], Time: 3.95, Data: 0.44, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4242
Epoch: [20][40/588], Time: 3.74, Data: 0.26, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4823
Epoch: [20][60/588], Time: 3.70, Data: 0.19, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3965
Epoch: [20][80/588], Time: 3.69, Data: 0.16, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4990
Epoch: [20][100/588], Time: 3.66, Data: 0.14, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4699
Epoch: [20][120/588], Time: 3.64, Data: 0.13, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4020
Epoch: [20][140/588], Time: 3.63, Data: 0.12, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4194
Epoch: [20][160/588], Time: 3.62, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3963
Epoch: [20][180/588], Time: 3.61, Data: 0.11, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4272
Epoch: [20][200/588], Time: 3.60, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4077
Epoch: [20][220/588], Time: 3.60, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4344
Epoch: [20][240/588], Time: 3.59, Data: 0.10, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3984
Epoch: [20][260/588], Time: 3.58, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4088
Epoch: [20][280/588], Time: 3.58, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4674
Epoch: [20][300/588], Time: 3.58, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4783
Epoch: [20][320/588], Time: 3.57, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4137
Epoch: [20][340/588], Time: 3.57, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3774
Epoch: [20][360/588], Time: 3.57, Data: 0.09, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4219
Epoch: [20][380/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3723
Epoch: [20][400/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4044
Epoch: [20][420/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5698
Epoch: [20][440/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4055
Epoch: [20][460/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4041
Epoch: [20][480/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4108
Epoch: [20][500/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.3880
Epoch: [20][520/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4289
Epoch: [20][540/588], Time: 3.56, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.4421
Epoch: [20][560/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.5656
Epoch: [20][580/588], Time: 3.57, Data: 0.08, lr_sound: 0.001, lr_frame: 0.0001, loss: 0.6244
Epoch: [21][0/588], Time: 10.36, Data: 5.85, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6479
Epoch: [21][20/588], Time: 3.86, Data: 0.34, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3788
Epoch: [21][40/588], Time: 3.72, Data: 0.20, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6105
Epoch: [21][60/588], Time: 3.65, Data: 0.16, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3737
Epoch: [21][80/588], Time: 3.61, Data: 0.13, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3905
Epoch: [21][100/588], Time: 3.60, Data: 0.12, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3670
Epoch: [21][120/588], Time: 3.59, Data: 0.11, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4420
Epoch: [21][140/588], Time: 3.58, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5857
Epoch: [21][160/588], Time: 3.57, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3873
Epoch: [21][180/588], Time: 3.57, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6601
Epoch: [21][200/588], Time: 3.56, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5735
Epoch: [21][220/588], Time: 3.56, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3563
Epoch: [21][240/588], Time: 3.56, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5797
Epoch: [21][260/588], Time: 3.56, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6502
Epoch: [21][280/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5655
Epoch: [21][300/588], Time: 3.56, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6812
Epoch: [21][320/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3992
Epoch: [21][340/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5721
Epoch: [21][360/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6239
Epoch: [21][380/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3453
Epoch: [21][400/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5431
Epoch: [21][420/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5803
Epoch: [21][440/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3768
Epoch: [21][460/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3649
Epoch: [21][480/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5527
Epoch: [21][500/588], Time: 3.55, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6654
Epoch: [21][520/588], Time: 3.55, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4141
Epoch: [21][540/588], Time: 3.55, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3871
Epoch: [21][560/588], Time: 3.55, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4290
Epoch: [21][580/588], Time: 3.55, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6501
Epoch: [22][0/588], Time: 11.04, Data: 6.67, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3636
Epoch: [22][20/588], Time: 3.88, Data: 0.38, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3656
Epoch: [22][40/588], Time: 3.70, Data: 0.22, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6042
Epoch: [22][60/588], Time: 3.64, Data: 0.17, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5815
Epoch: [22][80/588], Time: 3.62, Data: 0.14, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5601
Epoch: [22][100/588], Time: 3.61, Data: 0.13, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6574
Epoch: [22][120/588], Time: 3.60, Data: 0.12, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3657
Epoch: [22][140/588], Time: 3.59, Data: 0.11, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.7409
Epoch: [22][160/588], Time: 3.58, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6868
Epoch: [22][180/588], Time: 3.58, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5028
Epoch: [22][200/588], Time: 3.57, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5668
Epoch: [22][220/588], Time: 3.57, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6817
Epoch: [22][240/588], Time: 3.57, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5935
Epoch: [22][260/588], Time: 3.57, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6134
Epoch: [22][280/588], Time: 3.57, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6647
Epoch: [22][300/588], Time: 3.56, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3430
Epoch: [22][320/588], Time: 3.56, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3984
Epoch: [22][340/588], Time: 3.56, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4256
Epoch: [22][360/588], Time: 3.56, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5670
Epoch: [22][380/588], Time: 3.53, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4902
Epoch: [22][400/588], Time: 3.45, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6103
Epoch: [22][420/588], Time: 3.43, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3598
Epoch: [22][440/588], Time: 3.43, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6382
Epoch: [22][460/588], Time: 3.44, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6018
Epoch: [22][480/588], Time: 3.44, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4739
Epoch: [22][500/588], Time: 3.45, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6225
Epoch: [22][520/588], Time: 3.45, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4441
Epoch: [22][540/588], Time: 3.46, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3756
Epoch: [22][560/588], Time: 3.47, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3753
Epoch: [22][580/588], Time: 3.48, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3608
Epoch: [23][0/588], Time: 10.69, Data: 6.21, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3798
Epoch: [23][20/588], Time: 3.88, Data: 0.36, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3771
Epoch: [23][40/588], Time: 3.70, Data: 0.21, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3322
Epoch: [23][60/588], Time: 3.66, Data: 0.16, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3418
Epoch: [23][80/588], Time: 3.63, Data: 0.14, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3474
Epoch: [23][100/588], Time: 3.61, Data: 0.12, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5040
Epoch: [23][120/588], Time: 3.60, Data: 0.11, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3326
Epoch: [23][140/588], Time: 3.59, Data: 0.11, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5006
Epoch: [23][160/588], Time: 3.58, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3584
Epoch: [23][180/588], Time: 3.60, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3664
Epoch: [23][200/588], Time: 3.62, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3691
Epoch: [23][220/588], Time: 3.61, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3462
Epoch: [23][240/588], Time: 3.61, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3640
Epoch: [23][260/588], Time: 3.60, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3930
Epoch: [23][280/588], Time: 3.60, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4322
Epoch: [23][300/588], Time: 3.59, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3862
Epoch: [23][320/588], Time: 3.59, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3674
Epoch: [23][340/588], Time: 3.58, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5060
Epoch: [23][360/588], Time: 3.58, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3910
Epoch: [23][380/588], Time: 3.58, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3941
Epoch: [23][400/588], Time: 3.58, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4842
Epoch: [23][420/588], Time: 3.58, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3500
Epoch: [23][440/588], Time: 3.57, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3723
Epoch: [23][460/588], Time: 3.57, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3568
Epoch: [23][480/588], Time: 3.57, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4217
Epoch: [23][500/588], Time: 3.57, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4121
Epoch: [23][520/588], Time: 3.57, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3518
Epoch: [23][540/588], Time: 3.58, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4068
Epoch: [23][560/588], Time: 3.57, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3579
Epoch: [23][580/588], Time: 3.57, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4058
Epoch: [24][0/588], Time: 10.19, Data: 5.94, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4136
Epoch: [24][20/588], Time: 3.92, Data: 0.34, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4972
Epoch: [24][40/588], Time: 3.74, Data: 0.21, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3708
Epoch: [24][60/588], Time: 3.70, Data: 0.16, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3785
Epoch: [24][80/588], Time: 3.65, Data: 0.14, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6398
Epoch: [24][100/588], Time: 3.64, Data: 0.12, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4020
Epoch: [24][120/588], Time: 3.62, Data: 0.11, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3408
Epoch: [24][140/588], Time: 3.62, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4688
Epoch: [24][160/588], Time: 3.60, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4631
Epoch: [24][180/588], Time: 3.60, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3632
Epoch: [24][200/588], Time: 3.59, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3690
Epoch: [24][220/588], Time: 3.59, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3802
Epoch: [24][240/588], Time: 3.59, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4363
Epoch: [24][260/588], Time: 3.58, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4099
Epoch: [24][280/588], Time: 3.58, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4547
Epoch: [24][300/588], Time: 3.58, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4786
Epoch: [24][320/588], Time: 3.58, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3652
Epoch: [24][340/588], Time: 3.58, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4372
Epoch: [24][360/588], Time: 3.59, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3610
Epoch: [24][380/588], Time: 3.58, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3973
Epoch: [24][400/588], Time: 3.58, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3871
Epoch: [24][420/588], Time: 3.58, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3565
Epoch: [24][440/588], Time: 3.58, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3952
Epoch: [24][460/588], Time: 3.58, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3745
Epoch: [24][480/588], Time: 3.58, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3893
Epoch: [24][500/588], Time: 3.58, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3428
Epoch: [24][520/588], Time: 3.58, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3428
Epoch: [24][540/588], Time: 3.58, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3541
Epoch: [24][560/588], Time: 3.57, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4277
Epoch: [24][580/588], Time: 3.57, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3831
Epoch: [25][0/588], Time: 10.09, Data: 5.98, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4383
Epoch: [25][20/588], Time: 3.87, Data: 0.34, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4794
Epoch: [25][40/588], Time: 3.70, Data: 0.21, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4150
Epoch: [25][60/588], Time: 3.64, Data: 0.16, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4479
Epoch: [25][80/588], Time: 3.61, Data: 0.14, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4144
Epoch: [25][100/588], Time: 3.60, Data: 0.12, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4442
Epoch: [25][120/588], Time: 3.59, Data: 0.11, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4647
Epoch: [25][140/588], Time: 3.58, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3903
Epoch: [25][160/588], Time: 3.57, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3634
Epoch: [25][180/588], Time: 3.57, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4136
Epoch: [25][200/588], Time: 3.56, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3806
Epoch: [25][220/588], Time: 3.56, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3701
Epoch: [25][240/588], Time: 3.56, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4035
Epoch: [25][260/588], Time: 3.56, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4287
Epoch: [25][280/588], Time: 3.50, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3839
Epoch: [25][300/588], Time: 3.41, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3443
Epoch: [25][320/588], Time: 3.37, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4006
Epoch: [25][340/588], Time: 3.38, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3616
Epoch: [25][360/588], Time: 3.39, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4512
Epoch: [25][380/588], Time: 3.39, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3408
Epoch: [25][400/588], Time: 3.40, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3678
Epoch: [25][420/588], Time: 3.40, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3283
Epoch: [25][440/588], Time: 3.42, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3819
Epoch: [25][460/588], Time: 3.43, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3440
Epoch: [25][480/588], Time: 3.44, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4043
Epoch: [25][500/588], Time: 3.44, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3698
Epoch: [25][520/588], Time: 3.44, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4062
Epoch: [25][540/588], Time: 3.45, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3218
Epoch: [25][560/588], Time: 3.45, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3266
Epoch: [25][580/588], Time: 3.45, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3361
Epoch: [26][0/588], Time: 10.03, Data: 5.21, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3407
Epoch: [26][20/588], Time: 3.89, Data: 0.31, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3975
Epoch: [26][40/588], Time: 3.74, Data: 0.19, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4088
Epoch: [26][60/588], Time: 3.67, Data: 0.16, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4036
Epoch: [26][80/588], Time: 3.63, Data: 0.14, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3672
Epoch: [26][100/588], Time: 3.60, Data: 0.12, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3537
Epoch: [26][120/588], Time: 3.59, Data: 0.11, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4193
Epoch: [26][140/588], Time: 3.58, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4055
Epoch: [26][160/588], Time: 3.57, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5905
Epoch: [26][180/588], Time: 3.57, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3580
Epoch: [26][200/588], Time: 3.56, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3716
Epoch: [26][220/588], Time: 3.56, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3699
Epoch: [26][240/588], Time: 3.55, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3980
Epoch: [26][260/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3788
Epoch: [26][280/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3420
Epoch: [26][300/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3523
Epoch: [26][320/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4162
Epoch: [26][340/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3602
Epoch: [26][360/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4266
Epoch: [26][380/588], Time: 3.54, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3598
Epoch: [26][400/588], Time: 3.54, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3798
Epoch: [26][420/588], Time: 3.54, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3517
Epoch: [26][440/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3860
Epoch: [26][460/588], Time: 3.55, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3743
Epoch: [26][480/588], Time: 3.55, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4346
Epoch: [26][500/588], Time: 3.54, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3484
Epoch: [26][520/588], Time: 3.54, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3438
Epoch: [26][540/588], Time: 3.54, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4046
Epoch: [26][560/588], Time: 3.54, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3803
Epoch: [26][580/588], Time: 3.54, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4078
Epoch: [27][0/588], Time: 10.58, Data: 5.98, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3676
Epoch: [27][20/588], Time: 3.92, Data: 0.35, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3323
Epoch: [27][40/588], Time: 3.72, Data: 0.21, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3875
Epoch: [27][60/588], Time: 3.67, Data: 0.16, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4119
Epoch: [27][80/588], Time: 3.63, Data: 0.14, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3776
Epoch: [27][100/588], Time: 3.61, Data: 0.12, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4094
Epoch: [27][120/588], Time: 3.60, Data: 0.11, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4005
Epoch: [27][140/588], Time: 3.59, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4217
Epoch: [27][160/588], Time: 3.58, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4083
Epoch: [27][180/588], Time: 3.57, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3481
Epoch: [27][200/588], Time: 3.57, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3332
Epoch: [27][220/588], Time: 3.56, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3504
Epoch: [27][240/588], Time: 3.56, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3227
Epoch: [27][260/588], Time: 3.56, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3370
Epoch: [27][280/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3415
Epoch: [27][300/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4008
Epoch: [27][320/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3426
Epoch: [27][340/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4365
Epoch: [27][360/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3592
Epoch: [27][380/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5975
Epoch: [27][400/588], Time: 3.54, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3685
Epoch: [27][420/588], Time: 3.54, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3313
Epoch: [27][440/588], Time: 3.54, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4072
Epoch: [27][460/588], Time: 3.55, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3391
Epoch: [27][480/588], Time: 3.54, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3707
Epoch: [27][500/588], Time: 3.54, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3824
Epoch: [27][520/588], Time: 3.54, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3535
Epoch: [27][540/588], Time: 3.54, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4348
Epoch: [27][560/588], Time: 3.54, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3988
Epoch: [27][580/588], Time: 3.53, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3998
Epoch: [28][0/588], Time: 11.57, Data: 7.13, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3906
Epoch: [28][20/588], Time: 3.95, Data: 0.40, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3656
Epoch: [28][40/588], Time: 3.75, Data: 0.24, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3387
Epoch: [28][60/588], Time: 3.73, Data: 0.18, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3495
Epoch: [28][80/588], Time: 3.72, Data: 0.15, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3752
Epoch: [28][100/588], Time: 3.68, Data: 0.13, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3451
Epoch: [28][120/588], Time: 3.66, Data: 0.12, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3669
Epoch: [28][140/588], Time: 3.64, Data: 0.11, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3401
Epoch: [28][160/588], Time: 3.63, Data: 0.11, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3942
Epoch: [28][180/588], Time: 3.62, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3713
Epoch: [28][200/588], Time: 3.54, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3351
Epoch: [28][220/588], Time: 3.42, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3635
Epoch: [28][240/588], Time: 3.34, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4332
Epoch: [28][260/588], Time: 3.36, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6382
Epoch: [28][280/588], Time: 3.37, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3738
Epoch: [28][300/588], Time: 3.38, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3572
Epoch: [28][320/588], Time: 3.39, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3473
Epoch: [28][340/588], Time: 3.40, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3714
Epoch: [28][360/588], Time: 3.40, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3765
Epoch: [28][380/588], Time: 3.41, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4084
Epoch: [28][400/588], Time: 3.41, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3933
Epoch: [28][420/588], Time: 3.42, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3089
Epoch: [28][440/588], Time: 3.42, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3691
Epoch: [28][460/588], Time: 3.43, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6418
Epoch: [28][480/588], Time: 3.43, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3365
Epoch: [28][500/588], Time: 3.44, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3327
Epoch: [28][520/588], Time: 3.45, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3637
Epoch: [28][540/588], Time: 3.46, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3963
Epoch: [28][560/588], Time: 3.47, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5257
Epoch: [28][580/588], Time: 3.48, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4325
Epoch: [29][0/588], Time: 11.05, Data: 6.81, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4593
Epoch: [29][20/588], Time: 3.87, Data: 0.38, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3301
Epoch: [29][40/588], Time: 3.70, Data: 0.23, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3600
Epoch: [29][60/588], Time: 3.63, Data: 0.17, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3581
Epoch: [29][80/588], Time: 3.61, Data: 0.15, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3555
Epoch: [29][100/588], Time: 3.58, Data: 0.13, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3320
Epoch: [29][120/588], Time: 3.57, Data: 0.12, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3409
Epoch: [29][140/588], Time: 3.56, Data: 0.11, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3512
Epoch: [29][160/588], Time: 3.56, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4021
Epoch: [29][180/588], Time: 3.55, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5918
Epoch: [29][200/588], Time: 3.55, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3602
Epoch: [29][220/588], Time: 3.55, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3250
Epoch: [29][240/588], Time: 3.55, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4104
Epoch: [29][260/588], Time: 3.54, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6283
Epoch: [29][280/588], Time: 3.54, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3890
Epoch: [29][300/588], Time: 3.54, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4359
Epoch: [29][320/588], Time: 3.54, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3409
Epoch: [29][340/588], Time: 3.54, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4076
Epoch: [29][360/588], Time: 3.54, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3711
Epoch: [29][380/588], Time: 3.54, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3958
Epoch: [29][400/588], Time: 3.54, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3803
Epoch: [29][420/588], Time: 3.54, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3456
Epoch: [29][440/588], Time: 3.54, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3223
Epoch: [29][460/588], Time: 3.54, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3339
Epoch: [29][480/588], Time: 3.54, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4066
Epoch: [29][500/588], Time: 3.54, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3816
Epoch: [29][520/588], Time: 3.54, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3319
Epoch: [29][540/588], Time: 3.54, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4031
Epoch: [29][560/588], Time: 3.53, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3483
Epoch: [29][580/588], Time: 3.53, Data: 0.07, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3718
Epoch: [30][0/588], Time: 11.22, Data: 6.87, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3354
Epoch: [30][20/588], Time: 3.91, Data: 0.39, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3291
Epoch: [30][40/588], Time: 3.74, Data: 0.23, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3546
Epoch: [30][60/588], Time: 3.67, Data: 0.17, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3312
Epoch: [30][80/588], Time: 3.63, Data: 0.15, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3490
Epoch: [30][100/588], Time: 3.61, Data: 0.13, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3505
Epoch: [30][120/588], Time: 3.59, Data: 0.12, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3487
Epoch: [30][140/588], Time: 3.59, Data: 0.11, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4053
Epoch: [30][160/588], Time: 3.58, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3274
Epoch: [30][180/588], Time: 3.57, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3829
Epoch: [30][200/588], Time: 3.56, Data: 0.10, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4055
Epoch: [30][220/588], Time: 3.58, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3392
Epoch: [30][240/588], Time: 3.57, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.6289
Epoch: [30][260/588], Time: 3.57, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3726
Epoch: [30][280/588], Time: 3.57, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3422
Epoch: [30][300/588], Time: 3.57, Data: 0.09, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3404
Epoch: [30][320/588], Time: 3.57, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3824
Epoch: [30][340/588], Time: 3.59, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3670
Epoch: [30][360/588], Time: 3.59, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4178
Epoch: [30][380/588], Time: 3.59, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3650
Epoch: [30][400/588], Time: 3.59, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4549
Epoch: [30][420/588], Time: 3.59, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5066
Epoch: [30][440/588], Time: 3.58, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3211
Epoch: [30][460/588], Time: 3.58, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3403
Epoch: [30][480/588], Time: 3.59, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3939
Epoch: [30][500/588], Time: 3.59, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4540
Epoch: [30][520/588], Time: 3.59, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3804
Epoch: [30][540/588], Time: 3.59, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.5974
Epoch: [30][560/588], Time: 3.60, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.4370
Epoch: [30][580/588], Time: 3.60, Data: 0.08, lr_sound: 0.0001, lr_frame: 1e-05, loss: 0.3399
Epoch: [31][0/588], Time: 10.71, Data: 6.42, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4458
Epoch: [31][20/588], Time: 3.92, Data: 0.37, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3643
Epoch: [31][40/588], Time: 3.73, Data: 0.22, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3355
Epoch: [31][60/588], Time: 3.67, Data: 0.17, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3333
Epoch: [31][80/588], Time: 3.64, Data: 0.14, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3082
Epoch: [31][100/588], Time: 3.62, Data: 0.13, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3776
Epoch: [31][120/588], Time: 3.42, Data: 0.12, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3210
Epoch: [31][140/588], Time: 3.24, Data: 0.11, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3190
Epoch: [31][160/588], Time: 3.24, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3640
Epoch: [31][180/588], Time: 3.27, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3858
Epoch: [31][200/588], Time: 3.30, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3236
Epoch: [31][220/588], Time: 3.33, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3156
Epoch: [31][240/588], Time: 3.37, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3671
Epoch: [31][260/588], Time: 3.39, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3622
Epoch: [31][280/588], Time: 3.41, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4114
Epoch: [31][300/588], Time: 3.42, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3549
Epoch: [31][320/588], Time: 3.44, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4356
Epoch: [31][340/588], Time: 3.45, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3631
Epoch: [31][360/588], Time: 3.46, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4178
Epoch: [31][380/588], Time: 3.46, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3956
Epoch: [31][400/588], Time: 3.47, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3890
Epoch: [31][420/588], Time: 3.47, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3523
Epoch: [31][440/588], Time: 3.47, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3463
Epoch: [31][460/588], Time: 3.48, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4847
Epoch: [31][480/588], Time: 3.48, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4539
Epoch: [31][500/588], Time: 3.49, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3850
Epoch: [31][520/588], Time: 3.49, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3459
Epoch: [31][540/588], Time: 3.49, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3478
Epoch: [31][560/588], Time: 3.49, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3074
Epoch: [31][580/588], Time: 3.49, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3404
Epoch: [32][0/588], Time: 10.70, Data: 6.14, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3334
Epoch: [32][20/588], Time: 3.92, Data: 0.36, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.5270
Epoch: [32][40/588], Time: 3.76, Data: 0.21, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3258
Epoch: [32][60/588], Time: 3.70, Data: 0.16, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3965
Epoch: [32][80/588], Time: 3.65, Data: 0.14, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3870
Epoch: [32][100/588], Time: 3.62, Data: 0.12, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3409
Epoch: [32][120/588], Time: 3.61, Data: 0.12, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3277
Epoch: [32][140/588], Time: 3.61, Data: 0.11, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3998
Epoch: [32][160/588], Time: 3.60, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3716
Epoch: [32][180/588], Time: 3.60, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3506
Epoch: [32][200/588], Time: 3.59, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3291
Epoch: [32][220/588], Time: 3.59, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3135
Epoch: [32][240/588], Time: 3.58, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3560
Epoch: [32][260/588], Time: 3.58, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3896
Epoch: [32][280/588], Time: 3.58, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3423
Epoch: [32][300/588], Time: 3.58, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3481
Epoch: [32][320/588], Time: 3.58, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3496
Epoch: [32][340/588], Time: 3.57, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3705
Epoch: [32][360/588], Time: 3.57, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3185
Epoch: [32][380/588], Time: 3.57, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3714
Epoch: [32][400/588], Time: 3.57, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3340
Epoch: [32][420/588], Time: 3.57, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3655
Epoch: [32][440/588], Time: 3.57, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3621
Epoch: [32][460/588], Time: 3.57, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.2996
Epoch: [32][480/588], Time: 3.56, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3565
Epoch: [32][500/588], Time: 3.57, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3735
Epoch: [32][520/588], Time: 3.57, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3286
Epoch: [32][540/588], Time: 3.57, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3190
Epoch: [32][560/588], Time: 3.57, Data: 0.07, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3942
Epoch: [32][580/588], Time: 3.57, Data: 0.07, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3202
Epoch: [33][0/588], Time: 12.27, Data: 8.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3580
Epoch: [33][20/588], Time: 4.03, Data: 0.45, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3605
Epoch: [33][40/588], Time: 3.80, Data: 0.26, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3770
Epoch: [33][60/588], Time: 3.72, Data: 0.20, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3572
Epoch: [33][80/588], Time: 3.67, Data: 0.17, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3470
Epoch: [33][100/588], Time: 3.64, Data: 0.15, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.6715
Epoch: [33][120/588], Time: 3.63, Data: 0.13, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3759
Epoch: [33][140/588], Time: 3.61, Data: 0.12, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4220
Epoch: [33][160/588], Time: 3.62, Data: 0.12, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3710
Epoch: [33][180/588], Time: 3.61, Data: 0.12, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4108
Epoch: [33][200/588], Time: 3.60, Data: 0.11, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3815
Epoch: [33][220/588], Time: 3.59, Data: 0.11, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3612
Epoch: [33][240/588], Time: 3.59, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3679
Epoch: [33][260/588], Time: 3.59, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3792
Epoch: [33][280/588], Time: 3.59, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3153
Epoch: [33][300/588], Time: 3.58, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.5187
Epoch: [33][320/588], Time: 3.59, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4615
Epoch: [33][340/588], Time: 3.58, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3520
Epoch: [33][360/588], Time: 3.58, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3609
Epoch: [33][380/588], Time: 3.58, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3296
Epoch: [33][400/588], Time: 3.58, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3831
Epoch: [33][420/588], Time: 3.58, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3143
Epoch: [33][440/588], Time: 3.58, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3553
Epoch: [33][460/588], Time: 3.57, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4489
Epoch: [33][480/588], Time: 3.57, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3263
Epoch: [33][500/588], Time: 3.57, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3154
Epoch: [33][520/588], Time: 3.57, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3727
Epoch: [33][540/588], Time: 3.57, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3832
Epoch: [33][560/588], Time: 3.57, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3905
Epoch: [33][580/588], Time: 3.56, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3305
Epoch: [34][0/588], Time: 11.20, Data: 6.84, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3398
Epoch: [34][20/588], Time: 3.36, Data: 0.39, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3621
Epoch: [34][40/588], Time: 2.66, Data: 0.23, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3437
Epoch: [34][60/588], Time: 2.62, Data: 0.18, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3828
Epoch: [34][80/588], Time: 2.89, Data: 0.15, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3262
Epoch: [34][100/588], Time: 3.05, Data: 0.13, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3721
Epoch: [34][120/588], Time: 3.15, Data: 0.12, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3466
Epoch: [34][140/588], Time: 3.23, Data: 0.11, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3633
Epoch: [34][160/588], Time: 3.28, Data: 0.11, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3416
Epoch: [34][180/588], Time: 3.32, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3251
Epoch: [34][200/588], Time: 3.36, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3617
Epoch: [34][220/588], Time: 3.39, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4142
Epoch: [34][240/588], Time: 3.41, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3937
Epoch: [34][260/588], Time: 3.42, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3582
Epoch: [34][280/588], Time: 3.44, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3384
Epoch: [34][300/588], Time: 3.45, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4356
Epoch: [34][320/588], Time: 3.46, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3542
Epoch: [34][340/588], Time: 3.47, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3339
Epoch: [34][360/588], Time: 3.48, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3346
Epoch: [34][380/588], Time: 3.49, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3600
Epoch: [34][400/588], Time: 3.50, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3118
Epoch: [34][420/588], Time: 3.50, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3470
Epoch: [34][440/588], Time: 3.51, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3397
Epoch: [34][460/588], Time: 3.52, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3387
Epoch: [34][480/588], Time: 3.52, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3524
Epoch: [34][500/588], Time: 3.53, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4512
Epoch: [34][520/588], Time: 3.53, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3266
Epoch: [34][540/588], Time: 3.54, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3696
Epoch: [34][560/588], Time: 3.54, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3651
Epoch: [34][580/588], Time: 3.54, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3262
Epoch: [35][0/588], Time: 12.53, Data: 8.45, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3351
Epoch: [35][20/588], Time: 3.96, Data: 0.46, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3743
Epoch: [35][40/588], Time: 3.77, Data: 0.27, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4320
Epoch: [35][60/588], Time: 3.69, Data: 0.20, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4278
Epoch: [35][80/588], Time: 3.65, Data: 0.17, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3385
Epoch: [35][100/588], Time: 3.64, Data: 0.15, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3342
Epoch: [35][120/588], Time: 3.63, Data: 0.13, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3556
Epoch: [35][140/588], Time: 3.64, Data: 0.12, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3002
Epoch: [35][160/588], Time: 3.64, Data: 0.12, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3477
Epoch: [35][180/588], Time: 3.65, Data: 0.11, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3940
Epoch: [35][200/588], Time: 3.67, Data: 0.11, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4446
Epoch: [35][220/588], Time: 3.66, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3434
Epoch: [35][240/588], Time: 3.66, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3520
Epoch: [35][260/588], Time: 3.65, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4343
Epoch: [35][280/588], Time: 3.64, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3722
Epoch: [35][300/588], Time: 3.64, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4176
Epoch: [35][320/588], Time: 3.63, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3487
Epoch: [35][340/588], Time: 3.63, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3253
Epoch: [35][360/588], Time: 3.64, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4084
Epoch: [35][380/588], Time: 3.64, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3325
Epoch: [35][400/588], Time: 3.65, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3186
Epoch: [35][420/588], Time: 3.65, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4150
Epoch: [35][440/588], Time: 3.66, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3139
Epoch: [35][460/588], Time: 3.66, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3583
Epoch: [35][480/588], Time: 3.66, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3238
Epoch: [35][500/588], Time: 3.66, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3295
Epoch: [35][520/588], Time: 3.66, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3415
Epoch: [35][540/588], Time: 3.67, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3263
Epoch: [35][560/588], Time: 3.67, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3509
Epoch: [35][580/588], Time: 3.67, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3366
Epoch: [36][0/588], Time: 10.77, Data: 6.42, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3352
Epoch: [36][20/588], Time: 3.94, Data: 0.37, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3314
Epoch: [36][40/588], Time: 3.77, Data: 0.22, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3402
Epoch: [36][60/588], Time: 3.69, Data: 0.17, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3341
Epoch: [36][80/588], Time: 3.66, Data: 0.14, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4877
Epoch: [36][100/588], Time: 3.63, Data: 0.13, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3673
Epoch: [36][120/588], Time: 3.62, Data: 0.12, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3727
Epoch: [36][140/588], Time: 3.61, Data: 0.11, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3624
Epoch: [36][160/588], Time: 3.60, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3266
Epoch: [36][180/588], Time: 3.60, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3816
Epoch: [36][200/588], Time: 3.60, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3768
Epoch: [36][220/588], Time: 3.60, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3788
Epoch: [36][240/588], Time: 3.59, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3190
Epoch: [36][260/588], Time: 3.59, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4657
Epoch: [36][280/588], Time: 3.58, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3776
Epoch: [36][300/588], Time: 3.58, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3735
Epoch: [36][320/588], Time: 3.58, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.5781
Epoch: [36][340/588], Time: 3.58, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3493
Epoch: [36][360/588], Time: 3.58, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3581
Epoch: [36][380/588], Time: 3.58, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3131
Epoch: [36][400/588], Time: 3.59, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3763
Epoch: [36][420/588], Time: 3.59, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4228
Epoch: [36][440/588], Time: 3.59, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3265
Epoch: [36][460/588], Time: 3.60, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3011
Epoch: [36][480/588], Time: 3.61, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3431
Epoch: [36][500/588], Time: 3.61, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3653
Epoch: [36][520/588], Time: 3.62, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3498
Epoch: [36][540/588], Time: 3.58, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3986
Epoch: [36][560/588], Time: 3.53, Data: 0.07, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3949
Epoch: [36][580/588], Time: 3.52, Data: 0.07, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3926
Epoch: [37][0/588], Time: 11.21, Data: 7.11, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3325
Epoch: [37][20/588], Time: 3.91, Data: 0.40, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3141
Epoch: [37][40/588], Time: 3.72, Data: 0.24, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3676
Epoch: [37][60/588], Time: 3.67, Data: 0.18, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3666
Epoch: [37][80/588], Time: 3.64, Data: 0.15, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3070
Epoch: [37][100/588], Time: 3.62, Data: 0.13, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4426
Epoch: [37][120/588], Time: 3.61, Data: 0.12, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3601
Epoch: [37][140/588], Time: 3.59, Data: 0.11, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3347
Epoch: [37][160/588], Time: 3.58, Data: 0.11, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3554
Epoch: [37][180/588], Time: 3.58, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3161
Epoch: [37][200/588], Time: 3.58, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3679
Epoch: [37][220/588], Time: 3.58, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.6818
Epoch: [37][240/588], Time: 3.57, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4143
Epoch: [37][260/588], Time: 3.57, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3414
Epoch: [37][280/588], Time: 3.56, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3222
Epoch: [37][300/588], Time: 3.56, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3428
Epoch: [37][320/588], Time: 3.56, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3892
Epoch: [37][340/588], Time: 3.56, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4390
Epoch: [37][360/588], Time: 3.56, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3207
Epoch: [37][380/588], Time: 3.55, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3385
Epoch: [37][400/588], Time: 3.55, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3980
Epoch: [37][420/588], Time: 3.56, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3076
Epoch: [37][440/588], Time: 3.56, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3518
Epoch: [37][460/588], Time: 3.56, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4140
Epoch: [37][480/588], Time: 3.56, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3549
Epoch: [37][500/588], Time: 3.56, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3890
Epoch: [37][520/588], Time: 3.56, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4008
Epoch: [37][540/588], Time: 3.57, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3388
Epoch: [37][560/588], Time: 3.57, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3563
Epoch: [37][580/588], Time: 3.57, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3432
Epoch: [38][0/588], Time: 10.96, Data: 6.66, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3733
Epoch: [38][20/588], Time: 3.91, Data: 0.38, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3046
Epoch: [38][40/588], Time: 3.74, Data: 0.23, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3502
Epoch: [38][60/588], Time: 3.68, Data: 0.17, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4164
Epoch: [38][80/588], Time: 3.65, Data: 0.15, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3567
Epoch: [38][100/588], Time: 3.63, Data: 0.13, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3608
Epoch: [38][120/588], Time: 3.61, Data: 0.12, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3295
Epoch: [38][140/588], Time: 3.61, Data: 0.11, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3155
Epoch: [38][160/588], Time: 3.60, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3477
Epoch: [38][180/588], Time: 3.59, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3236
Epoch: [38][200/588], Time: 3.58, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3473
Epoch: [38][220/588], Time: 3.58, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4402
Epoch: [38][240/588], Time: 3.57, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3466
Epoch: [38][260/588], Time: 3.57, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3302
Epoch: [38][280/588], Time: 3.57, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3508
Epoch: [38][300/588], Time: 3.56, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.5815
Epoch: [38][320/588], Time: 3.56, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3931
Epoch: [38][340/588], Time: 3.56, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4879
Epoch: [38][360/588], Time: 3.56, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3511
Epoch: [38][380/588], Time: 3.56, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4940
Epoch: [38][400/588], Time: 3.55, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3069
Epoch: [38][420/588], Time: 3.55, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4348
Epoch: [38][440/588], Time: 3.55, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3259
Epoch: [38][460/588], Time: 3.55, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.6314
Epoch: [38][480/588], Time: 3.55, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3181
Epoch: [38][500/588], Time: 3.55, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4068
Epoch: [38][520/588], Time: 3.55, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3383
Epoch: [38][540/588], Time: 3.55, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3203
Epoch: [38][560/588], Time: 3.55, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3694
Epoch: [38][580/588], Time: 3.55, Data: 0.07, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3409
Epoch: [39][0/588], Time: 11.27, Data: 6.90, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3762
Epoch: [39][20/588], Time: 3.94, Data: 0.39, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3435
Epoch: [39][40/588], Time: 3.73, Data: 0.23, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3324
Epoch: [39][60/588], Time: 3.66, Data: 0.18, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3914
Epoch: [39][80/588], Time: 3.63, Data: 0.15, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.6171
Epoch: [39][100/588], Time: 3.62, Data: 0.13, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4659
Epoch: [39][120/588], Time: 3.61, Data: 0.12, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3257
Epoch: [39][140/588], Time: 3.60, Data: 0.11, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3264
Epoch: [39][160/588], Time: 3.59, Data: 0.11, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3570
Epoch: [39][180/588], Time: 3.59, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3244
Epoch: [39][200/588], Time: 3.58, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3323
Epoch: [39][220/588], Time: 3.58, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3436
Epoch: [39][240/588], Time: 3.57, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4165
Epoch: [39][260/588], Time: 3.57, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3451
Epoch: [39][280/588], Time: 3.57, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3121
Epoch: [39][300/588], Time: 3.57, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3341
Epoch: [39][320/588], Time: 3.56, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3775
Epoch: [39][340/588], Time: 3.56, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3177
Epoch: [39][360/588], Time: 3.56, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3599
Epoch: [39][380/588], Time: 3.56, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3516
Epoch: [39][400/588], Time: 3.56, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3775
Epoch: [39][420/588], Time: 3.55, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3648
Epoch: [39][440/588], Time: 3.55, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3435
Epoch: [39][460/588], Time: 3.49, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3251
Epoch: [39][480/588], Time: 3.43, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3373
Epoch: [39][500/588], Time: 3.44, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.6936
Epoch: [39][520/588], Time: 3.45, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3529
Epoch: [39][540/588], Time: 3.46, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4356
Epoch: [39][560/588], Time: 3.47, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.6890
Epoch: [39][580/588], Time: 3.47, Data: 0.07, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3572
Epoch: [40][0/588], Time: 10.93, Data: 6.57, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.6221
Epoch: [40][20/588], Time: 3.84, Data: 0.37, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4456
Epoch: [40][40/588], Time: 3.70, Data: 0.22, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3676
Epoch: [40][60/588], Time: 3.63, Data: 0.17, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3674
Epoch: [40][80/588], Time: 3.58, Data: 0.14, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3922
Epoch: [40][100/588], Time: 3.57, Data: 0.13, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3712
Epoch: [40][120/588], Time: 3.56, Data: 0.12, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3604
Epoch: [40][140/588], Time: 3.55, Data: 0.11, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3782
Epoch: [40][160/588], Time: 3.55, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3133
Epoch: [40][180/588], Time: 3.54, Data: 0.10, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3467
Epoch: [40][200/588], Time: 3.53, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3651
Epoch: [40][220/588], Time: 3.54, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4619
Epoch: [40][240/588], Time: 3.53, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3811
Epoch: [40][260/588], Time: 3.53, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.7480
Epoch: [40][280/588], Time: 3.53, Data: 0.09, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.7117
Epoch: [40][300/588], Time: 3.53, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3294
Epoch: [40][320/588], Time: 3.53, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3547
Epoch: [40][340/588], Time: 3.53, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4467
Epoch: [40][360/588], Time: 3.52, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3892
Epoch: [40][380/588], Time: 3.52, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3190
Epoch: [40][400/588], Time: 3.53, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3575
Epoch: [40][420/588], Time: 3.53, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3312
Epoch: [40][440/588], Time: 3.53, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3201
Epoch: [40][460/588], Time: 3.53, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4096
Epoch: [40][480/588], Time: 3.52, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3710
Epoch: [40][500/588], Time: 3.52, Data: 0.08, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.4106
Epoch: [40][520/588], Time: 3.52, Data: 0.07, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3141
Epoch: [40][540/588], Time: 3.52, Data: 0.07, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3097
Epoch: [40][560/588], Time: 3.52, Data: 0.07, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3800
Epoch: [40][580/588], Time: 3.52, Data: 0.07, lr_sound: 1e-05, lr_frame: 1.0000000000000002e-06, loss: 0.3195
Hi! Great work! I am studying your work, but I cannot reproduce the performance reported in your paper. I noticed that the gt_mask in NetWrapper is always a tensor with all elements set to 1. Could you share more training details and your model checkpoint?
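For anyone hitting the same all-ones gt_mask, here is a minimal sketch of a sanity check. It assumes the binary-mask form quoted in this thread, `gt_masks = (spec_clean > 0.5 * spec).float()`, and uses NumPy instead of PyTorch only to keep the snippet dependency-light. The helper names (`binary_gt_mask`, `fraction_of_ones`) and the toy spectrogram values are hypothetical, not taken from the repository, and the mixture magnitude is approximated as the sum of the two source magnitudes purely for illustration.

```python
import numpy as np

def binary_gt_mask(spec_clean, spec_mix):
    """Binary mask: 1.0 where the clean sound exceeds half of the mixture bin."""
    return (spec_clean > 0.5 * spec_mix).astype(np.float32)

def fraction_of_ones(mask):
    """Share of time-frequency bins set to 1 (1.0 means the mask is all ones)."""
    return float(mask.mean())

# Hypothetical toy magnitude spectrograms (frequency x time), strictly positive.
spec_a = np.array([[1.0, 2.0], [3.0, 4.0]])   # the "clean" sound
spec_b = np.array([[2.0, 1.0], [1.0, 5.0]])   # a second sound to mix in

# Case 1: no second sound was mixed, so spec equals spec_clean.
mask_same = binary_gt_mask(spec_a, spec_a)

# Case 2: a second sound is mixed in (mixture magnitude approximated as a sum).
mask_mix = binary_gt_mask(spec_a, spec_a + spec_b)

print(fraction_of_ones(mask_same))  # every bin passes the threshold
print(fraction_of_ones(mask_mix))   # only bins where spec_a > spec_b pass
```

If the fraction logged from the real data loader stays at 1.0 for every batch, that would suggest the mixture and the clean spectrogram are effectively identical, i.e. no second sound is being mixed in before the mask is computed.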