LokiXun opened this issue 1 year ago
Dear @LokiXun, have you solved this problem? The loss increases at about 40k iterations.
Is the loss increase coming from the DCN layer during training?
Yes, and the simple solution is to resume from a checkpoint saved before the crash.
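For anyone else hitting this, a minimal sketch of that rollback in plain PyTorch is below (the filename, keys, and dummy model are placeholders, not E2FGVI's exact trainer format): reload the model and optimizer state from the last checkpoint written before the loss exploded and continue the iteration count from there.

```python
import torch
import torch.nn as nn

# Stand-ins for the real generator and its optimizer.
model = nn.Linear(8, 8)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# --- what the trainer does periodically: save a checkpoint ------------------
torch.save({'model': model.state_dict(),
            'optimizer': optimizer.state_dict(),
            'iteration': 35000}, 'ckpt_35000.pth')  # hypothetical filename

# --- what you do after the crash: roll back and keep training ---------------
ckpt = torch.load('ckpt_35000.pth', map_location='cpu')
model.load_state_dict(ckpt['model'])          # restore network weights
optimizer.load_state_dict(ckpt['optimizer'])  # restore optimizer state (Adam moments, etc.)
start_iter = ckpt['iteration'] + 1            # continue counting from the saved iteration
print('resuming from iteration', start_iter)
```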
Hi, it's an awesome work! May I ask for some help? I ran into problems when training the model on the REDS video dataset: after about 40K iterations, the loss suddenly explodes and the predicted images become unidentifiable.
PS: the loss value shown in the picture is the sum over the last 100 iterations.
In order to run on this dataset, I made the following modifications:

- Set `output_size = (64, 64)` in this line and `small_window_size = (11, 11)` to match the [12, 22, 22, 512] feature coming out of SoftSplit (a quick size check is sketched after this list).
- Set `no_dis: 1` in the config file to disable the adversarial loss and gan_loss; I thought they might make training unstable, so I dropped them.

The prediction at the loss-explosion iteration looks like this. PS: in the first row, the first 7 pictures are local frames and the last 5 are non-local frames; the second row is the corresponding GT; the third row is the model's prediction.
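For reference, here is the quick size check behind the (64, 64) / 22x22 numbers. This is only a sketch under the assumption that SoftSplit unfolds with kernel 7, stride 3, and padding 3; the 128-channel input and the 12-frame batch are illustrative, not taken from the repo.

```python
import torch
import torch.nn as nn

# Assumed SoftSplit unfold parameters (kernel 7, stride 3, padding 3); adjust if your
# config differs. output_size = (64, 64) should then give a 22x22 token grid,
# matching the [12, 22, 22, 512] feature mentioned above.
kernel_size, stride, padding = (7, 7), (3, 3), (3, 3)
output_size = (64, 64)

def token_grid(hw, k, s, p):
    """Spatial size of the token grid produced by nn.Unfold."""
    return tuple((hw[i] + 2 * p[i] - k[i]) // s[i] + 1 for i in range(2))

print(token_grid(output_size, kernel_size, stride, padding))  # (22, 22)

# Dummy check: 12 frames (7 local + 5 non-local), 128 channels (illustrative numbers).
feat = torch.randn(12, 128, *output_size)
tokens = nn.Unfold(kernel_size, stride=stride, padding=padding)(feat)
print(tokens.shape)  # [12, 128*7*7, 22*22]; a linear layer then maps the patch dim to 512

# With a 22x22 grid, small_window_size = (11, 11) tiles it into a 2x2 grid of windows.
```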
Did I mistakenly modify the params in TimeFocalTransformer? Have you had a similar issue, and how did you solve it? Thanks.