Closed vict0rsch closed 5 years ago
To reproduce:
config/test_run.json:
{
"model": {
"n_blocks": 5,
"filter_factors": null,
"kernel_size": 3,
"dropout": 0.75,
"Cin": 42,
"Cout": 3,
"disc_size": 60
},
"train": {
"n_epochs": 100,
"lr_d": 0.0002,
"lr_g": 0.00005,
"lambda_gan": 0.01,
"lambda_L": 1,
"batch_size": 12,
"n_epoch_regress": 100,
"n_epoch_gan": 250,
"datapath": "$DATADIR",
"n_in_mem": 20,
"early_break_epoch": 0,
"load_limit": -1,
"num_workers": 3,
"num_D_accumulations": 1,
"matching_loss": "l2"
}
}
$ export DATADIR=/network/tmp1/schmidtv/clouds500
$ ipython
In [1]: run src/train.py -c test_run -o . --no_exp
... stop training after a few steps with ctrl+c
In [2]: for batch in trainer.trainloader:
...: break
In [3]: batch["real_imgs"].sum()
This must have to do with:
coords[np.isnan(coords)] = 0.0
coords[np.isinf(coords)] = 0.0
real_imgs[np.isnan(real_imgs)] = 0.0
real_imgs[np.isinf(real_imgs)] = 0.0
metos[np.isnan(metos)] = 0.0
metos[np.isinf(metos)] = 0.0
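For reference, a minimal sketch of what the masking lines above are meant to achieve, using `np.nan_to_num` as an equivalent one-call alternative. The helper name `zero_non_finite` and the sample array are made up for illustration and are not taken from the repository.

```python
import numpy as np


def zero_non_finite(arr):
    """Return a copy of arr with NaN and +/-inf replaced by 0.0.

    Hypothetical helper: behaves like the isnan/isinf masking above,
    but does not modify the input in place.
    """
    return np.nan_to_num(arr, nan=0.0, posinf=0.0, neginf=0.0)


# Made-up data standing in for real_imgs.
real_imgs = np.array([[1.0, np.nan], [np.inf, -2.0]])

# A single NaN is enough to turn the whole sum into NaN.
print(np.isnan(real_imgs.sum()))

cleaned = zero_non_finite(real_imgs)
print(np.isfinite(cleaned).all(), cleaned.sum())
```

If the masking step is skipped on any code path (for example after an incomplete merge), one non-finite pixel would be enough to make `.sum()` on the whole batch return `nan`.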
It seems the merge hasn't resolved all conflicts. I will handle this.
@vict0rsch does this happen with with_stats=True?
It does not happen with with_stats=False.
I have tested the images with with_stats=True, both within the training loop and with the test below, and the image is there. Can you check it again?
from src.preprocessing import Rescale
from src.data import EarthData
import torch
data_path = "/scratch/sankarak/data/clouds/"
trainset = EarthData(
data_path,
transform=Rescale(data_path, n_in_mem=20, num_workers=12),
n_in_mem=50
)
trainloader = torch.utils.data.DataLoader(
trainset,
batch_size=50,
shuffle=False,
num_workers=3,
)
batch_0 = next(iter(trainloader))["real_imgs"][0]
print("test value =", batch_0.sum())
-------
output --> (tensor(-6381.4736))
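The test above only sums the first image of the batch, so a finite result there does not rule out bad values in other samples or other keys. A rough sketch of a per-sample check follows. It assumes the batch is a dict whose values can be converted to NumPy arrays (e.g. via `.numpy()` for CPU tensors); the function name `non_finite_report` and the toy batch are hypothetical.

```python
import numpy as np


def non_finite_report(batch):
    """Map each key to the indices of samples that contain NaN or inf."""
    report = {}
    for key, values in batch.items():
        arr = np.asarray(values, dtype=float)
        # Flatten everything except the batch dimension, then test per sample.
        per_sample_ok = np.isfinite(arr.reshape(arr.shape[0], -1)).all(axis=1)
        bad = ~per_sample_ok
        if bad.any():
            report[key] = np.flatnonzero(bad).tolist()
    return report


# Toy batch with made-up values; the key names mirror those in the thread.
batch = {
    "real_imgs": np.array([[1.0, 2.0], [np.nan, 0.5], [3.0, 4.0]]),
    "metos": np.array([[0.1], [0.2], [np.inf]]),
    "coords": np.zeros((3, 2)),
}
print(non_finite_report(batch))
```

An empty dict means every sample in every key is finite; otherwise the listed indices point at the samples worth inspecting.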
Maybe it has to do with the data I have, because on the subset it does not work :(
Investigating, I got this:
batch["real_imgs"].sum()