mapbox / robosat

Semantic segmentation on aerial and satellite imagery. Extracts features such as: buildings, parking lots, roads, water, clouds
MIT License

Augmented data is stopping weight calculation and training. #224

Open manapshymyr-OB opened 2 years ago

manapshymyr-OB commented 2 years ago

@daniel-j-h Hello! I am working with Sentinel-2 data and successfully trained a model with 0.56 accuracy. To increase the amount of training data, I am now trying data augmentation (rotation, zooming). I rotated and zoomed the existing images, and applied the same transforms to the labels, using the Keras library (ImageDataGenerator). The new tiles have appropriate z/x/y names (continuing from the last x number). When the weight calculation reaches the new data, I get:

Traceback (most recent call last):
  File "weights.py", line 69, in <module>
    main(args.dataset)
  File "weights.py", line 47, in main
    counts += np.bincount(image.ravel(), minlength=num_classes)
ValueError: operands could not be broadcast together with shapes (2,) (142,) (2,)

What could be the reason, and how should I solve it? I also noticed that the original data contains only binary values (checked with print(image)), while the new data has values up to 255. What is the problem?
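
The shapes in the error message can be reproduced with NumPy alone, which may help narrow the problem down. The sketch below is not taken from robosat: the arrays are made-up stand-ins for a label tile, and the threshold of 127 used to re-binarize is an assumption that depends on how the augmented masks were written. It shows that `np.bincount` returns an array of length `max value + 1`, so a mask containing the value 141 yields 142 bins, which cannot be added in place to a 2-element counter.

```python
import numpy as np

num_classes = 2

# A clean binary mask: only the class indices 0 and 1 appear.
clean = np.array([[0, 1], [1, 0]], dtype=np.uint8)

# A hypothetical augmented mask: resampling during rotation or zoom
# has produced in-between values (98 and 141) that are not class indices.
augmented = np.array([[0, 141], [98, 0]], dtype=np.uint8)

counts = np.zeros(num_classes, dtype=np.int64)
counts += np.bincount(clean.ravel(), minlength=num_classes)  # length 2, works

try:
    # bincount returns 142 bins here because the largest value is 141.
    counts += np.bincount(augmented.ravel(), minlength=num_classes)
    failed = False
except ValueError as error:
    failed = True
    print(error)

# One possible repair, assuming values above 127 mean "foreground":
# map every pixel back to a valid class index before counting.
fixed = (augmented > 127).astype(np.uint8)
counts += np.bincount(fixed.ravel(), minlength=num_classes)
```

Thresholding after the fact is only a workaround. Resampling the label images with nearest-neighbour interpolation during augmentation would avoid creating intermediate values in the first place.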

When I start training without calculating weights, I get:

/pytorch/aten/src/THC/THCTensorScatterGather.cu:188: void THCudaTensor_scatterFillKernel(TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, Real, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = -1]: block: [41,0,0], thread: [98,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/src/app/robosat/tools/__main__.py", line 58, in <module>
    args.func(args)
  File "/usr/src/app/robosat/tools/train.py", line 129, in main
    train_hist = train(train_loader, num_classes, device, net, optimizer, criterion)
  File "/usr/src/app/robosat/tools/train.py", line 185, in train
    loss = criterion(outputs, masks)
  File "/opt/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/src/app/robosat/losses.py", line 106, in forward
    errors_sorted, indices = torch.sort(max_margin_errors, descending=True)
RuntimeError: merge_sort: failed to synchronize: device-side assert triggered

daniel-j-h commented 2 years ago

Probably an issue in your image dataset.

Please don't @ tag me. I'm no longer with Mapbox, and I'm not working on or maintaining this project anymore.

Thanks.
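
One way to test whether the dataset is at fault is to scan every label mask for values that are not valid class indices. The assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` in the training log is consistent with such values reaching the loss, although that is an inference rather than something confirmed in this thread. The helper below is a hypothetical sketch, not part of robosat; it works on masks already loaded as NumPy arrays and leaves reading the tiles from disk to the caller.

```python
import numpy as np


def out_of_range_values(mask, num_classes):
    """Return the sorted unique values in `mask` that are not valid class indices."""
    values = np.unique(mask)
    invalid = values[(values < 0) | (values >= num_classes)]
    return invalid.tolist()


# Made-up example tiles keyed by an illustrative name.
masks = {
    "original_tile": np.array([[0, 1], [1, 0]], dtype=np.uint8),
    "augmented_tile": np.array([[0, 255], [141, 1]], dtype=np.uint8),
}

# Collect only the tiles that contain at least one invalid value.
report = {}
for name, mask in masks.items():
    bad = out_of_range_values(mask, num_classes=2)
    if bad:
        report[name] = bad

print(report)
```

Any tile that shows up in the report would need its mask regenerated or re-binarized before the weight calculation or training is run again.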