Open jwitos opened 3 years ago
Hey, @jwitos, thanks for the report!
I assume you mean the comments in `dataset.py`? AutoAlbument expects that images and masks returned by that dataset should have the shape [height, width, num_channels]. Then AutoAlbument will create a transformation function using this method. This function contains the `ToTensorV2` transform from Albumentations. The purpose of that transform is to change NumPy array dimensions from [height, width, num_channels] to [num_channels, height, width] and then convert the array to a PyTorch tensor (so basically, to convert a regular NumPy array with an image or mask into the format expected by PyTorch). The dataset implementation should use that transform function for all images and masks that it returns (e.g., https://github.com/albumentations-team/autoalbument/blob/master/examples/pascal_voc/dataset.py#L86).

> Also, I think that comment "If an image contains three color channels" could be rephrased -- it suggests that e.g. single-channel images are accepted, but in fact currently input probably always requires 3 channels.
Yes, I will rephrase it, thanks. In fact, it is possible to use single-channel images, but then you need to define a custom model that works with those single-channel images. I am planning to document such an option.
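As a rough illustration of the layout change described above (using plain NumPy rather than Albumentations; the shapes here are made up), `ToTensorV2` effectively performs this transpose before wrapping the array in a `torch.Tensor`:

```python
import numpy as np

# Hypothetical image in the layout the dataset should return.
image_hwc = np.zeros((256, 512, 3), dtype=np.uint8)   # [height, width, num_channels]

# ToTensorV2 moves the channel axis to the front before creating the tensor;
# with plain NumPy the equivalent layout change is a transpose.
image_chw = np.transpose(image_hwc, (2, 0, 1))        # [num_channels, height, width]

print(image_hwc.shape)  # (256, 512, 3)
print(image_chw.shape)  # (3, 256, 512)
```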
Hi, I noticed two issues with the docs / comments:

1. The docs don't mention the `_target_: autoalbument.faster_autoaugment.models.SemanticSegmentationModel` instruction within the `semantic_segmentation_model` section. Search fails without that line.
2. In the `dataset.py` file, comments say that "`mask` should be a NumPy array with the shape [height, width, num_classes]" and "`image` should be a NumPy array with the shape [height, width, num_channels]". Meanwhile, it looks like channels should come first, i.e. [channels, height, width]; this was the only combination that worked anyway. Also, I think that comment "If an image contains three color channels" could be rephrased -- it suggests that e.g. single-channel images are accepted, but in fact currently input probably always requires 3 channels.

Thanks
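To make the shape contract concrete, here is a minimal sketch of a dataset that returns channels-last arrays, modeled loosely on the pascal_voc example linked above (the class name, sizes, and number of classes are illustrative, not from AutoAlbument):

```python
import numpy as np

class SegmentationDataset:
    """Sketch of a dataset returning [height, width, ...] NumPy arrays.

    `transform` stands in for the function AutoAlbument creates; it ends
    with ToTensorV2, which moves channels first and builds torch tensors.
    """

    def __init__(self, num_samples=4, transform=None):
        self.num_samples = num_samples
        self.transform = transform

    def __len__(self):
        return self.num_samples

    def __getitem__(self, index):
        # Produce channels-last arrays, as the dataset.py comments describe.
        image = np.zeros((128, 128, 3), dtype=np.uint8)    # [H, W, num_channels]
        mask = np.zeros((128, 128, 21), dtype=np.float32)  # [H, W, num_classes]
        if self.transform is not None:
            transformed = self.transform(image=image, mask=mask)
            image, mask = transformed["image"], transformed["mask"]
        return image, mask
```

The channels-first shapes the reporter observed would then be produced by the transform function itself, not by the dataset.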