Closed alphinside closed 4 years ago
do you mean image padding or batch padding? doesn't the torch dataloader's padding behaviour depend on collate_fn? @alphinside
image padding. in the current implementation the default behavior is automatic padding to square, so in collate_fn you don't need to resize or pad anymore in order to stack the image data with torch
isn't image padding performed by augments/standard_augments in the wrapper, not the dataloader?
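For readers unfamiliar with the default behavior described here, a minimal sketch of "pad to square" follows. It uses nested lists in place of tensors so it runs without torch, and it assumes the original content stays in the top-left corner and the fill value is 0; the actual placement and fill used by the dataset wrapper are not stated in this thread, and `pad_to_square` is a hypothetical name.

```python
from typing import List

Image = List[List[int]]  # a single-channel image as rows of pixel values


def pad_to_square(image: Image, fill: int = 0) -> Image:
    """Pad the shorter side with `fill` so that height == width.

    Assumption: content is kept in the top-left corner.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    side = max(height, width)
    # Extend each existing row to the right, then append full rows at the bottom.
    padded = [row + [fill] * (side - width) for row in image]
    padded += [[fill] * side for _ in range(side - height)]
    return padded


wide = [[1, 2, 3],
        [4, 5, 6]]          # 2 rows x 3 columns
print(pad_to_square(wide))  # [[1, 2, 3], [4, 5, 6], [0, 0, 0]]
```

Once every sample is square (and, presumably, resized to one common side length), all images in a batch share a shape, which is what lets a collate function stack them directly.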
it is in dataset wrapper actually, probably you mean in create_dataloader or create_dataset?
yes, correct, however note that we also have DALILoader in which the augmentation is not done on the dataset object
note that in PytorchDataLoader the augmentations are in the dataset object, while in DALIDataLoader the augmentations are in the pipeline
so it's in create_dataloader, right?
but after looking at it more closely, yes, we can put the arguments in the dataset object
basically yes, because create_dataset is also invoked in create_dataloader, and disable_autopad needs to be forwarded to create_dataset
well, I think it should be in create_dataset; putting image padding in create_dataloader doesn't seem right
anyway, what has been discussed up to this point is implementation detail, which is quite straightforward; the real subject of discussion is where this disable_auto_pad value comes from. In my proposal I prefer to put it in preprocess_args when the developer develops the model, so the developer is fully aware that they are disabling the autopad and must implement their own mechanism in collate_fn. cc @alifahrri @triwahyuu
but how do we forward the information to create_dataset, considering it is called inside create_dataloader?
I'm fine with your proposal
wait, turns out we have create_dataloader and create_loader
but yeah preprocess_args is visible to dataset
Actually preprocess_args is not modified by the model_components, so my previous proposal is invalid. So yeah, still looking for an idea on how to forward information from the developer to the create_dataset function
it is not returned, but it is mutable btw
yeah you're correct
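The point about mutability can be illustrated with a plain dict. This is only a sketch: `model_components` here is a hypothetical stand-in function, and the key names are made up for illustration, not taken from the codebase.

```python
def model_components(preprocess_args: dict) -> None:
    """Hypothetical stand-in: mutates its argument and returns nothing."""
    preprocess_args["disable_image_auto_padding"] = True


preprocess_args = {"input_size": 640}
model_components(preprocess_args)

# The caller sees the change even though nothing was returned,
# because both names refer to the same dict object.
print(preprocess_args)
# {'input_size': 640, 'disable_image_auto_padding': True}
```

This only works as long as the same object is passed through; if any layer in between copies the arguments, the mutation is lost, which makes it a fragile channel for forwarding the flag.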
so it is still possible right?
yes, possible, but we need to add args to create_dataloader
and some modifications to model_components, I think. currently there's no way to forward information from the developer's side while keeping a sensible set of arguments. To achieve this, we need to move create_dataset and create_collater out of create_dataloader, and from that point we can modify that.
So I think this is my workaround proposal:
dataset = create_dataset(dataset_config=dataset_config,
                         stage=stage,
                         preprocess_config=preprocess_config,
                         wrapper_format=wrapper_format[dataloader_module])
if isinstance(collate_fn, str):
    # Use the collater args from the config when present, otherwise start empty
    try:
        collater_args = dataloader_config.collater.args
    except (AttributeError, KeyError):
        collater_args = {}
    collater_args['dataformat'] = dataset.data_format
    collate_fn = create_collater(collate_fn, **collater_args)
# Re-initialize dataset (Temp workaround)
# The dataset is built twice because the collater needs dataset.data_format first
if hasattr(collate_fn, 'disable_image_auto_padding'):
    dataset = create_dataset(dataset_config=dataset_config,
                             stage=stage,
                             preprocess_config=preprocess_config,
                             wrapper_format=wrapper_format[dataloader_module],
                             disable_image_auto_padding=collate_fn.disable_image_auto_padding)
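To make the developer's side of this workaround concrete, here is a minimal sketch of how a collater could carry the flag. The class and helper names (`PaddingAwareCollate`, `default_collate`, `wants_auto_padding_disabled`) are hypothetical; only the attribute name `disable_image_auto_padding` comes from the snippet above.

```python
class PaddingAwareCollate:
    """Hypothetical collater that pads inside __call__, so it opts out of auto-padding."""

    disable_image_auto_padding = True

    def __call__(self, batch):
        # The developer's own padding/mask logic would go here.
        return batch


def default_collate(batch):
    """A plain function carries no opt-out attribute, so auto-padding stays on."""
    return batch


def wants_auto_padding_disabled(collate_fn) -> bool:
    # getattr with a default covers both "attribute missing" and "attribute set to False".
    return bool(getattr(collate_fn, "disable_image_auto_padding", False))


print(wants_auto_padding_disabled(PaddingAwareCollate()))  # True
print(wants_auto_padding_disabled(default_collate))        # False
```

Keeping the flag on the collater means the developer who writes the custom padding is also the one who switches the automatic padding off, which matches the intent discussed earlier in the thread.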
Is your feature request related to a problem? Please describe. In issue #43, DETR training uses its own mechanism to pad the images and build the mask. Having this option means we can follow that mechanism.
Describe the solution you'd like. Still looking for a way for the developer to provide information to the 'create_dataset' function.
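As a rough illustration of the DETR-style mechanism referred to here, the sketch below pads every image in a batch to the largest height and width in that batch and builds a mask that is True on padded pixels. It uses nested lists instead of tensors so it runs without torch; `pad_batch_with_mask` is a hypothetical name, and this is not the code from issue #43.

```python
from typing import List, Tuple

Image = List[List[int]]   # single-channel image as rows of pixel values
Mask = List[List[bool]]   # True marks a padded (invalid) pixel


def pad_batch_with_mask(images: List[Image], fill: int = 0) -> Tuple[List[Image], List[Mask]]:
    """Pad each image to the batch-wide max height/width, content kept top-left."""
    max_h = max(len(img) for img in images)
    max_w = max(len(img[0]) for img in images)
    padded, masks = [], []
    for img in images:
        h, w = len(img), len(img[0])
        # Pad rows on the right, then add full padding rows at the bottom.
        rows = [row + [fill] * (max_w - w) for row in img]
        rows += [[fill] * max_w for _ in range(max_h - h)]
        # A pixel is padding when it lies outside the original h x w region.
        mask = [[not (y < h and x < w) for x in range(max_w)] for y in range(max_h)]
        padded.append(rows)
        masks.append(mask)
    return padded, masks


batch, masks = pad_batch_with_mask([[[1, 2]], [[3], [4]]])
print(batch)  # [[[1, 2], [0, 0]], [[3, 0], [4, 0]]]
print(masks)  # [[[False, False], [True, True]], [[False, True], [False, True]]]
```

Because the padding amount depends on the other images in the batch, this has to happen in collate_fn, which is why square auto-padding in the dataset would get in the way.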