libffcv / ffcv-imagenet

Train ImageNet *fast* in 500 lines of code with FFCV
Apache License 2.0

A complete example for imagenet data loading #2

Closed gd-zhang closed 2 years ago

gd-zhang commented 2 years ago

I've been trying to use your FFCV data loader for ImageNet training. I find the provided example hard to follow because it uses progressive resizing. Could you provide a complete example at the most commonly used resolution, 224px?

I have also coded it up myself, but in my case the validation accuracy is significantly lower than the training accuracy (see the code snippet below). For example, after 3 epochs the training accuracy is around 40%, but the validation accuracy is only 15%.

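from pathlib import Path
from typing import List

import numpy as np
from ffcv.fields.basics import IntDecoder
from ffcv.fields.rgb_image import (CenterCropRGBImageDecoder,
                                   RandomResizedCropRGBImageDecoder)
from ffcv.loader import Loader, OrderOption
from ffcv.pipeline.operation import Operation
from ffcv.transforms import (NormalizeImage, RandomHorizontalFlip, Squeeze,
                             ToDevice, ToTensor, ToTorchImage)

# IMAGENET_MEAN and IMAGENET_STD are defined elsewhere (not shown in this snippet).

def get_ffcv_trainloader(train_dataset, device, batch_size, num_workers=12, in_memory=True):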
    train_path = Path(train_dataset)
    assert train_path.is_file()

    decoder = RandomResizedCropRGBImageDecoder((224, 224))
    image_pipeline: List[Operation] = [
        decoder,
        RandomHorizontalFlip(),
        ToTensor(),
        ToDevice(device, non_blocking=True),
        ToTorchImage(),
        NormalizeImage(IMAGENET_MEAN, IMAGENET_STD, np.float32)
    ]

    label_pipeline: List[Operation] = [
        IntDecoder(),
        ToTensor(),
        Squeeze(),
        ToDevice(device, non_blocking=True)
    ]

    order = OrderOption.QUASI_RANDOM
    loader = Loader(train_dataset,
                    batch_size=batch_size,
                    num_workers=num_workers,
                    order=order,
                    os_cache=in_memory,
                    drop_last=True,
                    pipelines={
                        'image': image_pipeline,
                        'label': label_pipeline
                    })

    return loader

def get_ffcv_valloader(val_dataset, device, batch_size, num_workers=12):
    val_path = Path(val_dataset)
    assert val_path.is_file()
    cropper = CenterCropRGBImageDecoder((224, 224), ratio=224/256)
    image_pipeline = [
        cropper,
        ToTensor(),
        ToDevice(device, non_blocking=True),
        ToTorchImage(),
        NormalizeImage(IMAGENET_MEAN, IMAGENET_STD, np.float32)
    ]

    label_pipeline = [
        IntDecoder(),
        ToTensor(),
        Squeeze(),
        ToDevice(device, non_blocking=True)
    ]

    loader = Loader(val_dataset,
                    batch_size=batch_size,
                    num_workers=num_workers,
                    order=OrderOption.SEQUENTIAL,
                    drop_last=False,
                    pipelines={
                        'image': image_pipeline,
                        'label': label_pipeline
                    })
    return loader
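
For reference, the validation crop in get_ffcv_valloader can be sketched in plain Python. This assumes that CenterCropRGBImageDecoder takes a centered square whose side is `ratio` times the shorter image side and then resizes it to the requested output size; `center_crop_box` is a hypothetical helper written for illustration, not an FFCV function.

```python
def center_crop_box(height, width, ratio):
    """Return (top, left, side) of a centered square crop (illustrative helper)."""
    # Side of the square: a fixed fraction of the shorter image dimension.
    side = int(ratio * min(height, width))
    # Offsets that center the square inside the image.
    top = (height - side) // 2
    left = (width - side) // 2
    return top, left, side


# Example: a 375x500 (height x width) image with the ratio used above.
box = center_crop_box(375, 500, ratio=224 / 256)
```

With ratio=224/256 the crop keeps the central 87.5% of the shorter side, which corresponds to the usual "resize to 256, center-crop 224" evaluation setup.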

lengstrom commented 2 years ago

Hi, you can pass the parameters --resolution.min_res=224 --resolution.max_res=224 to the script and you will get 224px training.
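
A rough sketch of why these two flags turn off progressive resizing: if the training resolution is ramped linearly from min_res to max_res over a range of epochs, setting both to 224 makes the schedule constant. The function below is an illustration under that assumption, not necessarily the script's exact code; the name `get_resolution`, the `start_ramp`/`end_ramp` parameters, and the snapping to a multiple of 32 are assumptions.

```python
def get_resolution(epoch, min_res, max_res, start_ramp, end_ramp):
    """Image side length for a given epoch under a linear ramp (illustrative)."""
    assert min_res <= max_res
    if epoch <= start_ramp:
        return min_res
    if epoch >= end_ramp:
        return max_res
    # Linearly interpolate between min_res and max_res,
    # then snap to the nearest multiple of 32.
    frac = (epoch - start_ramp) / (end_ramp - start_ramp)
    interp = min_res + frac * (max_res - min_res)
    return int(round(interp / 32)) * 32


# With min_res == max_res == 224, every epoch uses 224px images.
schedule = [get_resolution(e, 224, 224, start_ramp=2, end_ramp=8)
            for e in range(10)]
```

Because the interpolation endpoints coincide, the ramp parameters have no effect and the resolution stays at 224 for the whole run.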

I'm not sure what is going wrong in the script you've posted, but these settings will certainly work with the ffcv training script in this repo. If you have data loading issues, please open an issue on the ffcv repository.