keyu-tian / SparK

[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch implementation of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"
https://arxiv.org/abs/2301.03580
MIT License

About Datasets #20

Closed by songbingyue 1 year ago

songbingyue commented 1 year ago

Can I replace ImageNet-1k with another dataset?

keyu-tian commented 1 year ago

Yes. See the tutorial at https://github.com/keyu-tian/SparK/tree/main/pretrain#tutorial-for-pretraining-your-own-dataset.

The only change needed is to replace the function build_dataset_to_pretrain (lines 54-75 of pretrain/utils/imagenet.py) with your own. This function should return a Dataset object. You can use arguments such as args.data_path and args.input_size to help build your dataset, and when running an experiment with main.sh, pass --data_path=... --input_size=... to specify them.