MationPlays opened this issue 2 years ago (status: Open)
I found this repo, https://github.com/awaelchli/stylegan2-pytorch-lightning/blob/master/train.py, which implements StyleGAN2 in Lightning. Can I just put the Parquet dataset in the dataloader position in train.py, like this?
from transformers import AutoTokenizer

# The keyword for selecting the fast tokenizer is `use_fast`, not `fast`.
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased", use_fast=True)

def process_rows(columns_dict):
    tokens = tokenizer(columns_dict["readme"], padding=True, truncation=True, return_tensors="pt").data
    columns_dict.update(tokens)
    # Remove unwanted columns
    for column in list(columns_dict):
        if column not in ["token_type_ids", "attention_mask", "input_ids", "target"]:
            columns_dict.pop(column)
    return columns_dict
def train_dataloader(self):
    args = self.hparams
    transform = transforms.Compose(
        [
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5), inplace=True),
        ]
    )
    # ADDED PARQUET DATALOADER CODE
    dataset = IterableParquetDataset("example.parquet", process_rows)
    dataloader = DataLoader(dataset, num_workers=4)
    # dataset = MultiResolutionDataset(args.path, transform, args.size)
    # dataloader = data.DataLoader(
    #     dataset,
    #     shuffle=True,
    #     batch_size=args.batch_size,
    #     drop_last=True,
    #     num_workers=args.num_workers,
    # )
    return dataloader
Hello, I want to use Parquet files with the stylegan2-ada-pytorch implementation. Do I first have to reimplement StyleGAN2 as a Lightning module so that I can use this dataloader? I have never used Lightning, and the repo structure of StyleGAN2 is not that easy to follow. Really, I just want StyleGAN to accept Parquet as a data source without using petastorm.