Open rachelglenn opened 3 months ago
Hi @rachelglenn,
Thank you for reaching out. I'm afraid you may be hitting a DALI limitation. However, before we rule out other issues, please share a simple code snippet that we can run on our end, one that illustrates your approach and reproduces the problem.
Here is what I can put together as an example. I hope that I didn't make any small typos.
import cupy as cp
import imageio
class model_data(NamedTuple):
    image: torch.Tensor
    lable: torch.Tensor
    filename: str
class ExternalInputGpuIterator(object):
    def __init__(self, batch_size):
        self.images_dir = "../../data/images/"
        self.batch_size = batch_size
        with open(self.images_dir + "file_list.txt", "r") as f:
            self.files = [line.rstrip() for line in f if line != ""]
        shuffle(self.files)
    def __iter__(self):
        self.i = 0
        self.n = len(self.files)
        return self
    def __next__(self):
        batch = []
        labels = []
        filenames = []
        for _ in range(self.batch_size):
            jpeg_filename, label = self.files[self.i].split(" ")
            im = imageio.imread(self.images_dir + jpeg_filename)
            im = cp.asarray(im)
            im = im * 0.6
            self.i = (self.i + 1) % self.n
            model_data(im.astype(cp.uint8), cp.array([label], dtype=np.uint8), self.files[self.i].split(" "))
            batch.append(model_data)
        return batch
eii_gpu = ExternalInputGpuIterator(batch_size)
pipe_gpu = Pipeline(batch_size=batch_size, num_threads=2, device_id=0)
with pipe_gpu:
    model_data = fn.external_source(source=eii_gpu, device="gpu", )
    model_data.image= fn.brightness_contrast(model_data.image, contrast=2)
    pipe_gpu.set_outputs(model_data)
train_loader = DALIGenericIterator(pipeline, ["model_data"])
Hi @rachelglenn,
Thank you for providing the code snippet. However, I get multiple errors running it. Can you please check it on your end?
Yes, I am not surprised. I am not able to get it to work, which is why I am asking for help with using a NamedTuple as the data type for the pipeline. Can you provide an example that uses:
class model_data(NamedTuple):
    image: torch.Tensor
    lable: torch.Tensor
    filename: str
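For reference, one possible way to keep a NamedTuple at the model boundary is to let the pipeline produce plain named outputs and rebuild the tuple afterwards. The sketch below assumes the iterator yields a list with one dict per pipeline, keyed by the names passed as `output_map`, which is how `DALIGenericIterator` is documented to behave. `ModelData`, `wrap_batches` and `fake_iterator` are hypothetical names introduced here so the snippet runs without DALI or a GPU, and plain lists stand in for tensors.

```python
from typing import Any, Dict, Iterable, Iterator, List, NamedTuple


class ModelData(NamedTuple):
    image: Any  # would be a torch.Tensor in a real training loop
    label: Any  # would be a torch.Tensor in a real training loop


def wrap_batches(iterator: Iterable[List[Dict[str, Any]]]) -> Iterator[ModelData]:
    """Convert each per-pipeline output dict into a ModelData tuple."""
    for outputs in iterator:
        # Assumption: one dict per pipeline, so a single pipeline is index 0.
        out = outputs[0]
        yield ModelData(image=out["image"], label=out["label"])


def fake_iterator() -> Iterator[List[Dict[str, Any]]]:
    """Hypothetical stand-in that mimics the assumed iterator output shape."""
    yield [{"image": [[0, 1], [2, 3]], "label": [7]}]
    yield [{"image": [[4, 5], [6, 7]], "label": [9]}]


batches = list(wrap_batches(fake_iterator()))
print(batches[0].label)
```

The `filename` field is left out of this sketch because pipeline outputs are numeric tensors; carrying an integer index through the pipeline and looking the name up afterwards would be one possible workaround.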
@rachelglenn,
I get errors not related to the issue you raised, for example:
class model_data(NamedTuple):
NameError: name 'NamedTuple' is not defined
After adding:
from collections import namedtuple
import torch
I get
class model_data(namedtuple):
TypeError: function() argument 'code' must be code, not str
and I'm not sure if I'm running the same code as you anymore. Please update the provided snippet so that it reproduces the error you mentioned.
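Both errors above come from how the tuple type is imported rather than from DALI. A minimal sketch of the difference, using only the standard library: the class syntax needs `typing.NamedTuple`, while `collections.namedtuple` is a factory function that is called rather than subclassed. The names `ModelData`, `ModelDataAlt` and `Broken` are illustrative, and `Any` stands in for `torch.Tensor` so the snippet runs without PyTorch.

```python
from collections import namedtuple
from typing import Any, NamedTuple


# Class syntax requires typing.NamedTuple (capital N, capital T).
class ModelData(NamedTuple):
    image: Any
    label: Any
    filename: str


# collections.namedtuple is a factory function: call it, do not subclass it.
ModelDataAlt = namedtuple("ModelDataAlt", ["image", "label", "filename"])

# Subclassing the factory function fails when the class is created.
try:
    class Broken(namedtuple):  # same mistake as in the traceback above
        pass
    subclass_failed = False
except TypeError:
    subclass_failed = True

sample = ModelData(image=[1, 2], label=[0], filename="a.jpg")
alt = ModelDataAlt(image=[1, 2], label=[0], filename="a.jpg")
print(sample.filename, subclass_failed)
```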
Describe the question.
I am following the external input example for the DALI loader. The data type going into my model is a NamedTuple. When I try to create the dataloader:
dataloader = DALIGenericIterator(pipeline, ["image"])
I get an error associated with my NamedTuple type:
TypeError: Illegal pipeline output type. The output 0 contains a nested DataNode. Missing list/tuple expansion (*) is the likely cause.
I am not sure how the DALI loader can accept a NamedTuple type. Is it possible? I am also not sure what to pass as the second argument when creating the dataloader iterator (DALIGenericIterator).
Thanks for the help.
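The "missing list/tuple expansion" hint in the error suggests that each pipeline output has to be passed separately rather than as one nested object. As a rough illustration, assuming the source is asked for several outputs, it would return one batch per output instead of one list of per-sample tuples. The sketch below shows only that transposition in plain Python; `Sample` and `to_per_output_batches` are hypothetical names, lists stand in for arrays, and the DALI wiring in the trailing comments is an untested outline.

```python
from typing import Any, List, NamedTuple, Tuple


class Sample(NamedTuple):
    image: Any
    label: Any


def to_per_output_batches(samples: List[Sample]) -> Tuple[List[Any], List[Any]]:
    """Turn [Sample, Sample, ...] into (all images, all labels)."""
    images, labels = zip(*samples)  # tuple expansion across samples
    return list(images), list(labels)


samples = [Sample(image=[[1]], label=[0]), Sample(image=[[2]], label=[1])]
image_batch, label_batch = to_per_output_batches(samples)
print(label_batch)

# Untested outline of how this might be wired into DALI (assumption):
#   images, labels = fn.external_source(source=..., num_outputs=2, device="gpu")
#   pipe.set_outputs(images, labels)
#   loader = DALIGenericIterator(pipe, ["image", "label"])
```

With this layout the second argument of `DALIGenericIterator` would list one name per pipeline output, in the same order as the arguments to `set_outputs`.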