airctic / icevision

An Agnostic Computer Vision Framework - Pluggable to any Training Library: Fastai, Pytorch-Lightning with more to come
https://airctic.github.io/icevision/
Apache License 2.0

ValueError: Caught ValueError in DataLoader worker process 0. #365

Closed: arhamzamindar closed this issue 4 years ago


arhamzamindar commented 4 years ago

I am using this project for a resume parser with 15k images. When using the VOC format and training, I get this error in lr_find.

arhamzamindar commented 4 years ago

Also, without doing any processing or augmentation on the images, I get this error: "RuntimeError: DataLoader worker (pid(s) 1983, 1984, 1985, 1986) exited unexpectedly"

oke-aditya commented 4 years ago

@arhamzamindar Are you running on CPU? Maybe setting num_workers=0 will help.

Also, please share the entire stack trace and your system details, and we can debug it further 👍
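
As a side note, a minimal, self-contained sketch of why num_workers=0 helps with debugging (plain PyTorch, nothing icevision-specific; the toy dataset is made up):

```
import torch
from torch.utils.data import DataLoader, Dataset

class Toy(Dataset):
    def __len__(self):
        return 4

    def __getitem__(self, i):
        if i == 2:
            raise ValueError("bad sample")  # stand-in for an invalid bbox
        return torch.tensor(i)

# With num_workers=0 the data is loaded in the main process, so the real
# exception surfaces directly instead of being re-raised as
# "Caught ValueError in DataLoader worker process 0".
for batch in DataLoader(Toy(), batch_size=2, num_workers=0):
    pass
```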

arhamzamindar commented 4 years ago

Yes, I am running on CPU. I tried using num_workers=0; it trained for some time and then I got this error: "ValueError: x_max is less than or equal to x_min for bbox (0.7262745098039216, 0.34545454545454546, 0.05215686274509804, 0.26, 2)."

I am using this code to create the resume parser:

```
from mantisshrimp.all import *

# WARNING: Make sure you have already cloned the raccoon dataset using the
# command shown here above

# Set images and annotations directories
data_dir = Path('/home/arham/object-detection-for-resume-using-mantisshrimp/data')
images_dir = data_dir / "IMG"
annotations_dir = data_dir / "ann"

# Define class_map
class_map = ClassMap(["education", "work experience", "projects",
                      "achievements", "certificates", "background"])

# Parser: use the mantisshrimp predefined VOC parser
parser = parsers.voc(annotations_dir=annotations_dir, images_dir=images_dir,
                     class_map=class_map)

# Train and validation records
data_splitter = RandomSplitter([0.8, 0.2])
train_records, valid_records = parser.parse(data_splitter)

show_records(train_records[:3], ncols=3)  # `ncols` assumed; this line was truncated in the post

# `size`, `train_tfms`, and `valid_tfms` are defined earlier in the notebook (not shown)
train_ds = Dataset(train_records, train_tfms)
valid_ds = Dataset(valid_records, valid_tfms)

train_dl = efficientdet.train_dl(train_ds, batch_size=16, num_workers=0, shuffle=True)
valid_dl = efficientdet.valid_dl(valid_ds, batch_size=16, num_workers=0, shuffle=False)

model = efficientdet.model('tf_efficientdet_lite0', num_classes=len(class_map), img_size=size)

metrics = [COCOMetric(metric_type=COCOMetricType.bbox)]

learn = efficientdet.fastai.learner(dls=[train_dl, valid_dl], model=model, metrics=metrics)
learn.freeze()
learn.lr_find()
```

And after this I get the error above.

My system details: Ubuntu 18 on WSL1, 8 GB RAM, NVIDIA GTX 1050 GPU (4 GB).

Please help me further with my problem.

oke-aditya commented 4 years ago

Your dataset labels should be (xmin, ymin, xmax, ymax),

(0.7262745098039216, 0.34545454545454546, 0.05215686274509804, 0.26)

which isn't the case here; it looks like xmax, ymax, xmin, ymin. I guess you need to re-order them in the parser.

Also, you should probably use a GPU, and check that it is actually being used. It would take a lot of time on CPU.
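
One way to locate the offending annotations is a quick scan over the XML files; a minimal sketch, assuming PASCAL-VOC-style files (the directory path comes from the code above):

```
import xml.etree.ElementTree as ET
from pathlib import Path

annotations_dir = Path("data/ann")  # adjust to your annotations directory
for xml_file in annotations_dir.glob("*.xml"):
    root = ET.parse(xml_file).getroot()
    for obj in root.iter("object"):
        b = obj.find("bndbox")
        xmin, ymin = float(b.find("xmin").text), float(b.find("ymin").text)
        xmax, ymax = float(b.find("xmax").text), float(b.find("ymax").text)
        if xmax <= xmin or ymax <= ymin:
            print(f"{xml_file.name}: invalid bbox ({xmin}, {ymin}, {xmax}, {ymax})")
```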

arhamzamindar commented 4 years ago

I am using the VOC format in this notebook, but I have the CSV file and it has the correct form (xmin, ymin, xmax, ymax). I think the annotation was not done in the correct form; maybe I'll have to remove the images where there is such a problem.

arhamzamindar commented 4 years ago

> Your dataset labels should be (xmin, ymin, xmax, ymax),
>
> (0.7262745098039216, 0.34545454545454546, 0.05215686274509804, 0.26)
>
> which isn't the case here; it looks like xmax, ymax, xmin, ymin. I guess you need to re-order them in the parser.
>
> Also, you should probably use a GPU, and check that it is actually being used. It would take a lot of time on CPU.

@oke-aditya I removed the files where xmin was greater than xmax (and similarly for ymin and ymax), but I still got the same error: "ValueError: x_max is less than or equal to x_min for bbox."

tugot17 commented 4 years ago

I face a similar issue using a Kaggle GPU. After the 1st epoch the process dies.

I tried using num_workers=0 but it doesn't help.

    def train_dataloader(self):
        # the hook must return the dataloader
        return faster_rcnn.train_dl(self.trainset, batch_size=self.batch_size, num_workers=0, shuffle=True)

I use pytorch-lightning 0.9.1rc1 and I train using the DataModule class from the newest pl version.
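
For reference, a minimal sketch of such a DataModule (class and field names are hypothetical; it assumes the same faster_rcnn dataloader helpers used above):

```
import pytorch_lightning as pl
from icevision.all import *  # provides faster_rcnn.train_dl / valid_dl

class DetectionDataModule(pl.LightningDataModule):
    def __init__(self, trainset, validset, batch_size=16):
        super().__init__()
        self.trainset = trainset
        self.validset = validset
        self.batch_size = batch_size

    def train_dataloader(self):
        return faster_rcnn.train_dl(self.trainset, batch_size=self.batch_size,
                                    num_workers=0, shuffle=True)

    def val_dataloader(self):
        return faster_rcnn.valid_dl(self.validset, batch_size=self.batch_size,
                                    num_workers=0, shuffle=False)
```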

oke-aditya commented 4 years ago

@lgvaz @ai-fast-track, this might need a fix.

lgvaz commented 4 years ago

@tugot17 what was the error that popped up when you used num_workers=0? I'm guessing your error might be different from the one @arhamzamindar is describing.

lgvaz commented 4 years ago

@arhamzamindar that error is a simple assertion to make sure the bbox is valid; if you're still getting it, it means there is still at least one sample where the values are incorrect.

Could you post the full error you got in the last run?

Also, when posting code, don't forget to surround it with ``` for highlighting.

arhamzamindar commented 4 years ago

@lgvaz yes, so this is the error I get when running lr_find():


```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-16-c232684d32d4> in <module>
      1 learn.freeze()
----> 2 learn.lr_find()

~/anaconda3/envs/mantis/lib/python3.8/site-packages/fastai2/callback/schedule.py in lr_find(self, start_lr, end_lr, num_it, stop_div, show_plot, suggestions)
    226     n_epoch = num_it//len(self.dls.train) + 1
    227     cb=LRFinder(start_lr=start_lr, end_lr=end_lr, num_it=num_it, stop_div=stop_div)
--> 228     with self.no_logging(): self.fit(n_epoch, cbs=cb)
    229     if show_plot: self.recorder.plot_lr_find()
    230     if suggestions:

~/anaconda3/envs/mantis/lib/python3.8/site-packages/fastcore/utils.py in _f(*args, **kwargs)
    450         init_args.update(log)
    451         setattr(inst, 'init_args', init_args)
--> 452         return inst if to_return else f(*args, **kwargs)
    453     return _f
    454 

~/anaconda3/envs/mantis/lib/python3.8/site-packages/fastai2/learner.py in fit(self, n_epoch, lr, wd, cbs, reset_opt)
    203                     try:
    204                         self.epoch=epoch;          self('begin_epoch')
--> 205                         self._do_epoch_train()
    206                         self._do_epoch_validate()
    207                     except CancelEpochException:   self('after_cancel_epoch')

~/anaconda3/envs/mantis/lib/python3.8/site-packages/fastai2/learner.py in _do_epoch_train(self)
    175         try:
    176             self.dl = self.dls.train;                        self('begin_train')
--> 177             self.all_batches()
    178         except CancelTrainException:                         self('after_cancel_train')
    179         finally:                                             self('after_train')

~/anaconda3/envs/mantis/lib/python3.8/site-packages/fastai2/learner.py in all_batches(self)
    153     def all_batches(self):
    154         self.n_iter = len(self.dl)
--> 155         for o in enumerate(self.dl): self.one_batch(*o)
    156 
    157     def one_batch(self, i, b):

~/anaconda3/envs/mantis/lib/python3.8/site-packages/fastai2/data/load.py in __iter__(self)
     96         self.randomize()
     97         self.before_iter()
---> 98         for b in _loaders[self.fake_l.num_workers==0](self.fake_l):
     99             if self.device is not None: b = to_device(b, self.device)
    100             yield self.after_batch(b)

~/anaconda3/envs/mantis/lib/python3.8/site-packages/torch/utils/data/dataloader.py in __next__(self)
    361 
    362     def __next__(self):
--> 363         data = self._next_data()
    364         self._num_yielded += 1
    365         if self._dataset_kind == _DatasetKind.Iterable and \

~/anaconda3/envs/mantis/lib/python3.8/site-packages/torch/utils/data/dataloader.py in _next_data(self)
    401     def _next_data(self):
    402         index = self._next_index()  # may raise StopIteration
--> 403         data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
    404         if self._pin_memory:
    405             data = _utils.pin_memory.pin_memory(data)

~/anaconda3/envs/mantis/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py in fetch(self, possibly_batched_index)
     32                 raise StopIteration
     33         else:
---> 34             data = next(self.dataset_iter)
     35         return self.collate_fn(data)
     36 

~/anaconda3/envs/mantis/lib/python3.8/site-packages/fastai2/data/load.py in create_batches(self, samps)
    105         self.it = iter(self.dataset) if self.dataset is not None else None
    106         res = filter(lambda o:o is not None, map(self.do_item, samps))
--> 107         yield from map(self.do_batch, self.chunkify(res))
    108 
    109     def new(self, dataset=None, cls=None, **kwargs):

~/anaconda3/envs/mantis/lib/python3.8/site-packages/fastcore/utils.py in chunked(it, chunk_sz, drop_last, n_chunks)
    297     if not isinstance(it, Iterator): it = iter(it)
    298     while True:
--> 299         res = list(itertools.islice(it, chunk_sz))
    300         if res and (len(res)==chunk_sz or not drop_last): yield res
    301         if len(res)<chunk_sz: return

~/anaconda3/envs/mantis/lib/python3.8/site-packages/fastai2/data/load.py in do_item(self, s)
    118     def prebatched(self): return self.bs is None
    119     def do_item(self, s):
--> 120         try: return self.after_item(self.create_item(s))
    121         except SkipItemException: return None
    122     def chunkify(self, b): return b if self.prebatched else chunked(b, self.bs, self.drop_last)

~/anaconda3/envs/mantis/lib/python3.8/site-packages/fastai2/data/load.py in create_item(self, s)
    124     def randomize(self): self.rng = random.Random(self.rng.randint(0,2**32-1))
    125     def retain(self, res, b):  return retain_types(res, b[0] if is_listy(b) else b)
--> 126     def create_item(self, s):  return next(self.it) if s is None else self.dataset[s]
    127     def create_batch(self, b): return (fa_collate,fa_convert)[self.prebatched](b)
    128     def do_batch(self, b): return self.retain(self.create_batch(self.before_batch(b)), b)

~/anaconda3/envs/mantis/lib/python3.8/site-packages/mantisshrimp/data/dataset.py in __getitem__(self, i)
     33         data = self.prepare_record(self.records[i])
     34         if self.tfm is not None:
---> 35             data = self.tfm(data)
     36         return data
     37 

~/anaconda3/envs/mantis/lib/python3.8/site-packages/mantisshrimp/tfms/transform.py in __call__(self, data)
     11     def __call__(self, data: dict):
     12         data = data.copy()
---> 13         tfmed = self.apply(**data)
     14         data.update(tfmed)
     15         return data

~/anaconda3/envs/mantis/lib/python3.8/site-packages/mantisshrimp/tfms/albumentations/tfms.py in apply(self, img, labels, bboxes, masks, iscrowds, **kwargs)
    108             params["masks"] = masks.data
    109 
--> 110         d = self.tfms(**params)
    111 
    112         out = {"img": d["image"]}

~/anaconda3/envs/mantis/lib/python3.8/site-packages/albumentations/core/composition.py in __call__(self, force_apply, **data)
    172             if dual_start_end is not None and idx == dual_start_end[0]:
    173                 for p in self.processors.values():
--> 174                     p.preprocess(data)
    175 
    176             data = t(force_apply=force_apply, **data)

~/anaconda3/envs/mantis/lib/python3.8/site-packages/albumentations/core/utils.py in preprocess(self, data)
     60         rows, cols = data["image"].shape[:2]
     61         for data_name in self.data_fields:
---> 62             data[data_name] = self.check_and_convert(data[data_name], rows, cols, direction="to")
     63 
     64     def check_and_convert(self, data, rows, cols, direction="to"):

~/anaconda3/envs/mantis/lib/python3.8/site-packages/albumentations/core/utils.py in check_and_convert(self, data, rows, cols, direction)
     68 
     69         if direction == "to":
---> 70             return self.convert_to_albumentations(data, rows, cols)
     71 
     72         return self.convert_from_albumentations(data, rows, cols)

~/anaconda3/envs/mantis/lib/python3.8/site-packages/albumentations/augmentations/bbox_utils.py in convert_to_albumentations(self, data, rows, cols)
     49 
     50     def convert_to_albumentations(self, data, rows, cols):
---> 51         return convert_bboxes_to_albumentations(data, self.params.format, rows, cols, check_validity=True)
     52 
     53 

~/anaconda3/envs/mantis/lib/python3.8/site-packages/albumentations/augmentations/bbox_utils.py in convert_bboxes_to_albumentations(bboxes, source_format, rows, cols, check_validity)
    301     """Convert a list bounding boxes from a format specified in `source_format` to the format used by albumentations
    302     """
--> 303     return [convert_bbox_to_albumentations(bbox, source_format, rows, cols, check_validity) for bbox in bboxes]
    304 
    305 

~/anaconda3/envs/mantis/lib/python3.8/site-packages/albumentations/augmentations/bbox_utils.py in <listcomp>(.0)
    301     """Convert a list bounding boxes from a format specified in `source_format` to the format used by albumentations
    302     """
--> 303     return [convert_bbox_to_albumentations(bbox, source_format, rows, cols, check_validity) for bbox in bboxes]
    304 
    305 

~/anaconda3/envs/mantis/lib/python3.8/site-packages/albumentations/augmentations/bbox_utils.py in convert_bbox_to_albumentations(bbox, source_format, rows, cols, check_validity)
    249     bbox = normalize_bbox(bbox, rows, cols)
    250     if check_validity:
--> 251         check_bbox(bbox)
    252     return bbox
    253 

~/anaconda3/envs/mantis/lib/python3.8/site-packages/albumentations/augmentations/bbox_utils.py in check_bbox(bbox)
    332     x_min, y_min, x_max, y_max = bbox[:4]
    333     if x_max <= x_min:
--> 334         raise ValueError("x_max is less than or equal to x_min for bbox {bbox}.".format(bbox=bbox))
    335     if y_max <= y_min:
    336         raise ValueError("y_max is less than or equal to y_min for bbox {bbox}.".format(bbox=bbox))

ValueError: x_max is less than or equal to x_min for bbox (0.5602579604997985, 0.6431014823261118, 0.09875050382910117, 0.6080387685290763, 1).
```

lgvaz commented 4 years ago

The error is very clear @arhamzamindar, and notice that it's being thrown by albumentations: x_min is 0.56 while x_max is 0.098.

Your data is either invalid or you're misunderstanding its format.

arhamzamindar commented 4 years ago

> The error is very clear @arhamzamindar, and notice that it's being thrown by albumentations: x_min is 0.56 while x_max is 0.098.
>
> Your data is either invalid or you're misunderstanding its format.

@lgvaz I am adding other errors I encountered with the same code:

ValueError: y_max is less than or equal to y_min for bbox (0.07900040306328093, 0.6743655546050755, 0.9729947601773479, 0.0290846877673225, 1).
ValueError: y_max is less than or equal to y_min for bbox (0.07496977025392987, 0.5493158494868872, 0.9447803305118904, 0.38683010262257694, 1).

I received these after removing the files where the minimum was greater than the maximum for both x and y. Also, can you guide me on how to change my data (probably the XMLs) so I can use mantisshrimp? This is the best library for object detection and I don't want to opt for another one.

lgvaz commented 4 years ago

I have the impression that your dataset might be in the (x, y, width, height) format, is that a possibility?

oke-aditya commented 4 years ago

High chance. EfficientDet most probably takes the (x, y, w, h) input format, and the albumentations transforms should be applied to it accordingly.

lgvaz commented 4 years ago

@oke-aditya it does not depend on the model, though; the library internally figures out the right format to pass to the models. What I think is wrong is his parser.

If I understood correctly, he's using the VOC parser, which expects bboxes to be in xyxy, but maybe his dataset is xywh?
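
A quick numeric illustration of that hypothesis (the values are made up, and the `.xyxy` attribute name is an assumption; `BBox.from_xywh` is referenced later in this thread):

```
from mantisshrimp.all import *

# Suppose the file stores x, y, width, height = 100, 50, 30, 20.
# Read as (xmin, ymin, xmax, ymax) this gives xmax=30 < xmin=100,
# which is exactly the assertion albumentations raises.
bbox = BBox.from_xywh(100, 50, 30, 20)
print(bbox.xyxy)  # (100, 50, 130, 70): the valid xyxy equivalent
```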

arhamzamindar commented 4 years ago

> I have the impression that your dataset might be in the (x, y, width, height) format, is that a possibility?

YES, this may be a possibility, because we did the annotation on Dataturks, which provides output in JSON format with 8 coordinates. My colleague took the 1st, 4th, 5th, and 6th coordinates because the bbox was drawn with those coordinates. We also tried training with fastai, but it rejected most of our images and only ran on a few of them (the reason for that is not known). I am trying to edit the XML files to sort out the issue.
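
If the JSON stores four corner points, a more robust conversion than picking fixed indices is to take the min/max over all corners; a small sketch (the exact JSON layout is an assumption):

```
def corners_to_xyxy(points):
    """points: list of (x, y) corner tuples from the annotation tool."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)

# Works regardless of corner order:
print(corners_to_xyxy([(110, 80), (10, 20), (110, 20), (10, 80)]))  # (10, 20, 110, 80)
```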

lgvaz commented 4 years ago

Keep in mind that if that is the case, you don't need to edit your data; you only need to subclass VocXmlParser and override the bboxes method to fit your data.

You would probably change something here (parsers/voc_parser.py):

    def bboxes(self, o) -> List[BBox]:
        def to_int(x):
            return int(float(x))

        bboxes = []
        for object in self._root.iter("object"):
            xml_bbox = object.find("bndbox")
            xmin = to_int(xml_bbox.find("xmin").text)
            ymin = to_int(xml_bbox.find("ymin").text)
            xmax = to_int(xml_bbox.find("xmax").text)
            ymax = to_int(xml_bbox.find("ymax").text)

            bbox = BBox.from_xyxy(xmin, ymin, xmax, ymax)
            bboxes.append(bbox)

        return bboxes
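
If the dataset is indeed xywh stored under the usual VOC tag names, a minimal sketch of such a subclass could look like this (the class name and the tag reinterpretation are assumptions about the data):

    class XYWHVocParser(parsers.VocXmlParser):
        def bboxes(self, o) -> List[BBox]:
            def to_int(x):
                return int(float(x))

            bboxes = []
            for object in self._root.iter("object"):
                xml_bbox = object.find("bndbox")
                # Assumed: the stored values are really x, y, width, height,
                # even though the tags are named xmin/ymin/xmax/ymax.
                x = to_int(xml_bbox.find("xmin").text)
                y = to_int(xml_bbox.find("ymin").text)
                w = to_int(xml_bbox.find("xmax").text)
                h = to_int(xml_bbox.find("ymax").text)
                bboxes.append(BBox.from_xywh(x, y, w, h))

            return bboxes
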
arhamzamindar commented 4 years ago

@lgvaz I changed my XML files, but now I get this error in the data_splitter step:


```
<ipython-input-20-d5e78574de5f> in <module>
      1 data_splitter = RandomSplitter((.8, .2))
----> 2 train_records, valid_records = parser.parse(data_splitter)

~/anaconda3/envs/mantis/lib/python3.8/site-packages/mantisshrimp/parsers/parser.py in parse(self, data_splitter, idmap, show_pbar)
     85         idmap = idmap or IDMap()
     86         data_splitter = data_splitter or SingleSplitSplitter()
---> 87         records = self.parse_dicted(show_pbar=show_pbar, idmap=idmap)
     88         splits = data_splitter(idmap=idmap)
     89         return [[{"imageid": id, **records[id]} for id in ids] for ids in splits]

~/anaconda3/envs/mantis/lib/python3.8/site-packages/mantisshrimp/parsers/parser.py in parse_dicted(self, idmap, show_pbar)
     56 
     57             for name, func in annotation_parse_funcs.items():
---> 58                 records[imageid][name].extend(func(sample))
     59 
     60         # check that all annotations have the same length

~/anaconda3/envs/mantis/lib/python3.8/site-packages/mantisshrimp/parsers/voc_parser.py in labels(self, o)
     67         labels = []
     68         for object in self._root.iter("object"):
---> 69             label = object.find("name").text
     70             label_id = self.class_map.get_name(label)
     71             labels.append(label_id)

AttributeError: 'NoneType' object has no attribute 'text'
```

PLEASE HELP
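
The traceback shows that some `<object>` element has no `<name>` child, so find("name") returns None. A quick scan to locate the offending file (a sketch, assuming VOC-style XMLs; the directory path is from the code above):

```
import xml.etree.ElementTree as ET
from pathlib import Path

annotations_dir = Path("data/ann")  # adjust to your annotations directory
for xml_file in annotations_dir.glob("*.xml"):
    root = ET.parse(xml_file).getroot()
    for obj in root.iter("object"):
        if obj.find("name") is None:
            print(f"{xml_file.name}: <object> without a <name> tag")
```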
MohiteAkshay commented 4 years ago

I am a newbie and maybe this is a noob query. I read the thread and checked my xmin, ymin, xmax, ymax: with BBox.from_xywh the boxes don't fit the image objects, whereas with BBox.from_xyxy the bounding boxes in show_samples fit the image objects perfectly. The first time I ran lr_find it threw an error (pasted below). I ran the code again without changing anything and lr_find ran, but then fine_tune threw the same error. I attached the code. (I am running the whole thing on Colab.)

Dataset: https://github.com/gulvarol/grocerydataset

grocery2.pdf

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-25-c232684d32d4> in <module>()
      1 learn.freeze()
----> 2 learn.lr_find()

9 frames
/usr/local/lib/python3.6/dist-packages/torch/_utils.py in reraise(self)
    393             # (https://bugs.python.org/issue2651), so we work around it.
    394             msg = KeyErrorMessage(msg)
--> 395         raise self.exc_type(msg)

ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 185, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 34, in fetch
    data = next(self.dataset_iter)
  File "/usr/local/lib/python3.6/dist-packages/fastai2/data/load.py", line 107, in create_batches
    yield from map(self.do_batch, self.chunkify(res))
  File "/usr/local/lib/python3.6/dist-packages/fastcore/utils.py", line 299, in chunked
    res = list(itertools.islice(it, chunk_sz))
  File "/usr/local/lib/python3.6/dist-packages/fastai2/data/load.py", line 120, in do_item
    try: return self.after_item(self.create_item(s))
  File "/usr/local/lib/python3.6/dist-packages/fastai2/data/load.py", line 126, in create_item
    def create_item(self, s):  return next(self.it) if s is None else self.dataset[s]
  File "/usr/local/lib/python3.6/dist-packages/icevision/data/dataset.py", line 38, in __getitem__
    data = self.tfm(data)
  File "/usr/local/lib/python3.6/dist-packages/icevision/tfms/transform.py", line 13, in __call__
    tfmed = self.apply(**data)
  File "/usr/local/lib/python3.6/dist-packages/icevision/tfms/albumentations/tfms.py", line 110, in apply
    d = self.tfms(**params)
  File "/usr/local/lib/python3.6/dist-packages/albumentations/core/composition.py", line 174, in __call__
    p.preprocess(data)
  File "/usr/local/lib/python3.6/dist-packages/albumentations/core/utils.py", line 62, in preprocess
    data[data_name] = self.check_and_convert(data[data_name], rows, cols, direction="to")
  File "/usr/local/lib/python3.6/dist-packages/albumentations/core/utils.py", line 70, in check_and_convert
    return self.convert_to_albumentations(data, rows, cols)
  File "/usr/local/lib/python3.6/dist-packages/albumentations/augmentations/bbox_utils.py", line 51, in convert_to_albumentations
    return convert_bboxes_to_albumentations(data, self.params.format, rows, cols, check_validity=True)
  File "/usr/local/lib/python3.6/dist-packages/albumentations/augmentations/bbox_utils.py", line 303, in convert_bboxes_to_albumentations
    return [convert_bbox_to_albumentations(bbox, source_format, rows, cols, check_validity) for bbox in bboxes]
  File "/usr/local/lib/python3.6/dist-packages/albumentations/augmentations/bbox_utils.py", line 303, in <listcomp>
    return [convert_bbox_to_albumentations(bbox, source_format, rows, cols, check_validity) for bbox in bboxes]
  File "/usr/local/lib/python3.6/dist-packages/albumentations/augmentations/bbox_utils.py", line 251, in convert_bbox_to_albumentations
    check_bbox(bbox)
  File "/usr/local/lib/python3.6/dist-packages/albumentations/augmentations/bbox_utils.py", line 330, in check_bbox
    "to be in the range [0.0, 1.0], got {value}.".format(bbox=bbox, name=name, value=value)
ValueError: Expected y_max for bbox (0.33458646616541354, 0.9720394736842105, 0.38721804511278196, 1.0674342105263157, 0) to be in the range [0.0, 1.0], got 1.0674342105263157.
```

oke-aditya commented 4 years ago

The reason for the error is that it expects normalized boxes; the values have to be between 0.0 and 1.0.

MohiteAkshay commented 4 years ago

I have used normalization in the transforms.

MohiteAkshay commented 4 years ago

> The reason for the error is that it expects normalized boxes; the values have to be between 0.0 and 1.0.

Also, I have attached the entire code. Please let me know where I am going wrong.

lgvaz commented 4 years ago

The error is a bit cryptic because it's internally being thrown by albumentations.

> The reason for the error is that it expects normalized boxes; the values have to be between 0.0 and 1.0.

To clarify the quote above: the error is not that icevision expects normalized bboxes; the normalization actually happens inside albumentations.

The cause of this error is that there is at least one bbox in your dataset with a coordinate that is bigger than the image dimensions.

What you will need to do is clip the bbox coordinates so they stay within the image dimensions.
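
A minimal sketch of such clipping (a hypothetical helper, not an icevision API), assuming xyxy boxes and known image width/height:

```
def clip_xyxy(xmin, ymin, xmax, ymax, img_w, img_h):
    """Clamp box coordinates so the box stays inside the image."""
    xmin = max(0, min(xmin, img_w - 1))
    ymin = max(0, min(ymin, img_h - 1))
    xmax = max(xmin + 1, min(xmax, img_w))
    ymax = max(ymin + 1, min(ymax, img_h))
    return xmin, ymin, xmax, ymax

# e.g. a box hanging below a 304-pixel-tall image gets pulled back in:
print(clip_xyxy(50, 290, 120, 324, img_w=640, img_h=304))  # (50, 290, 120, 304)
```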



Note: This is not related to the current error, but the fields width and height on the parser are supposed to be the image dimensions; #455 was opened for clarification.

lgvaz commented 4 years ago

The issue got stale, so I believe it's solved. I'll be closing it.

gdineshk6174 commented 3 years ago

> I am a newbie and maybe this is a noob query. [...] The first time I ran lr_find it threw an error (pasted below). [...]
>
> ValueError: Expected y_max for bbox (0.33458646616541354, 0.9720394736842105, 0.38721804511278196, 1.0674342105263157, 0) to be in the range [0.0, 1.0], got 1.0674342105263157.

There are some corrupt images in the grocery dataset, somewhere between image_id 8000 and 8500 in the train folder; drop those.
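
A quick way to find such files is to let PIL try to verify every image; a sketch (the folder path is an assumption):

```
from pathlib import Path
from PIL import Image

for img_path in sorted(Path("train").glob("*.jpg")):
    try:
        with Image.open(img_path) as im:
            im.verify()  # raises if the file is corrupt or truncated
    except Exception as e:
        print(f"{img_path}: {e}")
```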