roboflow / notebooks

Examples and tutorials on using SOTA computer vision models and techniques. Learn everything from old-school ResNet, through YOLO and object-detection transformers like DETR, to the latest models like Grounding DINO and SAM.
https://roboflow.com/models

YOLOv8 Training Cell Error Encountered TypeError: unhashable type: 'numpy.ndarray' #317

Open DavidGuamanDavila opened 1 week ago

DavidGuamanDavila commented 1 week ago

Notebook name

The notebook I am facing this issue with is the YOLOv8 Training Notebook

Bug

When executing the following cell:

[Screenshot: 2024-10-04 at 10 47 19 PM]

The following error occurs:

Transferred 349/355 items from pretrained weights
TensorBoard: Start with 'tensorboard --logdir runs/detect/train', view at http://localhost:6006/
Freezing layer 'model.22.dfl.conv.weight'
AMP: running Automatic Mixed Precision (AMP) checks with YOLOv8n...
/usr/local/lib/python3.10/dist-packages/ultralytics/nn/tasks.py:567: FutureWarning: You are using torch.load with weights_only=False (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for weights_only will be flipped to True. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via torch.serialization.add_safe_globals. We recommend you start setting weights_only=True for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  return torch.load(file, map_location='cpu'), file  # load
/usr/local/lib/python3.10/dist-packages/ultralytics/utils/checks.py:558: FutureWarning: torch.cuda.amp.autocast(args...) is deprecated. Please use torch.amp.autocast('cuda', args...) instead.
  with torch.cuda.amp.autocast(True):
AMP: checks passed ✅
/usr/local/lib/python3.10/dist-packages/ultralytics/engine/trainer.py:238: FutureWarning: torch.cuda.amp.GradScaler(args...) is deprecated. Please use torch.amp.GradScaler('cuda', args...) instead.
  self.scaler = amp.GradScaler(enabled=self.amp)
train: Scanning /content/datasets/Roboboat-2024-Marine-Markers-3/train/labels... 320 images, 92 backgrounds, 0 corrupt: 100% 320/320 [00:00<00:00, 1925.59it/s]
train: New cache created: /content/datasets/Roboboat-2024-Marine-Markers-3/train/labels.cache
/usr/local/lib/python3.10/dist-packages/albumentations/__init__.py:13: UserWarning: A new version of Albumentations is available: 1.4.17 (you have 1.4.15). Upgrade using: pip install -U albumentations. To disable automatic update checks, set the environment variable NO_ALBUMENTATIONS_UPDATE to 1.
  check_for_updates()
/usr/local/lib/python3.10/dist-packages/albumentations/core/composition.py:191: UserWarning: Got processor for bboxes, but no transform to process it.
  self._set_keys()
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01, num_output_channels=3, method='weighted_average'), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8))
val: Scanning /content/datasets/Roboboat-2024-Marine-Markers-3/valid/labels... 89 images, 33 backgrounds, 0 corrupt: 100% 89/89 [00:00<00:00, 1557.45it/s]
val: New cache created: /content/datasets/Roboboat-2024-Marine-Markers-3/valid/labels.cache
Plotting labels to runs/detect/train/labels.jpg... 
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
optimizer: AdamW(lr=0.000714, momentum=0.9) with parameter groups 57 weight(decay=0.0), 64 weight(decay=0.0005), 63 bias(decay=0.0)
Image sizes 800 train, 800 val
Using 2 dataloader workers
Logging results to runs/detect/train
Starting training for 600 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
  0% 0/20 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/usr/local/bin/yolo", line 8, in <module>
    sys.exit(entrypoint())
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/cfg/__init__.py", line 445, in entrypoint
    getattr(model, mode)(**overrides)  # default args from model
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/engine/model.py", line 341, in train
    self.trainer.train()
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/engine/trainer.py", line 191, in train
    self._do_train(world_size)
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/engine/trainer.py", line 325, in _do_train
    for i, batch in pbar:
  File "/usr/local/lib/python3.10/dist-packages/tqdm/std.py", line 1181, in __iter__
    for obj in iterable:
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/data/build.py", line 42, in __iter__
    yield next(self.iterator)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1344, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1370, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.10/dist-packages/torch/_utils.py", line 706, in reraise
    raise exception
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/worker.py", line 309, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 52, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/data/base.py", line 242, in __getitem__
    return self.transforms(self.get_image_and_label(index))
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/data/augment.py", line 70, in __call__
    data = t(data)
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/data/augment.py", line 824, in __call__
    new = self.transform(image=im, bboxes=bboxes, class_labels=cls)  # transformed
  File "/usr/local/lib/python3.10/dist-packages/albumentations/core/composition.py", line 334, in __call__
    self.preprocess(data)
  File "/usr/local/lib/python3.10/dist-packages/albumentations/core/composition.py", line 368, in preprocess
    p.preprocess(data)
  File "/usr/local/lib/python3.10/dist-packages/albumentations/core/utils.py", line 125, in preprocess
    data = self.add_label_fields_to_data(data)
  File "/usr/local/lib/python3.10/dist-packages/albumentations/core/utils.py", line 185, in add_label_fields_to_data
    encoded_labels = encoder.fit_transform(data[label_field])
  File "/usr/local/lib/python3.10/dist-packages/albumentations/core/utils.py", line 60, in fit_transform
    self.fit(y)
  File "/usr/local/lib/python3.10/dist-packages/albumentations/core/utils.py", line 48, in fit
    unique_labels = sorted(set(y))
TypeError: unhashable type: 'numpy.ndarray'

The output ends with this TypeError:

TypeError: unhashable type: 'numpy.ndarray'

After this, the Custom Training cell terminates.
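
The last frame of the traceback, sorted(set(y)), can be reproduced in isolation. The sketch below is only an illustration: it assumes the class labels reach the encoder as a 2-D NumPy array of shape (n, 1), which the traceback suggests but this thread does not confirm. The function name unique_labels and the sample values are hypothetical.

```python
import numpy as np

# Assumption: labels arrive as a column vector, one row per bounding box.
cls_2d = np.array([[0.0], [2.0], [0.0]])  # shape (3, 1)


def unique_labels(y):
    """Mirror the failing expression from the traceback: sorted(set(y))."""
    return sorted(set(y))


# Iterating a 2-D array yields 1-D arrays, and ndarrays are not hashable,
# so building a set from them raises TypeError.
try:
    unique_labels(cls_2d)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Flattening to plain Python floats makes every element hashable.
flat_labels = unique_labels(cls_2d.ravel().tolist())

print(error_message)
print(flat_labels)
```

If the assumption holds, the failure comes from the shape of the label array rather than from the dataset itself, which would be consistent with the error appearing on the very first batch.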

Minimal Reproducible Example

To reproduce this issue, run the cells of the YOLOv8 Colab Training Notebook from the Roboflow website in order until you reach the Custom Training cell:

Custom Training

%cd {HOME}

!yolo task=detect mode=train model=yolov8s.pt data={dataset.location}/data.yaml epochs=550 imgsz=800 plots=True

Training does not proceed as it did in previous sessions I completed with the same YOLOv8 notebook. Instead, the error shown above appears and terminates the cell.

Additional

I fixed it by running !pip install albumentations==1.4 during setup, before the Custom Training section. This solution worked for other users in a Roboflow Discussion, so I think it would be beneficial to update the notebook with this change.
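
A quick way to confirm that the pin took effect before starting a long training run is to check the installed version from a notebook cell. This is a minimal sketch, not part of the notebook: the helper names are hypothetical, the parser only handles plain numeric versions such as 1.4 or 1.4.15, and the only data points from this issue are that 1.4.15 failed and the 1.4 pin worked.

```python
from importlib import metadata

PINNED = "1.4"  # version reported as working in this issue


def parse_version(version):
    """Turn '1.4' or '1.4.15' into a 3-tuple of ints, padding with zeros."""
    parts = [int(p) for p in version.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def matches_pin(installed, pinned=PINNED):
    """Return True when the installed version equals the pinned one."""
    return parse_version(installed) == parse_version(pinned)


def installed_version(package="albumentations"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


current = installed_version()
if current is None:
    print("albumentations is not installed")
elif matches_pin(current):
    print(f"albumentations {current} matches the pin")
else:
    print(f"albumentations {current} differs from the pinned {PINNED}")
```

In Colab, a package that was already imported keeps its old version in memory, so the runtime may need a restart after the pip install for the check to reflect what training will actually use.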

kingfahad2850 commented 4 days ago

Yes, I am also facing the same issue. I have been trying to fix it for almost three days, but to no avail.

LinasKo commented 4 days ago

I'm on it. Expect a fix tonight.

kingfahad2850 commented 4 days ago

Running this in a Google Colab cell works perfectly: !pip install albumentations==1.4

LinasKo commented 4 days ago

It might depend on what you install first - ultralytics or albumentations. Did you try installing albumentations after ultralytics, when you're running locally?

DavidGuamanDavila commented 3 days ago

I was running the notebook in Google Colab. I installed albumentations after Ultralytics, and it worked: I was able to complete my training. I have not tried running the notebook locally.

LinasKo commented 3 days ago

Reopening, as the problem likely exists in more notebooks. I'll test it out more broadly.