microsoft / aerial_wildlife_detection

Tools for detecting wildlife in aerial images using active learning
MIT License

Detectron error #59

Open VLucet opened 1 year ago

VLucet commented 1 year ago

Hi again! I managed to install AIDE, and it's looking great, but I'm running into an issue. I uploaded about 200 images to test the auto-training feature and started labelling with bounding boxes.

Training then fails with the following error:

[2022-09-21 13:38:36,671: WARNING/ForkPoolWorker-29] Assembled training images into 1 chunks (length of first: 128)
[2022-09-21 13:38:37,390: WARNING/ForkPoolWorker-28] [TP] Updating model to incorporate potentially new label classes...
[2022-09-21 13:38:37,402: WARNING/ForkPoolWorker-28] [TP] Model auto-update disabled; skipping...
[2022-09-21 13:38:38,222: WARNING/ForkPoolWorker-31] [TP] Epoch 1: Initiated training...
[2022-09-21 13:38:41,804: WARNING/ForkPoolWorker-31] WARNING: encountered unknown label classes: e75a562b-39ce-11ed-a5c4-d7f10ba73e16, e75a562a-39ce-11ed-a5c4-d7f10ba73e16, 3a0ec21b-39d0-11ed-a5c4-d7f10ba73e16, e75a562c-39ce-11ed-a5c4-d7f10ba73e16, e75a562d-39ce-11ed-a5c4-d7f10ba73e16
[2022-09-21 13:38:41,804: WARNING/ForkPoolWorker-31] need at least one array to concatenate
[2022-09-21 13:38:41,807: ERROR/ForkPoolWorker-31] Task AIWorker.call_train[bf756515-ecf8-4258-908e-981e3dd892f7] raised unexpected: Exception('[Epoch 1] error during training (reason: need at least one array to concatenate)')
Traceback (most recent call last):
  File "/home/vlucet/Documents/WILDLab/repos/AIDE/aerial_wildlife_detection/modules/AIWorker/backend/worker/functional.py", line 292, in _call_train
    result = modelInstance.train(stateDict=stateDict, data=data, updateStateFun=update_state)
  File "/home/vlucet/Documents/WILDLab/repos/AIDE/aerial_wildlife_detection/ai/models/detectron2/genericDetectronModel.py", line 447, in train
    dataLoader = build_detection_train_loader(
  File "/home/vlucet/Documents/WILDLab/repos/AIDE/AIDEenv/lib/python3.8/site-packages/detectron2/config/config.py", line 210, in wrapped
    return orig_func(*args, **kwargs)
  File "/home/vlucet/Documents/WILDLab/repos/AIDE/AIDEenv/lib/python3.8/site-packages/detectron2/data/build.py", line 422, in build_detection_train_loader
    dataset = DatasetFromList(dataset, copy=False)
  File "/home/vlucet/Documents/WILDLab/repos/AIDE/AIDEenv/lib/python3.8/site-packages/detectron2/data/common.py", line 143, in __init__
    self._lst = np.concatenate(self._lst)
  File "<__array_function__ internals>", line 180, in concatenate
ValueError: need at least one array to concatenate

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/vlucet/Documents/WILDLab/repos/AIDE/AIDEenv/lib/python3.8/site-packages/celery/app/trace.py", line 451, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/home/vlucet/Documents/WILDLab/repos/AIDE/AIDEenv/lib/python3.8/site-packages/celery/app/trace.py", line 734, in __protected_call__
    return self.run(*args, **kwargs)
  File "/home/vlucet/Documents/WILDLab/repos/AIDE/aerial_wildlife_detection/modules/AIWorker/backend/celery_interface.py", line 41, in call_train
    return worker.call_train(data[index], epoch, numEpochs, project, is_subset, aiModelSettings)
  File "/home/vlucet/Documents/WILDLab/repos/AIDE/aerial_wildlife_detection/modules/AIWorker/app.py", line 216, in call_train
    return functional._call_train(project, data, epoch, numEpochs, subset, modelInstance, modelLibrary,
  File "/home/vlucet/Documents/WILDLab/repos/AIDE/aerial_wildlife_detection/modules/AIWorker/backend/worker/functional.py", line 295, in _call_train
    raise Exception(f'[Epoch {epoch}] error during training (reason: {str(e)})')
Exception: [Epoch 1] error during training (reason: need at least one array to concatenate)

cstldrones commented 1 year ago

I am also receiving this error. I've imported about 600 images and labeled about 150 of them with bounding boxes.

bkellenb commented 1 year ago

Hi! Thanks for raising the issue. This line:

[2022-09-21 13:38:41,804: WARNING/ForkPoolWorker-31] WARNING: encountered unknown label classes: e75a562b-39ce-11ed-a5c4-d7f10ba73e16, e75a562a-39ce-11ed-a5c4-d7f10ba73e16, 3a0ec21b-39d0-11ed-a5c4-d7f10ba73e16, e75a562c-39ce-11ed-a5c4-d7f10ba73e16, e75a562d-39ce-11ed-a5c4-d7f10ba73e16

indicates that the model has not been adapted to work with the label classes. When you open the Model Marketplace and add a model to the project, you will eventually see a window that allows you to establish a mapping between the classes the model has been trained on and the ones you created in your annotation project. Here's an example with the MS-COCO classes (left list) and two classes ("Human", "Vehicle") in the project:

[Screenshot: class-mapping window, 2022-10-07]

Any training images whose annotated label classes have not been mapped to the model's will be discarded during training. In your case this resulted in an empty list of images, hence the error.
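
To make the failure mode concrete, here is a minimal sketch of the filtering described above. It is not AIDE's actual code: the function name `select_trainable_images`, the dictionary layout, and the rule that an image is kept only when all of its annotated classes are mapped are assumptions for illustration. It raises a descriptive error when nothing is left, instead of letting an empty list reach the data loader.

```python
from typing import Dict, List


def select_trainable_images(images: List[dict],
                            class_mapping: Dict[str, str]) -> List[dict]:
    """Hypothetical helper: keep images whose label classes are all mapped.

    `images` is assumed to be a list of dicts with an "annotations" list,
    each annotation carrying a "label_class" ID (project-side class UUID).
    `class_mapping` maps project class IDs to model class names.
    """
    trainable = []
    for image in images:
        labels = {ann["label_class"] for ann in image["annotations"]}
        # Assumption: images without annotations, or with any unmapped
        # class, are discarded.
        if labels and labels.issubset(class_mapping):
            trainable.append(image)

    if not trainable:
        # Fail early with a readable message rather than passing an
        # empty dataset on to the training loader.
        raise ValueError(
            "no training images left after applying the class mapping; "
            "map the project's label classes to the model's classes first"
        )
    return trainable


# Placeholder class IDs, not the UUIDs from the log above.
images = [
    {"id": "img-1", "annotations": [{"label_class": "class-a"}]},
    {"id": "img-2", "annotations": [{"label_class": "class-b"}]},
]
kept = select_trainable_images(images, {"class-a": "person"})
print([image["id"] for image in kept])  # prints ['img-1']
```

With an empty mapping, as in the situation reported here, the same call raises the `ValueError` because every image is discarded.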

You can do this step again as follows:

  1. Go to the project configuration page > "AI model" > "Settings" (URL: <project>/configuration/aimodel)
  2. You should see a table with all model states in the project. Tick them all, then click "Delete selected". Wait until the confirmation message appears.
  3. Go to the Model Marketplace (URL: <project>/configuration/modelmarketplace), re-select the model of choice, click "Add to Project". Wait until done; that pop-up in the image above should appear. Assign project-to-model classes with the drop-down menus, then click "Save".
  4. Retry training (e.g., via the Workflow Designer)

I hope this helps! Otherwise let me know and we can try and debug it further.