Closed rileybolen closed 1 month ago
i checked the metadata from previous issue. it also seems to be missing `area`. so it will again fail at some point.
@abhishekkrthakur is this a bug or an issue with my data?
this is indeed, yet again :(, an issue that im looking into atm. but, your metadata.jsonl also seems to be missing "area" inside objects column. we need: area, bbox and category. an example dataset is here: https://huggingface.co/datasets/keremberke/license-plate-object-detection?row=1
@abhishekkrthakur Okay, thanks! I will add that to my data. Should the area just be the bounding box width*height? And I noticed that in this example dataset the category column is called `category` but in the documentation I am looking at it is called `categories`, which one is correct? https://huggingface.co/docs/autotrain/v0.7.104/object_detection
please follow the format in the link i provided. your category names are fine. they should be strings. area needs to be calculated from bboxes. the format should be coco. ill update the docs.
object detection was recently added. i apologize you faced so many issues. ive also fixed the latest one now.
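For the area question above, a minimal sketch in Python, assuming each `bbox` is COCO-style `[x, y, width, height]` and each metadata row carries an `objects` dict as discussed in this thread. `add_area` is a hypothetical helper for illustration, not part of AutoTrain.

```python
import json


def add_area(row):
    """Return a copy of one metadata row with objects["area"] filled in.

    Assumes COCO-style boxes: [x_min, y_min, width, height].
    """
    objects = dict(row["objects"])
    # Area of each box is simply width * height.
    objects["area"] = [w * h for _x, _y, w, h in objects["bbox"]]
    return {**row, "objects": objects}


# Illustrative row (made-up values, not taken from the reporter's data).
row = {
    "file_name": "example.jpg",
    "objects": {"bbox": [[10.0, 20.0, 30.0, 40.0]], "category": ["Face"]},
}
print(json.dumps(add_area(row)))
```

The helper returns a new row rather than mutating the input, so it can be applied line by line while rewriting a `metadata.jsonl` file.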
@abhishekkrthakur No problem! I realized I am missing the `id` as well, should that just be a unique integer for each bounding box?
id is not used, you can skip it :)
@abhishekkrthakur I noticed a new error, do you think this will be fixed when you merge your new changes or should I open a new issue for this?
Downloading data: 0%| | 0/802 [00:00<?, ?files/s]
Downloading data: 100%|██████████| 802/802 [00:00<00:00, 438341.39files/s]
Downloading data: 0%| | 0/203 [00:00<?, ?files/s]
Downloading data: 100%|██████████| 203/203 [00:00<00:00, 24914.96files/s]
Generating train split: 0 examples [00:00, ? examples/s]
Generating train split: 574 examples [00:00, 5724.92 examples/s]
Generating train split: 799 examples [00:00, 5898.87 examples/s]
Generating validation split: 0 examples [00:00, ? examples/s]
Generating validation split: 200 examples [00:00, 6323.82 examples/s]
Saving the dataset (0/1 shards): 0%| | 0/799 [00:00<?, ? examples/s]
Saving the dataset (0/1 shards): 100%|██████████| 799/799 [00:00<00:00, 5256.74 examples/s]
Saving the dataset (1/1 shards): 100%|██████████| 799/799 [00:00<00:00, 5256.74 examples/s]
Saving the dataset (1/1 shards): 100%|██████████| 799/799 [00:00<00:00, 5231.23 examples/s]
Saving the dataset (0/1 shards): 0%| | 0/200 [00:00<?, ? examples/s]
Saving the dataset (1/1 shards): 100%|██████████| 200/200 [00:00<00:00, 5590.54 examples/s]
Saving the dataset (1/1 shards): 100%|██████████| 200/200 [00:00<00:00, 5556.03 examples/s]
INFO | 2024-05-22 19:41:53 | autotrain.backends.local:create:8 - Starting local training...
INFO | 2024-05-22 19:41:53 | autotrain.commands:launch_command:372 - ['accelerate', 'launch', '--num_machines', '1', '--num_processes', '1', '--mixed_precision', 'fp16', '-m', 'autotrain.trainers.object_detection', '--training_config', 'autotrain-zny6w-3b288/training_params.json']
INFO | 2024-05-22 19:41:53 | autotrain.commands:launch_command:373 - {'data_path': 'autotrain-zny6w-3b288/autotrain-data', 'model': 'facebook/detr-resnet-101', 'username': 'rileybol', 'lr': 5e-05, 'epochs': 3, 'batch_size': 8, 'warmup_ratio': 0.1, 'gradient_accumulation': 1, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'train_split': 'train', 'valid_split': 'validation', 'logging_steps': -1, 'project_name': 'autotrain-zny6w-3b288', 'auto_find_batch_size': False, 'mixed_precision': 'fp16', 'save_total_limit': 1, 'token': '*****', 'push_to_hub': True, 'evaluation_strategy': 'epoch', 'image_column': 'autotrain_image', 'objects_column': 'autotrain_objects', 'log': 'tensorboard', 'image_square_size': 600, 'early_stopping_patience': 5, 'early_stopping_threshold': 0.01}
INFO | 2024-05-22 19:41:53 | autotrain.backends.local:create:13 - Training PID: 151
INFO: 10.16.41.118:34539 - "POST /ui/create_project HTTP/1.1" 200 OK
INFO: 10.16.41.118:2974 - "GET /ui/is_model_training HTTP/1.1" 200 OK
INFO: 10.16.15.199:19091 - "GET /ui/accelerators HTTP/1.1" 200 OK
The following values were not passed to `accelerate launch` and had defaults used instead:
`--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
INFO: 10.16.15.199:13209 - "GET /ui/is_model_training HTTP/1.1" 200 OK
INFO:matplotlib.font_manager:generated new fontManager
INFO | 2024-05-22 19:42:02 | __main__:train:83 - Train data: Dataset({
features: ['autotrain_image', 'autotrain_objects'],
num_rows: 799
})
INFO | 2024-05-22 19:42:02 | __main__:train:84 - Valid data: Dataset({
features: ['autotrain_image', 'autotrain_objects'],
num_rows: 200
})
ERROR | 2024-05-22 19:42:02 | autotrain.trainers.common:wrapper:120 - train has failed due to an exception: Traceback (most recent call last):
File "/app/env/lib/python3.10/site-packages/autotrain/trainers/common.py", line 117, in wrapper
return func(*args, **kwargs)
File "/app/env/lib/python3.10/site-packages/autotrain/trainers/object_detection/__main__.py", line 86, in train
categories = train_data.features[config.objects_column].feature["category"].names
AttributeError: 'dict' object has no attribute 'feature'
ERROR | 2024-05-22 19:42:02 | autotrain.trainers.common:wrapper:121 - 'dict' object has no attribute 'feature'
can you share few lines from new metadata?
@abhishekkrthakur
{"file_name": "S05E01_1185266.jpg", "objects": {"bbox": [[275.7393, 75.5485, 183.911, 180.0954]], "category": ["Face"], "area": [33121.525109400005]}}
{"file_name": "S06E23_1050098.jpg", "objects": {"bbox": [[188.744, 88.5215, 219.7774, 189.2528]], "category": ["Face"], "area": [41593.48832672]}}
{"file_name": "S06E23_315748.jpg", "objects": {"bbox": [[214.69, 237.3291, 225.8824, 181.6216], [472.6232, 281.5898, 161.7806, 123.6248], [57.4881, 194.5946, 136.5978, 154.1494]], "category": ["Face", "Face", "Face"], "area": [41025.12289984, 20000.094318879997, 21056.468911320004]}}
{"file_name": "S06E01_403252.jpg", "objects": {"bbox": [[316.9475, 45.0238, 135.0715, 186.9634], [110.9062, 49.6025, 144.9921, 180.0954]], "category": ["Face", "Face"], "area": [25253.426883099997, 26112.41024634]}}
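A quick sanity check over lines like these can catch shape problems before uploading. The following is a rough sketch, not AutoTrain's own validation: `check_row` is a hypothetical helper, and the rules it applies (a `file_name`, an `objects` dict whose `bbox` and `category` lists have equal length, four numbers per box, string categories) are assumptions drawn from this thread.

```python
import json


def check_row(line):
    """Return a list of problems found in one metadata.jsonl line (empty if none)."""
    problems = []
    row = json.loads(line)
    if "file_name" not in row:
        problems.append("missing file_name")
    objects = row.get("objects")
    if not isinstance(objects, dict):
        return problems + ["objects must be a dict"]
    bboxes = objects.get("bbox", [])
    categories = objects.get("category", [])
    # Each bounding box needs exactly one category label.
    if len(bboxes) != len(categories):
        problems.append("bbox and category lengths differ")
    for box in bboxes:
        if len(box) != 4 or not all(isinstance(v, (int, float)) for v in box):
            problems.append(f"bad bbox: {box}")
    if not all(isinstance(c, str) for c in categories):
        problems.append("category entries should be strings")
    return problems


sample = '{"file_name": "a.jpg", "objects": {"bbox": [[1, 2, 3, 4]], "category": ["Face"]}}'
print(check_row(sample))
```

Running it over every line of a `metadata.jsonl` file and printing only the non-empty results gives a short report of rows worth fixing.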
all the issues are now resolved. you need only whats in the updated docs. area is not needed. ive tested it locally too.
please make sure you are on latest version
@abhishekkrthakur Thanks! It looks like this issue is fixed, but I have now run into a different one during the training run, I will open a new issue for it.
Prerequisites
Backend
Hugging Face Space/Endpoints
Interface Used
UI
CLI Command
No response
UI Screenshots & Parameters
Error Logs
Downloading data: 0%| | 0/802 [00:00<?, ?files/s]
Downloading data: 100%|██████████| 802/802 [00:00<00:00, 17761.12files/s]
Downloading data: 0%| | 0/203 [00:00<?, ?files/s]
Downloading data: 100%|██████████| 203/203 [00:00<00:00, 21729.93files/s]
Generating train split: 0 examples [00:00, ? examples/s]
Generating train split: 629 examples [00:00, 6266.56 examples/s]
Generating train split: 799 examples [00:00, 6202.62 examples/s]
Generating validation split: 0 examples [00:00, ? examples/s]
Generating validation split: 200 examples [00:00, 5944.98 examples/s]
Saving the dataset (0/1 shards): 0%| | 0/799 [00:00<?, ? examples/s]
Saving the dataset (0/1 shards): 100%|██████████| 799/799 [00:00<00:00, 5077.60 examples/s]
Saving the dataset (1/1 shards): 100%|██████████| 799/799 [00:00<00:00, 5077.60 examples/s]
Saving the dataset (1/1 shards): 100%|██████████| 799/799 [00:00<00:00, 5050.56 examples/s]
Saving the dataset (0/1 shards): 0%| | 0/200 [00:00<?, ? examples/s]
Saving the dataset (1/1 shards): 100%|██████████| 200/200 [00:00<00:00, 5317.32 examples/s]
Saving the dataset (1/1 shards): 100%|██████████| 200/200 [00:00<00:00, 5287.99 examples/s]
INFO | 2024-05-22 18:49:36 | autotrain.backends.local:create:8 - Starting local training...
INFO | 2024-05-22 18:49:36 | autotrain.commands:launch_command:372 - ['accelerate', 'launch', '--num_machines', '1', '--num_processes', '1', '--mixed_precision', 'fp16', '-m', 'autotrain.trainers.object_detection', '--training_config', 'autotrain-717ma-3oxi0/training_params.json']
INFO | 2024-05-22 18:49:36 | autotrain.commands:launch_command:373 - {'data_path': 'autotrain-717ma-3oxi0/autotrain-data', 'model': 'facebook/detr-resnet-101', 'username': 'rileybol', 'lr': 5e-05, 'epochs': 3, 'batch_size': 8, 'warmup_ratio': 0.1, 'gradient_accumulation': 1, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'train_split': 'train', 'valid_split': 'validation', 'logging_steps': -1, 'project_name': 'autotrain-717ma-3oxi0', 'auto_find_batch_size': False, 'mixed_precision': 'fp16', 'save_total_limit': 1, 'token': '**', 'push_to_hub': True, 'evaluation_strategy': 'epoch', 'image_column': 'autotrain_image', 'objects_column': 'autotrain_label', 'log': 'tensorboard', 'image_square_size': 600, 'early_stopping_patience': 5, 'early_stopping_threshold': 0.01}
INFO | 2024-05-22 18:49:36 | autotrain.backends.local:create:13 - Training PID: 154
INFO: 10.16.2.201:24345 - "POST /ui/create_project HTTP/1.1" 200 OK
INFO: 10.16.41.118:23391 - "GET /ui/is_model_training HTTP/1.1" 200 OK
The following values were not passed to `accelerate launch` and had defaults used instead:
`--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
INFO: 10.16.41.118:34820 - "GET /ui/is_model_training HTTP/1.1" 200 OK
INFO: 10.16.41.118:24196 - "GET /ui/accelerators HTTP/1.1" 200 OK
INFO:matplotlib.font_manager:generated new fontManager
INFO: 10.16.15.199:26769 - "GET /ui/is_model_training HTTP/1.1" 200 OK
INFO | 2024-05-22 18:49:45 | __main__:train:83 - Train data: Dataset({
features: ['autotrain_image', 'autotrain_objects'],
num_rows: 799
})
INFO | 2024-05-22 18:49:45 | __main__:train:84 - Valid data: Dataset({
features: ['autotrain_image', 'autotrain_objects'],
num_rows: 200
})
ERROR | 2024-05-22 18:49:45 | autotrain.trainers.common:wrapper:120 - train has failed due to an exception: Traceback (most recent call last):
File "/app/env/lib/python3.10/site-packages/autotrain/trainers/common.py", line 117, in wrapper
return func(*args, **kwargs)
File "/app/env/lib/python3.10/site-packages/autotrain/trainers/object_detection/__main__.py", line 86, in train
categories = train_data.features[config.objects_column].feature["category"].names
KeyError: 'autotrain_label'
ERROR | 2024-05-22 18:49:45 | autotrain.trainers.common:wrapper:121 - 'autotrain_label'
Additional Information
It seems that my training process gets past the last error, but now I am running into this new error.
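The error log shows the mismatch directly: the launch parameters pass `'objects_column': 'autotrain_label'`, while the loaded dataset only lists `autotrain_image` and `autotrain_objects` as features. A check along these lines would surface that before the trainer indexes the features. It is only an illustrative sketch; `missing_columns` is a hypothetical helper, not an AutoTrain function.

```python
def missing_columns(config, available):
    """Return the configured column names that are absent from the dataset."""
    wanted = {key: config[key] for key in ("image_column", "objects_column")}
    return {key: name for key, name in wanted.items() if name not in available}


# Values copied from the log in this issue.
config = {"image_column": "autotrain_image", "objects_column": "autotrain_label"}
features = ["autotrain_image", "autotrain_objects"]
print(missing_columns(config, features))
```

An empty result means both configured columns exist; anything else names the parameter and the column that the dataset does not have.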