Closed johnlockejrr closed 1 month ago
If needed, I can upload my dataset.
I tested this on a different environment and I get the same error:
DocTR version: 0.9.1a0
TensorFlow version: N/A
PyTorch version: 2.4.1+cu121 (torchvision 0.19.1+cu121)
OpenCV version: 4.10.0
OS: Ubuntu 22.04.5 LTS
Python version: 3.10.12
Is CUDA available (TensorFlow): N/A
Is CUDA available (PyTorch): Yes
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 4070
Nvidia driver version: 560.94
cuDNN version: Could not collect
Python 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from doctr.file_utils import is_tf_available, is_torch_available
>>> print(f"is_tf_available: {is_tf_available()}")
is_tf_available: False
>>> print(f"is_torch_available: {is_torch_available()}")
is_torch_available: True
>>>
I think I made a mistake: I just realized I used the polygons from the original images, but the images in the dataset were mogrified... checking
Working now, but much slower with big images. What height or width would be recommended for training?
(env-py3.10) incognito@DESKTOP-NHKR7QL:~/doctr$ python references/detection/train_pytorch.py datasets/sam/train_out datasets/sam/val_out db_resnet50 --epochs 5 --device 0
Namespace(train_path='datasets/sam/train_out', val_path='datasets/sam/val_out', arch='db_resnet50', name=None, epochs=5, batch_size=2, device=0, save_interval_epoch=False, input_size=1024, lr=0.001, weight_decay=0, workers=None, resume=None, test_only=False, freeze_backbone=False, show_samples=False, wb=False, push_to_hub=False, pretrained=False, rotation=False, eval_straight=False, sched='poly', amp=False, find_lr=False, early_stop=False, early_stop_epochs=5, early_stop_delta=0.01)
Validation set loaded in 0.4528s (67 samples in 34 batches)
Train set loaded in 0.07876s (540 samples in 270 batches)
Training loss: 0.658518: 78%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉ | 211/270 [03:23<00:47, 1.24it/s]
EDIT: worked until killed:
(env-py3.10) incognito@DESKTOP-NHKR7QL:~/doctr$ python references/detection/train_pytorch.py datasets/sam/train_out datasets/sam/val_out db_resnet50 --epochs 5 --device 0
Namespace(train_path='datasets/sam/train_out', val_path='datasets/sam/val_out', arch='db_resnet50', name=None, epochs=5, batch_size=2, device=0, save_interval_epoch=False, input_size=1024, lr=0.001, weight_decay=0, workers=None, resume=None, test_only=False, freeze_backbone=False, show_samples=False, wb=False, push_to_hub=False, pretrained=False, rotation=False, eval_straight=False, sched='poly', amp=False, find_lr=False, early_stop=False, early_stop_epochs=5, early_stop_delta=0.01)
Validation set loaded in 0.4528s (67 samples in 34 batches)
Train set loaded in 0.07876s (540 samples in 270 batches)
Training loss: 0.643471: 100%|██████████| 270/270 [03:52<00:00, 1.16it/s]
100%|██████████| 34/34 [00:33<00:00, 1.00it/s]
Validation loss decreased inf --> 2.29916: saving state...
Epoch 1/5 - Validation loss: 2.29916 (Recall: 1.67% | Precision: 5.47% | Mean IoU: 9.00%)
0%| | 0/270 [00:20<?, ?it/s]
Traceback (most recent call last):
File "/home/incognito/doctr/references/detection/train_pytorch.py", line 481, in <module>
main(args)
File "/home/incognito/doctr/references/detection/train_pytorch.py", line 388, in main
fit_one_epoch(model, train_loader, batch_transforms, optimizer, scheduler, amp=args.amp)
File "/home/incognito/doctr/references/detection/train_pytorch.py", line 109, in fit_one_epoch
for images, targets in pbar:
File "/home/incognito/doctr/env-py3.10/lib/python3.10/site-packages/tqdm/std.py", line 1181, in __iter__
for obj in iterable:
File "/home/incognito/doctr/env-py3.10/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
data = self._next_data()
File "/home/incognito/doctr/env-py3.10/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1327, in _next_data
idx, data = self._get_data()
File "/home/incognito/doctr/env-py3.10/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1283, in _get_data
success, data = self._try_get_data()
File "/home/incognito/doctr/env-py3.10/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1131, in _try_get_data
data = self._data_queue.get(timeout=timeout)
File "/usr/lib/python3.10/queue.py", line 180, in get
self.not_empty.wait(remaining)
File "/usr/lib/python3.10/threading.py", line 324, in wait
gotit = waiter.acquire(True, timeout)
File "/home/incognito/doctr/env-py3.10/lib/python3.10/site-packages/torch/utils/data/_utils/signal_handling.py", line 67, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 44881) is killed by signal: Killed.
Images are resized internally :)
Try to reduce/set the workers with --workers=<INT_DEPENDING_ON_YOUR_MACHINE>
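A "killed by signal: Killed" message on a DataLoader worker usually means the operating system terminated the process, most often because memory ran out, which fits the advice to lower the worker count. A minimal sketch of picking a conservative value to pass to --workers; the helper name and the cap of 4 are illustrative assumptions, not part of docTR:

```python
import os


def pick_num_workers(requested=None, max_workers=4):
    """Return a conservative worker count (hypothetical helper, not a docTR API).

    requested: explicit user choice, or None to derive one from the CPU count.
    max_workers: arbitrary cap used only when no explicit value is given.
    """
    available = os.cpu_count() or 1
    if requested is None:
        # Without an explicit choice, stay well below "one worker per core".
        return max(1, min(available, max_workers))
    # Never ask for more workers than there are cores, and never go negative.
    return max(0, min(requested, available))


print(pick_num_workers())
```

The printed value can then be passed on the command line, e.g. --workers 2; lowering it further is worth trying if workers are still being killed.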
I just resized the images to x960 and recalculated the polygons, and everything goes smoothly. Anyway, my dataset is at line level; I'll give it a try :)
(env-py3.10) incognito@DESKTOP-NHKR7QL:~/doctr$ python references/detection/train_pytorch.py datasets/sam/train_out datasets/sam/val_out db_resnet50 --epochs 5 --device 0
Namespace(train_path='datasets/sam/train_out', val_path='datasets/sam/val_out', arch='db_resnet50', name=None, epochs=5, batch_size=2, device=0, save_interval_epoch=False, input_size=1024, lr=0.001, weight_decay=0, workers=None, resume=None, test_only=False, freeze_backbone=False, show_samples=False, wb=False, push_to_hub=False, pretrained=False, rotation=False, eval_straight=False, sched='poly', amp=False, find_lr=False, early_stop=False, early_stop_epochs=5, early_stop_delta=0.01)
Validation set loaded in 0.1393s (67 samples in 34 batches)
Train set loaded in 0.07748s (540 samples in 270 batches)
Training loss: 1.3698: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 270/270 [01:34<00:00, 2.86it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 34/34 [00:13<00:00, 2.55it/s]
Validation loss decreased inf --> 0.674124: saving state...
Epoch 1/5 - Validation loss: 0.674124 (Recall: 4.78% | Precision: 3.38% | Mean IoU: 5.00%)
Training loss: 0.711258: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 270/270 [01:28<00:00, 3.05it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 34/34 [00:09<00:00, 3.48it/s]
Epoch 2/5 - Validation loss: 0.817873 (Recall: 5.47% | Precision: 2.30% | Mean IoU: 3.00%)
Training loss: 0.563128: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 270/270 [01:26<00:00, 3.11it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 34/34 [00:09<00:00, 3.52it/s]
Validation loss decreased 0.674124 --> 0.632917: saving state...
Epoch 3/5 - Validation loss: 0.632917 (Recall: 16.05% | Precision: 32.59% | Mean IoU: 29.00%)
Training loss: 0.610216: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 270/270 [01:27<00:00, 3.07it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 34/34 [00:09<00:00, 3.50it/s]
Epoch 4/5 - Validation loss: 0.642417 (Recall: 21.75% | Precision: 11.35% | Mean IoU: 9.00%)
Training loss: 0.604278: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 270/270 [01:27<00:00, 3.09it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 34/34 [00:09<00:00, 3.49it/s]
Validation loss decreased 0.632917 --> 0.565686: saving state...
Epoch 5/5 - Validation loss: 0.565686 (Recall: 43.27% | Precision: 46.25% | Mean IoU: 36.00%)
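When images are resized before training, as in the run above, the polygon coordinates have to be scaled by the same factors or the labels no longer match the pixels. A minimal sketch, assuming absolute pixel coordinates stored as [x, y] pairs; the function name and the (width, height) argument order are my own convention for this example:

```python
def rescale_polygons(polygons, old_wh, new_wh):
    """Scale absolute [x, y] polygon points from the old image size to the new one.

    polygons: list of polygons, each a list of [x, y] points in pixels.
    old_wh, new_wh: (width, height) of the image before and after resizing.
    """
    old_w, old_h = old_wh
    new_w, new_h = new_wh
    sx, sy = new_w / old_w, new_h / old_h
    return [[[round(x * sx), round(y * sy)] for x, y in poly] for poly in polygons]


# Example: an image halved in both dimensions.
original = [[[100, 200], [300, 200], [300, 260], [100, 260]]]
print(rescale_polygons(original, old_wh=(1440, 1920), new_wh=(720, 960)))
```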
You should train longer :D But for only 5 epochs the metrics don't look wrong :+1:
Yes! I just wanted to be sure it runs, was a first test, I'm happy with it anyway.
Just figuring out how to add my newly trained (*ish) model to the streamlit demo app :-|
EDIT: besides my datasets being line-level, I have another problem: they are mostly RTL. Should I do anything for that to work (like python bidi etc.)? Do, let's say, Arabic or Hebrew require other features?
Curious to see how well this can work ^^
Currently we use anyascii (https://github.com/anyascii/anyascii); I think this should work!? :)
Never used it, yes, I think it should.
Seems I can't load it as per https://mindee.github.io/doctr/using_doctr/custom_models_training.html :)
You can :) You have to change the vocab with --vocab=..
See here for the predefined vocabs we have: https://github.com/mindee/doctr/blob/main/doctr/datasets/vocabs.py
The vocab should contain all the chars you have in your dataset (or more)
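One way to check that requirement before training a recognition model is to collect every character that appears in the dataset's transcriptions and compare it with the vocab string. A minimal sketch; the function name and the sample strings are made up for illustration, and the vocab would in practice be whichever string is passed via --vocab:

```python
def missing_chars(texts, vocab):
    """Return the sorted list of characters used in texts but absent from vocab."""
    seen = set("".join(texts))
    return sorted(seen - set(vocab))


# Example with made-up transcriptions and a tiny vocab.
labels = ["abc", "abd", "a b"]
print(missing_chars(labels, vocab="abc"))
```

An empty result means the vocab covers the dataset; anything listed would need to be added to the vocab string.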
Oh, sorry, I'm new to it. I mostly trained kraken, yolov8 and DocUFCN models.
But it needs a vocab for a detection model? I didn't train a recognition model yet.
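If none of the predefined vocabs fits, you can simply change https://github.com/mindee/doctr/blob/df762ed90010db4df9f4cb5692b52c2a2e5dc819/references/recognition/train_pytorch.py#L189 to vocab="abc", for example. But to load the model later you need the same string, since it defines your model's vocab :)
@johnlockejrr No, only for recognition model training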
Couldn't I load only the detection model to see how it performs on a new test image?
I just took a look at vocabs.py, and for VOCABS["hebrew"] there are more characters than the ones listed; the file should be amended sometime in the future.
If none of the predefined vocabs fits, you can simply change https://github.com/mindee/doctr/blob/df762ed90010db4df9f4cb5692b52c2a2e5dc819/references/recognition/train_pytorch.py#L189 to vocab="abc", for example. But to load the model later you need the same string, since it defines your model's vocab :)
Feel free to open a PR to add the missing chars :+1:
Can't I load only the detection model to see how it performs on a new test image?
Sure :)
Load your custom trained model (in combination with the ocr_predictor):
import torch
from doctr.models import ocr_predictor, db_resnet50

# Load custom detection model
det_model = db_resnet50(pretrained=False, pretrained_backbone=False)
det_params = torch.load('<path_to_pt>', map_location="cpu")
det_model.load_state_dict(det_params)
# Combine the custom detection model with a pretrained recognition model
predictor = ocr_predictor(det_arch=det_model, reco_arch="vitstr_small", pretrained=True)
or only with the detection_predictor:
import requests
import cv2
import numpy as np
import torch

from doctr.io import DocumentFile
from doctr.models import detection_predictor, db_resnet50
from doctr.utils.geometry import detach_scores


# Convert relative coordinates to absolute pixel values
def _to_absolute(geom, img_shape: tuple[int, int]) -> list[list[int]]:
    h, w = img_shape
    if len(geom) == 2:  # Assume straight pages = True -> [[xmin, ymin], [xmax, ymax]]
        (xmin, ymin), (xmax, ymax) = geom
        xmin, xmax = int(round(w * xmin)), int(round(w * xmax))
        ymin, ymax = int(round(h * ymin)), int(round(h * ymax))
        return [[xmin, ymin], [xmax, ymin], [xmax, ymax], [xmin, ymax]]
    else:  # For polygons, convert each point to absolute coordinates
        return [[int(point[0] * w), int(point[1] * h)] for point in geom]


url = "https://www.francetvinfo.fr/pictures/uGwaNE-aJq7zHLhZJdzdCd9nyjE/1200x900/2021/03/16/phpCDwGn0.jpg"

# Load custom detection model
det_model = db_resnet50(pretrained=False, pretrained_backbone=False)
det_params = torch.load('<path_to_pt>', map_location="cpu")
det_model.load_state_dict(det_params)

det_predictor = detection_predictor(
    arch=det_model,
    pretrained=False,
    assume_straight_pages=True,
    symmetric_pad=True,
    preserve_aspect_ratio=True,
)  #.cuda().half()  # Uncomment this line if you have a GPU
det_predictor.model.postprocessor.bin_thresh = 0.3
det_predictor.model.postprocessor.box_thresh = 0.65

docs = DocumentFile.from_images([requests.get(url).content])
results = det_predictor(docs)
image = cv2.imdecode(np.frombuffer(requests.get(url).content, np.uint8), cv2.IMREAD_COLOR)

for doc, res in zip(docs, results):
    img_shape = (doc.shape[0], doc.shape[1])
    # Detach the probability scores from the results
    detached_coords, prob_scores = detach_scores([res.get("words")])
    for i, coords in enumerate(detached_coords[0]):
        coords = coords.reshape(2, 2).tolist() if coords.shape == (4, ) else coords.tolist()
        # Convert relative to absolute pixel coordinates
        points = np.array(_to_absolute(coords, img_shape), dtype=np.int32).reshape((-1, 1, 2))
        # Draw the bounding box on the image
        cv2.polylines(image, [points], isClosed=True, color=(255, 0, 0), thickness=2)

# Save the modified image with bounding boxes
cv2.imwrite("output.jpg", image)
Perfect! Thank you for all your help! I'll open a PR later today for a new language and amend the Hebrew language.
Reference PR showing what's required to update or add a vocab: https://github.com/mindee/doctr/pull/1700/files
Very strange with my model. Executing your script above:
(env-py3.10) incognito@DESKTOP-NHKR7QL:~/doctr$ python load_det_model.py
/home/incognito/doctr/load_det_model.py:27: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
det_params = torch.load('db_resnet50_20240930-142637.pt', map_location="cpu")
Traceback (most recent call last):
File "/home/incognito/doctr/load_det_model.py", line 28, in <module>
det_model.load_state_dict(det_params)
File "/home/incognito/doctr/env-py3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2215, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for DBNet:
size mismatch for prob_head.6.weight: copying a param with shape torch.Size([64, 2, 2, 2]) from checkpoint, the shape in current model is torch.Size([64, 1, 2, 2]).
size mismatch for prob_head.6.bias: copying a param with shape torch.Size([2]) from checkpoint, the shape in current model is torch.Size([1]).
size mismatch for thresh_head.6.weight: copying a param with shape torch.Size([64, 2, 2, 2]) from checkpoint, the shape in current model is torch.Size([64, 1, 2, 2]).
size mismatch for thresh_head.6.bias: copying a param with shape torch.Size([2]) from checkpoint, the shape in current model is torch.Size([1]).
Could this happen because I trained it on a line-level dataset?
Can you share one entry from the labels.json you used for training?
Sure:
{"81_dc946_default.jpg": {"img_dimensions": [720, 960], "img_hash": "f04698acbbc7246475a8401dc031facf1d152c156cb1363217270cd7591e94d3", "polygons": {"textzone": [[[66, 153], [527, 153], [527, 709], [66, 709]]], "textline": [[[78, 161], [515, 161], [515, 188], [78, 188]], [[76, 180], [515, 180], [515, 207], [76, 207]], [[79, 201], [515, 201], [515, 229], [79, 229]], [[77, 221], [514, 221], [514, 250], [77, 250]], [[78, 242], [516, 242], [516, 273], [78, 273]], [[73, 264], [516, 264], [516, 292], [73, 292]], [[75, 287], [517, 287], [517, 313], [75, 313]], [[76, 307], [517, 307], [517, 335], [76, 335]], [[73, 327], [518, 327], [518, 356], [73, 356]], [[75, 350], [516, 350], [516, 377], [75, 377]], [[76, 388], [518, 388], [518, 417], [76, 417]], [[77, 412], [519, 412], [519, 437], [77, 437]], [[74, 434], [518, 434], [518, 457], [74, 457]], [[75, 452], [518, 452], [518, 478], [75, 478]], [[78, 472], [518, 472], [518, 499], [78, 499]], [[81, 493], [519, 493], [519, 519], [81, 519]], [[81, 514], [518, 514], [518, 540], [81, 540]], [[73, 535], [519, 535], [519, 560], [73, 560]], [[74, 556], [519, 556], [519, 581], [74, 581]], [[72, 576], [519, 576], [519, 602], [72, 602]], [[74, 596], [519, 596], [519, 624], [74, 624]], [[75, 618], [517, 618], [517, 647], [75, 647]], [[73, 637], [521, 637], [521, 666], [73, 666]], [[79, 658], [520, 658], [520, 686], [79, 686]], [[75, 680], [520, 680], [520, 714], [75, 714]]]}}, "136_7aab7_default.jpg": {"img_dimensions": [720, 960], "img_hash": "eac91c1193e188f4dd089705086e3e3dfd6bc5233d5ceb714c6082684a64ab06", "polygons": {"textzone": [[[183, 174], [621, 174], [621, 722], [183, 722]]], "textline": [[[188, 181], [615, 181], [615, 211], [188, 211]], [[187, 206], [614, 206], [614, 231], [187, 231]], [[184, 226], [613, 226], [613, 252], [184, 252]], [[188, 246], [614, 246], [614, 274], [188, 274]], [[188, 268], [615, 268], [615, 291], [188, 291]], [[189, 287], [615, 287], [615, 315], [189, 315]], [[188, 308], [614, 308], [614, 335], [188, 
335]], [[188, 329], [616, 329], [616, 355], [188, 355]], [[187, 349], [616, 349], [616, 375], [187, 375]], [[186, 372], [616, 372], [616, 397], [186, 397]], [[186, 390], [616, 390], [616, 417], [186, 417]], [[188, 429], [618, 429], [618, 455], [188, 455]], [[189, 450], [619, 450], [619, 477], [189, 477]], [[189, 471], [619, 471], [619, 498], [189, 498]], [[189, 491], [619, 491], [619, 517], [189, 517]], [[190, 512], [618, 512], [618, 538], [190, 538]], [[190, 533], [620, 533], [620, 558], [190, 558]], [[189, 553], [619, 553], [619, 577], [189, 577]], [[192, 574], [616, 574], [616, 599], [192, 599]], [[191, 594], [620, 594], [620, 620], [191, 620]], [[191, 613], [619, 613], [619, 638], [191, 638]], [[193, 633], [619, 633], [619, 660], [193, 660]], [[190, 655], [620, 655], [620, 680], [190, 680]], [[189, 673], [619, 673], [619, 700], [189, 700]], [[186, 694], [618, 694], [618, 729], [186, 729]]]}},
...
Better, I can upload the labels.json of val because it is smaller than train.
Ah, I see you trained a KIE model :sweat_smile:
To train only a detection model, polygons shouldn't be a dict; the value should be only the list of polygons, like:
"polygons": [[[66, 153], [527, 153], [527, 709], [66, 709]], .....]
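Converting a labels.json from the per-class dict layout to the flat list layout described above can be done in a few lines. This is a minimal sketch, assuming the file has the same structure as the entries shown in this thread; the function name and the keep_classes parameter are hypothetical:

```python
def kie_to_detection_labels(labels, keep_classes=None):
    """Flatten {"polygons": {class: [polys]}} entries into {"polygons": [polys]}.

    labels: dict loaded from labels.json (image name -> entry).
    keep_classes: optional collection of class names to keep; None keeps all.
    """
    converted = {}
    for image_name, entry in labels.items():
        polygons = entry["polygons"]
        if isinstance(polygons, dict):
            flat = [
                poly
                for class_name, polys in polygons.items()
                if keep_classes is None or class_name in keep_classes
                for poly in polys
            ]
        else:
            flat = polygons  # Already in the flat detection layout
        converted[image_name] = {**entry, "polygons": flat}
    return converted


sample = {
    "page.jpg": {
        "img_dimensions": [720, 960],
        "polygons": {
            "textzone": [[[66, 153], [527, 153], [527, 709], [66, 709]]],
            "textline": [[[78, 161], [515, 161], [515, 188], [78, 188]]],
        },
    }
}
print(kie_to_detection_labels(sample, keep_classes={"textline"}))
```

In practice the input would come from json.load on the existing file and the result would be written back with json.dump; passing keep_classes={"textline"} drops the textzone polygons.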
OMG! :)
I think this wasn't planned, right? ^^
For a detection model, can't I specify more class names? I have textzones and textlines.
Or better, should I just remove the textzone class and keep the textlines?
You can also load this model with:
det_model = db_resnet50(pretrained=False, pretrained_backbone=False, class_names=['textzone', 'textline'])
det_params = torch.load('<path_to_pt>', map_location="cpu")
det_model.load_state_dict(det_params)
Bad day :)
/home/incognito/doctr/load_det_model-kie.py:28: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
det_params = torch.load('db_resnet50_20240930-142637.pt', map_location="cpu")
Traceback (most recent call last):
File "/home/incognito/doctr/load_det_model-kie.py", line 50, in <module>
detached_coords, prob_scores = detach_scores([res.get("words")])
File "/home/incognito/doctr/doctr/utils/geometry.py", line 79, in detach_scores
loc_preds, obj_scores = zip(*(_detach(box) for box in boxes))
File "/home/incognito/doctr/doctr/utils/geometry.py", line 79, in <genexpr>
loc_preds, obj_scores = zip(*(_detach(box) for box in boxes))
File "/home/incognito/doctr/doctr/utils/geometry.py", line 75, in _detach
if boxes.ndim == 2:
AttributeError: 'NoneType' object has no attribute 'ndim'
I think I should re-train it :)
The error is on the line detached_coords, prob_scores = detach_scores([res.get("words")])
If it is a KIE model, shouldn't I use from doctr.models import kie_predictor?
I changed the line to detached_coords, prob_scores = detach_scores([res.get("textline")])
But I get nothing: the script runs but there are no detections.
detached_coords -> [array([], shape=(0, 4), dtype=float32)]
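The AttributeError above suggests that, for a model loaded with custom class_names, the per-page result has no "words" key, so res.get("words") returns None. A defensive sketch that inspects whichever classes are present instead of hard-coding one; it assumes each result is a dict mapping class name to an array-like of boxes, and plain lists stand in for arrays here:

```python
def non_empty_classes(result):
    """Return {class_name: boxes} for the classes that have at least one box.

    result: dict mapping class name -> array-like of boxes (may be empty or None).
    """
    return {
        name: boxes
        for name, boxes in result.items()
        if boxes is not None and len(boxes) > 0
    }


# Stand-in for one page's result; real values would be NumPy arrays.
page_result = {"textzone": [], "textline": [[0.1, 0.2, 0.5, 0.25, 0.9]]}
print(list(page_result))                     # which class names are available
print(list(non_empty_classes(page_result)))  # which of them have detections
```

If every class comes back empty, as with the (0, 4) array above, the thresholds or the training data are the next things to check rather than the key name.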
I reconverted my data to:
{"215_67426_default.jpg": {"img_dimensions": [720, 960], "img_hash": "f4da2a0dcdcd28dbc08609bac090f465ee5d7b471fa42024da0a11e79acade60", "polygons": [[[72, 162], [514, 162], [514, 194], [72, 194]], [[69, 188], [514, 188], [514, 216], [69, 216]], [[69, 209], [514, 209], [514, 238], [69, 238]], [[69, 231], [514, 231], [514, 259], [69, 259]], [[69, 251], [514, 251], [514, 283], [69, 283]], [[70, 274], [515, 274], [515, 299], [70, 299]], [[70, 293], [515, 293], [515, 322], [70, 322]], [[69, 314], [516, 314], [516, 340], [69, 340]], [[69, 335], [516, 335], [516, 364], [69, 364]], [[67, 355], [516, 355], [516, 386], [67, 386]], [[69, 392], [517, 392], [517, 427], [69, 427]], [[70, 420], [514, 420], [514, 447], [70, 447]], [[70, 441], [517, 441], [517, 468], [70, 468]], [[70, 462], [517, 462], [517, 493], [70, 493]], [[70, 483], [518, 483], [518, 511], [70, 511]], [[77, 504], [519, 504], [519, 534], [77, 534]], [[65, 526], [520, 526], [520, 555], [65, 555]], [[69, 547], [519, 547], [519, 578], [69, 578]], [[69, 570], [521, 570], [521, 598], [69, 598]], [[71, 590], [520, 590], [520, 619], [71, 619]], [[65, 612], [521, 612], [521, 642], [65, 642]], [[70, 635], [521, 635], [521, 663], [70, 663]], [[70, 660], [522, 660], [522, 684], [70, 684]], [[66, 677], [522, 677], [522, 703], [66, 703]], [[70, 698], [522, 698], [522, 727], [70, 727]], [[67, 716], [199, 716], [199, 741], [67, 741]]]}, "545_4408b_default.jpg": {"img_dimensions": [720, 960], "img_hash": "21c0f7326a7821b77b2a5e49e76017e60555dd40670005863a20a13d2803748d", "polygons": [[[107, 179], [507, 179], [507, 207], [107, 207]], [[107, 200], [510, 200], [510, 226], [107, 226]], [[105, 220], [509, 220], [509, 245], [105, 245]], [[109, 243], [510, 243], [510, 262], [109, 262]], [[106, 259], [510, 259], [510, 282], [106, 282]], [[106, 277], [510, 277], [510, 301], [106, 301]], [[106, 299], [510, 299], [510, 319], [106, 319]], [[103, 315], [510, 315], [510, 338], [103, 338]], [[103, 333], [510, 333], [510, 358], [103, 358]], 
[[101, 354], [510, 354], [510, 379], [101, 379]], [[104, 373], [509, 373], [509, 398], [104, 398]], [[101, 390], [510, 390], [510, 416], [101, 416]], [[103, 412], [511, 412], [511, 431], [103, 431]], [[104, 430], [511, 430], [511, 455], [104, 455]], [[101, 450], [510, 450], [510, 475], [101, 475]], [[104, 469], [510, 469], [510, 495], [104, 495]], [[104, 489], [509, 489], [509, 514], [104, 514]], [[104, 507], [510, 507], [510, 533], [104, 533]], [[104, 528], [510, 528], [510, 553], [104, 553]], [[103, 549], [511, 549], [511, 572], [103, 572]], [[103, 565], [509, 565], [509, 591], [103, 591]], [[103, 584], [511, 584], [511, 611], [103, 611]], [[101, 602], [511, 602], [511, 629], [101, 629]], [[99, 622], [512, 622], [512, 650], [99, 650]], [[105, 660], [512, 660], [512, 693], [105, 693]], [[103, 684], [202, 684], [202, 710], [103, 710]]]},
I'll retrain :)
(env-py3.10) incognito@DESKTOP-NHKR7QL:~/doctr$ python references/detection/train_pytorch.py datasets/sam/train_out datasets/sam/val_out db_resnet50 --epochs 10 --device 0
Namespace(train_path='datasets/sam/train_out', val_path='datasets/sam/val_out', arch='db_resnet50', name=None, epochs=10, batch_size=2, device=0, save_interval_epoch=False, input_size=1024, lr=0.001, weight_decay=0, workers=None, resume=None, test_only=False, freeze_backbone=False, show_samples=False, wb=False, push_to_hub=False, pretrained=False, rotation=False, eval_straight=False, sched='poly', amp=False, find_lr=False, early_stop=False, early_stop_epochs=5, early_stop_delta=0.01)
Validation set loaded in 0.1427s (67 samples in 34 batches)
Train set loaded in 0.0208s (540 samples in 270 batches)
Training loss: 0.29681: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 270/270 [01:06<00:00, 4.07it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 34/34 [00:09<00:00, 3.59it/s]
Validation loss decreased inf --> 0.362736: saving state...
Epoch 1/10 - Validation loss: 0.362736 (Recall: 98.02% | Precision: 85.08% | Mean IoU: 65.00%)
Training loss: 0.321628: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 270/270 [01:02<00:00, 4.29it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 34/34 [00:06<00:00, 5.49it/s]
Epoch 2/10 - Validation loss: 0.372804 (Recall: 95.15% | Precision: 84.16% | Mean IoU: 63.00%)
Training loss: 0.406969: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 270/270 [01:03<00:00, 4.24it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 34/34 [00:06<00:00, 5.43it/s]
Validation loss decreased 0.362736 --> 0.33441: saving state...
Epoch 3/10 - Validation loss: 0.33441 (Recall: 92.34% | Precision: 75.74% | Mean IoU: 52.00%)
Training loss: 0.508775: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 270/270 [01:02<00:00, 4.29it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 34/34 [00:06<00:00, 5.54it/s]
Epoch 4/10 - Validation loss: 0.354248 (Recall: 98.68% | Precision: 80.43% | Mean IoU: 64.00%)
Training loss: 0.389871: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 270/270 [01:03<00:00, 4.28it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 34/34 [00:06<00:00, 5.54it/s]
Validation loss decreased 0.33441 --> 0.316777: saving state...
Epoch 5/10 - Validation loss: 0.316777 (Recall: 98.68% | Precision: 89.18% | Mean IoU: 70.00%)
Training loss: 0.36966: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 270/270 [01:02<00:00, 4.30it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 34/34 [00:06<00:00, 5.60it/s]
Validation loss decreased 0.316777 --> 0.308347: saving state...
Epoch 6/10 - Validation loss: 0.308347 (Recall: 97.19% | Precision: 81.19% | Mean IoU: 59.00%)
Training loss: 0.31847: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 270/270 [01:03<00:00, 4.25it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 34/34 [00:06<00:00, 5.49it/s]
Validation loss decreased 0.308347 --> 0.285198: saving state...
Epoch 7/10 - Validation loss: 0.285198 (Recall: 98.08% | Precision: 87.41% | Mean IoU: 67.00%)
Training loss: 0.202373: 11%|███████████████████████▊ | 31/270 [00:08<01:05, 3.67it/s]
Traceback (most recent call last):
File "/home/incognito/doctr/references/detection/train_pytorch.py", line 481, in <module>
main(args)
File "/home/incognito/doctr/references/detection/train_pytorch.py", line 388, in main
fit_one_epoch(model, train_loader, batch_transforms, optimizer, scheduler, amp=args.amp)
File "/home/incognito/doctr/references/detection/train_pytorch.py", line 109, in fit_one_epoch
for images, targets in pbar:
File "/home/incognito/doctr/env-py3.10/lib/python3.10/site-packages/tqdm/std.py", line 1181, in __iter__
for obj in iterable:
File "/home/incognito/doctr/env-py3.10/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
data = self._next_data()
File "/home/incognito/doctr/env-py3.10/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1324, in _next_data
return self._process_data(data)
File "/home/incognito/doctr/env-py3.10/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1370, in _process_data
data.reraise()
File "/home/incognito/doctr/env-py3.10/lib/python3.10/site-packages/torch/_utils.py", line 706, in reraise
raise exception
UnboundLocalError: Caught UnboundLocalError in DataLoader worker process 15.
Original Traceback (most recent call last):
File "/home/incognito/doctr/env-py3.10/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 309, in _worker_loop
data = fetcher.fetch(index) # type: ignore[possibly-undefined]
File "/home/incognito/doctr/env-py3.10/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/incognito/doctr/env-py3.10/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 52, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/incognito/doctr/doctr/datasets/datasets/base.py", line 67, in __getitem__
img_transformed, target[class_name] = self.sample_transforms(img, bboxes)
File "/home/incognito/doctr/doctr/transforms/modules/base.py", line 56, in __call__
x, target = t(x, target)
File "/home/incognito/doctr/env-py3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/incognito/doctr/env-py3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/home/incognito/doctr/doctr/transforms/modules/pytorch.py", line 87, in forward
target[:, [0, 2]] = offset[0] + target[:, [0, 2]] * raw_shape[-1] / img.shape[-1]
UnboundLocalError: local variable 'offset' referenced before assignment
I resumed it and it finished:
(env-py3.10) incognito@DESKTOP-NHKR7QL:~/doctr$ python references/detection/train_pytorch.py datasets/sam/train_out datasets/sam/val_out db_resnet50 --epochs 5 --device 0 --resume ./db_resnet50_20240930-162432.pt --workers 2
Namespace(train_path='datasets/sam/train_out', val_path='datasets/sam/val_out', arch='db_resnet50', name=None, epochs=5, batch_size=2, device=0, save_interval_epoch=False, input_size=1024, lr=0.001, weight_decay=0, workers=2, resume='./db_resnet50_20240930-162432.pt', test_only=False, freeze_backbone=False, show_samples=False, wb=False, push_to_hub=False, pretrained=False, rotation=False, eval_straight=False, sched='poly', amp=False, find_lr=False, early_stop=False, early_stop_epochs=5, early_stop_delta=0.01)
Validation set loaded in 0.1605s (67 samples in 34 batches)
Resuming ./db_resnet50_20240930-162432.pt
/home/incognito/doctr/references/detection/train_pytorch.py:228: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
checkpoint = torch.load(args.resume, map_location="cpu")
Train set loaded in 0.07673s (540 samples in 270 batches)
Training loss: 0.342384: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 270/270 [01:04<00:00, 4.20it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 34/34 [00:08<00:00, 4.06it/s]
Validation loss decreased inf --> 0.333333: saving state...
Epoch 1/5 - Validation loss: 0.333333 (Recall: 98.32% | Precision: 84.99% | Mean IoU: 64.00%)
Training loss: 0.285108: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 270/270 [01:02<00:00, 4.35it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 34/34 [00:05<00:00, 6.64it/s]
Validation loss decreased 0.333333 --> 0.298129: saving state...
Epoch 2/5 - Validation loss: 0.298129 (Recall: 97.84% | Precision: 90.08% | Mean IoU: 67.00%)
Training loss: 0.241384: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 270/270 [01:01<00:00, 4.40it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 34/34 [00:05<00:00, 6.66it/s]
Validation loss decreased 0.298129 --> 0.234458: saving state...
Epoch 3/5 - Validation loss: 0.234458 (Recall: 98.80% | Precision: 81.85% | Mean IoU: 71.00%)
Training loss: 0.238148: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 270/270 [01:01<00:00, 4.37it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 34/34 [00:05<00:00, 6.72it/s]
Epoch 4/5 - Validation loss: 0.238532 (Recall: 98.50% | Precision: 86.95% | Mean IoU: 75.00%)
Training loss: 0.237705: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 270/270 [01:02<00:00, 4.34it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 34/34 [00:05<00:00, 6.62it/s]
Validation loss decreased 0.234458 --> 0.20468: saving state...
Epoch 5/5 - Validation loss: 0.20468 (Recall: 98.98% | Precision: 89.64% | Mean IoU: 80.00%)
That's a known issue; a PR to fix it is on the way :) https://github.com/mindee/doctr/pull/1715 CC @odulcy-mindee
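For readers unfamiliar with this error class, here is a minimal sketch of how an `UnboundLocalError` like the one in the traceback can arise. The function names and the `symmetric_pad` flag are hypothetical stand-ins, not docTR's actual source and not necessarily what the linked PR changes; the sketch only shows the general pattern of a variable assigned in one branch and read unconditionally.

```python
def rescale_x(x, raw_width, out_width, symmetric_pad):
    """Hypothetical stand-in (not docTR code): remap a relative x coordinate."""
    if symmetric_pad:
        # `offset` is only assigned on this branch.
        offset = (out_width - raw_width) / 2 / out_width
    # When symmetric_pad is False, `offset` was never bound, so this line raises.
    return offset + x * raw_width / out_width


def rescale_x_fixed(x, raw_width, out_width, symmetric_pad):
    """Same hypothetical function with a default so every path binds `offset`."""
    offset = 0.0
    if symmetric_pad:
        offset = (out_width - raw_width) / 2 / out_width
    return offset + x * raw_width / out_width


try:
    rescale_x(0.5, 512, 1024, symmetric_pad=False)
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

fixed = rescale_x_fixed(0.5, 512, 1024, symmetric_pad=False)
print(error_name, fixed)
```

Because the failing branch depends on the sample being processed, an error of this kind can show up only for some images, which would be consistent with the crash appearing partway through an epoch.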
It performs well (*ish) with your script above, but any idea why it identifies only one line?
What's the shape of the model output?
Btw, in the script I provided, lower bin_thresh and box_thresh to 0.1.
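To illustrate why lowering these two thresholds can surface more lines, here is a toy sketch in plain Python. It is a simplification under stated assumptions, not docTR's post-processor: `binarize` and `keep_box` are hypothetical helpers, the probability map is made up, and 0.3 is just an arbitrary higher value for comparison.

```python
def binarize(prob_map, bin_thresh):
    """Turn a probability map into a 0/1 mask (toy version of binarization)."""
    return [[1 if p >= bin_thresh else 0 for p in row] for row in prob_map]


def keep_box(prob_map, box, box_thresh):
    """Keep a box if its mean probability reaches box_thresh.

    box = (x0, y0, x1, y1) in pixel indices, end-exclusive.
    """
    x0, y0, x1, y1 = box
    values = [prob_map[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(values) / len(values) >= box_thresh


# Made-up map: one confident text line (row 0) and one faint line (row 2).
prob_map = [
    [0.8, 0.8, 0.8],
    [0.0, 0.0, 0.0],
    [0.2, 0.2, 0.2],
]

# Count rows that still contain foreground after binarization.
lines_high = sum(any(row) for row in binarize(prob_map, 0.3))
lines_low = sum(any(row) for row in binarize(prob_map, 0.1))

faint_box = (0, 2, 3, 3)  # covers the faint line only
kept_low = keep_box(prob_map, faint_box, 0.1)
kept_high = keep_box(prob_map, faint_box, 0.5)
print(lines_high, lines_low, kept_low, kept_high)
```

In this toy case the faint line disappears at the higher binarization threshold and survives at 0.1, which is the kind of effect the suggestion above is aiming for with an undertrained model.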
I trained the model on x960 images. When detecting, should I use the same resolution?
If you resized it yourself beforehand, that would make sense, yep.
I resized the image to x960. I think it needs more training.
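When images are resized outside the training script, the polygon annotations have to be scaled by the same factor, otherwise they no longer match the image (the mismatch mentioned earlier in this thread). Below is a minimal sketch, assuming "x960" means a fixed height of 960 px with the aspect ratio preserved and absolute pixel coordinates for polygons; the helper names and the sample numbers are hypothetical.

```python
def resize_to_height(width, height, target_height=960):
    """Return (new_width, new_height, scale) for an aspect-preserving resize."""
    scale = target_height / height
    return round(width * scale), target_height, scale


def scale_polygon(polygon, scale):
    """Scale absolute (x, y) pixel coordinates by the same factor as the image."""
    return [(x * scale, y * scale) for x, y in polygon]


# Made-up example: a 3000x1920 page and one text-line polygon.
new_w, new_h, scale = resize_to_height(3000, 1920)
polygon = [(100, 200), (300, 200), (300, 260), (100, 260)]
scaled = scale_polygon(polygon, scale)
print(new_w, new_h, scale, scaled)
```

If the labels are stored as relative coordinates (values between 0 and 1), they are unaffected by a uniform resize and this step is unnecessary.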
Bug description
While training a Doctr model with my own dataset, I encountered an UnboundLocalError in the compute_loss function of the differentiable_binarization module.
Code snippet to reproduce the bug
python references/detection/train_pytorch.py datasets/sam/train_out datasets/sam/val_out db_resnet50 --epochs 5 --device 0
Error traceback
Environment
DocTR version: 0.9.1a0
TensorFlow version: N/A
PyTorch version: 2.4.1+cu121 (torchvision 0.19.1+cu121)
OpenCV version: 4.10.0
OS: Ubuntu 22.04.5 LTS
Python version: 3.10.12
Is CUDA available (TensorFlow): N/A
Is CUDA available (PyTorch): Yes
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3060
Nvidia driver version: 561.09
cuDNN version: Could not collect
Deep Learning backend