Open neel04 opened 2 years ago
Also getting this, strangely enough after some modifications 🤔
FileNotFoundError: [Errno 2] No such file or directory: '/content/runs/train/yolor_p6/precision-recall_curve.png'
Did you solve this? I have the same problem, too. :(
I have the same problem, too. :(
In case anybody still encounters the same issue:
I fixed the initial issue of float indices by simply casting cls to an integer wherever it is used as a list index:
[...]
"box_caption": "%s %.3f" % (names[int(cls)], conf),
[...]
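To illustrate why the cast is needed, here is a minimal sketch. The class names and values are made up for the example; the only assumption is that cls arrives as a Python float, which is what pred.tolist() yields for a float tensor.

```python
# Hypothetical class names and a single detection, for illustration only.
names = ["person", "car", "dog"]
cls, conf = 1.0, 0.87  # floats, as produced by pred.tolist() on a float tensor

# Indexing a list with a float raises a TypeError.
try:
    names[cls]
    error_message = ""
except TypeError as err:
    error_message = str(err)

# Casting to int first makes the lookup valid.
caption = "%s %.3f" % (names[int(cls)], conf)
print(error_message)
print(caption)
```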
Before moving on, while we are in the W&B logging part (lines 161-169): the names variable might need to be a dict for some versions of wandb (apparently; at least for me there was an error). To fix the issue before it arises, edit the code to something like this:
# W&B logging
if plots and len(wandb_images) < log_imgs:
    box_data = [{"position": {"minX": xyxy[0], "minY": xyxy[1], "maxX": xyxy[2], "maxY": xyxy[3]},
                 "class_id": int(cls),
                 "box_caption": "%s %.3f" % (names[int(cls)], conf),
                 "scores": {"class_score": conf},
                 "domain": "pixel"} for *xyxy, conf, cls in pred.tolist()]
    # if necessary, create a dict using list indices as keys, so it can be queried almost exactly like a list
    if isinstance(names, list):
        names_dict = {idx: val for idx, val in enumerate(names)}
        boxes = {"predictions": {"box_data": box_data, "class_labels": names_dict}}
    else:
        boxes = {"predictions": {"box_data": box_data, "class_labels": names}}
    wandb_images.append(wandb.Image(img[si], boxes=boxes, caption=path.name))
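The list-to-dict conversion can also be pulled out into a small helper, which keeps the logging code shorter. This is only a sketch: as_class_labels is a hypothetical name, not something that exists in the repo or in wandb.

```python
def as_class_labels(names):
    """Return class names as a dict keyed by integer class index.

    Hypothetical helper: accepts either a list of names or an
    existing index-to-name dict and normalizes both to a dict.
    """
    if isinstance(names, dict):
        return {int(idx): str(name) for idx, name in names.items()}
    return {idx: str(name) for idx, name in enumerate(names)}


print(as_class_labels(["person", "car"]))
print(as_class_labels({0: "person", 1: "car"}))
```

With that in place, the boxes dict could be built in one line using "class_labels": as_class_labels(names), regardless of which type names has.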
The second issue (FileNotFoundError: [Errno 2] No such file or directory: 'runs/train/<run_dir>/precision-recall_curve.png') is the result of no validation being performed when training runs for fewer than 3 epochs. This is defined in train.py, line 336 (if epoch >= 3:). If the test() method from test.py has not been called before, the relevant images are never created and thus do not exist.
At least that was the case for me. I assume you reduced your number of epochs for test runs?
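If short test runs are needed anyway, one defensive option is to only pick up plot files that actually exist before logging them. The following is a sketch under that assumption; existing_plots is a hypothetical helper and is not part of train.py.

```python
import tempfile
from pathlib import Path


def existing_plots(save_dir, filenames):
    """Return only those plot paths under save_dir that exist on disk.

    Hypothetical helper: skips files such as precision-recall_curve.png
    when test() was never run and the image was therefore never written.
    """
    save_dir = Path(save_dir)
    return [save_dir / name for name in filenames if (save_dir / name).exists()]


# Small demonstration with a temporary run directory.
with tempfile.TemporaryDirectory() as run_dir:
    (Path(run_dir) / "results.png").touch()
    found = existing_plots(run_dir, ["results.png", "precision-recall_curve.png"])
    found_names = [p.name for p in found]
    print(found_names)
```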
I encountered several more issues down the road. I had compatibility issues with PyTorch v1.12, which were easily resolved thanks to the code provided in #270.
I also had to adjust the number of classes for my custom data, and subsequently the number of filters in several layers of the architecture, as described in the respective <architecture>.cfg files. Examples can be found in #16 and #251.
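As a rough guide for the filter count: darknet-style cfg files commonly use filters = (classes + 5) * anchors_per_layer for the layers feeding a detection head, with 3 anchors per layer. Whether every layer in a given <architecture>.cfg follows this convention is an assumption here, so check the comments in the cfg file and the linked issues before relying on it.

```python
def yolo_filters(num_classes, anchors_per_layer=3):
    """Filter count for a layer feeding a YOLO detection head.

    Assumes the common darknet convention: each anchor predicts
    4 box coordinates + 1 objectness score + one score per class.
    """
    return (num_classes + 5) * anchors_per_layer


print(yolo_filters(80))  # COCO-sized class count
print(yolo_filters(1))   # single-class custom dataset
```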
Finally, there was another issue in utils/plot.py that kept me busy, but it might also be a compatibility issue with PyTorch v1.12. I kept getting (illogical) errors for a list-type object that was somehow a CUDA Tensor, but should not be. Somewhere under the hood, some data is not properly converted. So in the method output_to_target() (lines 89-108), the target variable is not a simple list, but a CUDA Tensor (or includes CUDA Tensors). These MUST be moved to CPU memory. So I ended up editing the _tensor.py file in my PyTorch installation.
The Tensor class has a method __array__() used for implicit type casting (lines 753-761 in my installation). I added the following code ahead of the if-clauses handling the two possible return statements:
if self.is_cuda:
    self = self.cpu()
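A less invasive alternative to patching the PyTorch installation might be to move the data to CPU at the call site, before it is converted to a NumPy array. The sketch below is not the fix described above; to_cpu is a hypothetical helper, written duck-typed (it calls .cpu() on anything that has such a method) so that it runs without importing torch. With real tensors, the same call moves CUDA tensors to CPU memory and leaves CPU tensors unchanged.

```python
def to_cpu(obj):
    """Recursively move anything with a .cpu() method to CPU memory.

    Hypothetical helper: handles plain lists, tuples and dicts that may
    contain tensors; every other object is returned unchanged.
    """
    if isinstance(obj, (list, tuple)):
        return type(obj)(to_cpu(item) for item in obj)
    if isinstance(obj, dict):
        return {key: to_cpu(value) for key, value in obj.items()}
    cpu = getattr(obj, "cpu", None)
    return cpu() if callable(cpu) else obj


class FakeTensor:
    """Stand-in for a tensor, used only to demonstrate the helper."""

    def __init__(self, device):
        self.device = device

    def cpu(self):
        return FakeTensor("cpu")


targets = to_cpu([FakeTensor("cuda"), 3.5, (FakeTensor("cuda"),)])
print([getattr(t, "device", t) for t in targets[:2]])
```

In output_to_target(), the idea would be to pass the affected variable through to_cpu() before any NumPy conversion.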
I hope that covers all issues you might have. I thought it would be good to write a small summary of my problems today, so others won't have to waste half a day. Have a good one! :)
Hi, thanks for such a great repo! 🤗 I wanted to train YOLOr on my own custom data. This is the command I am using, in Colab:
However, just after training finishes I get this error:
This is a sample of what the bounding boxes in the dataset look like:
where 1 is the class index, presumably. Does anyone know what may be the cause of this error?