Hi @abu3ali , you will need to make sure that your dataset follows the Pascal VOC format and directory structure:
- Annotations/
  - *.xml
- ImageSets/
  - Main/
    - test.txt
    - train.txt
    - trainval.txt
    - val.txt
- JPEGImages/
  - *.jpg
- labels.txt
It looks like right now you are missing the test.txt, train.txt, trainval.txt, and val.txt files under ImageSets/Main.
You can create a text file with your list of ImageIDs and copy it to those text files. It should look something like this, where each line is an ImageID (without the .jpg extension):
20200917-162237
20200917-162252
20200917-162314
20200917-162332
20200917-162351
20200917-162414
20200917-162429
20200917-162447
...
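If your imageIDs are just the image filenames, something like this from the terminal should generate those files for you (a rough sketch, assuming .jpg images, run from your dataset's root directory):
mkdir -p ImageSets/Main
ls JPEGImages | sed 's/\.jpg$//' > ImageSets/Main/trainval.txt
for f in train val test; do cp ImageSets/Main/trainval.txt ImageSets/Main/$f.txt; done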
You also need to create your labels.txt file.
It seems that the LabelImg tool doesn't create the full structure, but the CVAT tool comes closer, so I have changed the docs to recommend CVAT going forward. However, if you move your annotations/images around and re-create the VOC directory structure, LabelImg works fine too.
When in doubt, you can download the original Pascal VOC dataset and look at how it is structured.
Indeed, the labelImg tool does not create the full file structure. You need to do the following:
First, create a labels.txt file in the jetson-inference/python/training/detection/ssd directory (this labels.txt file should NOT have the BACKGROUND class listed, just the classes you want to train on, e.g., dog, cat, rhino)
Place the following sub-directories, populated by the labelImg tool outputs, in the same directory: Annotations, JPEGImages
Then enter the following command from the ssd directory: python vision/datasets/generate_vocdata.py ./labels.txt
Now you are ready to train with the mb1 pretrained network: python3 train_ssd.py --dataset-type=voc --model-dir=models/my-models-voc --data=./ --pretrained-ssd='models/mobilenet-v1-ssd-mp-0_675.pth' --batch-size=4 --num-epochs=50
Then convert your trained model to ONNX (first delete the labels.txt file in the ssd directory, since the training step above creates a labels.txt file for you in the directory specified by --model-dir): python3 onnx_export.py --input="./models/my-models-voc/name-of-model-you-want-to-convert.pth" --model-dir=models/my-models-voc
Good luck!
Hello @dusty-nv and @silent-code Thanks for the inputs. I was able to train the model and save it too (I resized all the training images to 300x300 before the training). But while converting to ONNX, I'm getting the following error
The mismatch in size occurs while converting. In the code, the dummy_input was created using dummy_input = torch.randn(args.batch_size, 3, args.height, args.width).cuda(). The default height and width are 300, 300, so it doesn't make sense why it was trying to convert the model with a different tensor size. Any thoughts on this?
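For context, the export boils down to something like this (a minimal sketch, not the exact script; `net` is assumed to be the already-loaded SSD model, and the input/output names match the detectnet flags used later in this thread):
import torch

net = ...  # assumed: the loaded SSD-Mobilenet model (e.g. restored from the .pth checkpoint)
dummy_input = torch.randn(1, 3, 300, 300).cuda()  # batch, channels, height, width
torch.onnx.export(net, dummy_input, "ssd-mobilenet.onnx", verbose=True,
                  input_names=["input_0"], output_names=["scores", "boxes"])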
Try deleting the labels.txt file created for training in the jetson-inference/python/training/detection/ssd directory. The onnx_export script will look in the models/your-model folder for the correct labels.txt file containing the BACKGROUND class listing.
@silent-code Thank you very much. That worked !!!!!
Glad to help!
Also remember, if you retrain afterward with more or different data for the same model, delete or rename the existing ONNX file 'mb2-ssd-lite.onnx' and the ONNX engine file 'mb2-ssd-lite.onnx.1.1.7100.GPU.FP16.engine' in the models/your-model directory before the onnx_export step. This will force detectnet to recompile the model at runtime with the new retrained weights.
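For example (a sketch, assuming the file names above and your model folder):
rm models/your-model/mb2-ssd-lite.onnx
rm models/your-model/mb2-ssd-lite.onnx.1.1.7100.GPU.FP16.engine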
Sure. Will keep that in mind.
Can you try setting IMAGES to this instead:
IMAGES=/home/jetson/jetson-inference/data/images
If there is still an error, please post the terminal log here.
@dusty-nv , I'm having a problem with running inference on a camera stream. I'm running this command: python3 detectnet.py --model=models/mymodel/ssd-mobilenet.onnx --input-blob=input_0 --output-cvg=scores --output-bbox=boxes csi://0. Sometimes the script just ends with Segmentation fault (core dumped) right at this line. Here is the console log: log.txt. Sometimes the stream opens, and as soon as any detection is found, it stops with a Segmentation fault.
Meanwhile, the inference runs well when an image/set of images is given. Is this related to the image format coming from the camera? I tried using a USB camera too, but to no avail, as it exhibits the same behaviour as the CSI camera inference. Any way to resolve this?
@07hokage I have the same issue. Did you solve it? Thanks
@mjack3 is it regarding the inference on the live video stream?
hi! I get the same error. After following the help from here, I still have issues.
I'm using a custom dataset, and had the same issue as OP where my directory was not in the proper format.
I followed @dusty-nv's fix, made the proper directories, and added the .txt files.
However, I cannot figure out how to populate labels.txt with the list of ImageIDs ?? I'm super noob :\
Is there a command from the terminal I can run to do this?
Once done, do I copy and paste this list into the other .txt docs in ImageSets/Main ?
Thank you!!!
*I used RectLabel to create my dataset, not LabelImg
labels.txt is supposed to contain the names of the classes that you are training the model to recognise, not the image IDs.
yes, you're right.
I put my classes in labels.txt.
however, aren't I supposed to create a list of the ImageIDs and copy that to the .txt files in ImageSets/Main ? Right now, my test.txt etc. are empty.
I get this error:
user@jetson: /Downloads/jetson-inference/python/training/detection/ssd$ python3 train_ssd.py --dataset-type=voc --data=/Downloads/jetson-inference/python/training/detection/ssd/data/shapes --model-dir=models/shapes
2021-07-21 21:33:07 - Using CUDA...
2021-07-21 21:33:07 - Namespace(balance_data=False, base_net=None, base_net_lr=0.001, batch_size=4, checkpoint_folder='models/', dataset_type='voc', datasets=['/Downloads/jetson-inference/python/training/detection/ssd/data/shapes'], debug_steps=10, extra_layers_lr=None, freeze_base_net=False, freeze_net=False, gamma=0.1, lr=0.01, mb2_width_mult=1.0, milestones='80,100', momentum=0.9, net='mb1-ssd', num_epochs=30, num_workers=2, pretrained_ssd='models/mobilenet-v1-ssd-mp-0_675.pth', resume=None, scheduler='cosine', t_max=100, use_cuda=True, validation_epochs=1, weight_decay=0.0005)
2021-07-21 21:33:07 - Prepare training datasets.
Traceback (most recent call last):
File "train_ssd.py", line 214, in
@goodmorningcoffee yes, you need a file containing the list of imageIDs under ImageSets/Main. As a shortcut, you can create just ImageSets/Main/default.txt with all the imageIDs, and this will be used for train/test/etc.
If your image filenames / IDs are consistent, you can probably make a bash script that creates the imageID files for you (see the sketch below). Typically the imageIDs are the image filenames without the file extension.
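Something like this would do it (a sketch, assuming .jpg images, run from the dataset's root directory):
ls JPEGImages | sed 's/\.jpg$//' > ImageSets/Main/default.txt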
When in doubt, download the original Pascal VOC dataset and inspect its structure and how it's laid out.
@dusty-nv my brother, how are you? I'm going to run the training and this error comes up, and I don't know why. Could you guide me on how to resolve it?
Can you try running export OPENBLAS_CORETYPE=ARMV8 first? Or change the numpy version.
@dusty-nv I don't understand in which part I should run this line of code: export OPENBLAS_CORETYPE=ARMV8
Hi @camilofernandez9405, you should run export OPENBLAS_CORETYPE=ARMV8 in your terminal before you run your python3 train_ssd.py command.
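For example (a sketch; substitute your own --data and --model-dir paths):
export OPENBLAS_CORETYPE=ARMV8
python3 train_ssd.py --dataset-type=voc --data=data/your-dataset --model-dir=models/your-model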
@dusty-nv I was able to fix that error, but now I have this new error, and I don't know what it could be
The path to your dataset is incorrect or your dataset is missing the image list files under ImageSets/Main
@dusty-nv I don't understand; I downloaded the images with this command: python3 open_images_downloader.py --max-images=5000 --class-name "insect" --data=data/prueba. I'm attaching an image of the files that get downloaded and of where I then run the model and it throws the error
Hi, can you help me please? I run jetson-train-main: !python train_ssd.py --dataset-type=voc --data=data --model-dir=data --batch-size=32 --workers=2 --epochs=100
The bboxes are in the Pascal VOC format, with all coordinates in their fractional form.
And I have this error:
@PARPedraza it appears that somehow it is loading the labels.txt that was already exported from train_ssd.py, not your original labels.txt from the dataset:
VOC Labels read from file: (`BACKGROUND`, `0`, `1`, ...etc)
The BACKGROUND class should not be in the labels.txt that is in your dataset's folder. That BACKGROUND class gets added by train_ssd.py and saved into your model's folder. It should not be in the labels.txt that gets loaded by train_ssd.py.
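To illustrate with the example classes from earlier in the thread: your dataset's labels.txt should contain only
dog
cat
rhino
while the labels.txt that train_ssd.py writes into models/your-model will contain
BACKGROUND
dog
cat
rhino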
I deleted the labels.txt files that were exported in other trainings, but the same error persists and I can't find the solution.
I changed the batch-size, and the process now lets me get through at least one epoch.
Is your --model-dir the same as your --data folder? It should be in a different folder. The labels.txt still shows multiple BACKGROUND classes in it. Your original data/labels.txt file shouldn't have BACKGROUND in it.
Hi Dusty, thanks for your reply! I'm using CVAT to generate the annotations and I exported the images in VOC format. I do see a "default.txt" file under /ImageSets/Main, but I still have this issue. Here is the log:
wzy@wzy-desktop:~/jetson-inference/python/training/detection/ssd$ python3 train_ssd.py --dataset-type=voc --data=data/food_bin1 --model-dir=models/food_bin --batch-size=1 --num-workers=1 --num-epochs=1
2024-04-24 15:00:51 - Using CUDA...
2024-04-24 15:00:51 - Namespace(balance_data=False, base_net=None, base_net_lr=0.001, batch_size=1, checkpoint_folder='models/food_bin', dataset_type='voc', datasets=['data/food_bin1'], debug_steps=10, extra_layers_lr=None, freeze_base_net=False, freeze_net=False, gamma=0.1, log_level='info', lr=0.01, mb2_width_mult=1.0, milestones='80,100', momentum=0.9, net='mb1-ssd', num_epochs=1, num_workers=1, pretrained_ssd='models/mobilenet-v1-ssd-mp-0_675.pth', resolution=300, resume=None, scheduler='cosine', t_max=100, use_cuda=True, validation_epochs=1, validation_mean_ap=False, weight_decay=0.0005)
2024-04-24 15:02:04 - model resolution 300x300
2024-04-24 15:02:04 - SSDSpec(feature_map_size=19, shrinkage=16, box_sizes=SSDBoxSizes(min=60, max=105), aspect_ratios=[2, 3])
2024-04-24 15:02:04 - SSDSpec(feature_map_size=10, shrinkage=32, box_sizes=SSDBoxSizes(min=105, max=150), aspect_ratios=[2, 3])
2024-04-24 15:02:04 - SSDSpec(feature_map_size=5, shrinkage=64, box_sizes=SSDBoxSizes(min=150, max=195), aspect_ratios=[2, 3])
2024-04-24 15:02:04 - SSDSpec(feature_map_size=3, shrinkage=100, box_sizes=SSDBoxSizes(min=195, max=240), aspect_ratios=[2, 3])
2024-04-24 15:02:04 - SSDSpec(feature_map_size=2, shrinkage=150, box_sizes=SSDBoxSizes(min=240, max=285), aspect_ratios=[2, 3])
2024-04-24 15:02:04 - SSDSpec(feature_map_size=1, shrinkage=300, box_sizes=SSDBoxSizes(min=285, max=330), aspect_ratios=[2, 3])
2024-04-24 15:02:04 - Prepare training datasets.
Traceback (most recent call last):
  File "train_ssd.py", line 263, in <module>
    target_transform=target_transform)
  File "/home/wzy/jetson-inference/python/training/detection/ssd/vision/datasets/voc_dataset.py", line 44, in __init__
    raise IOError(f"missing ImageSet file {image_sets_file}")
OSError: missing ImageSet file data/food_bin1/ImageSets/Main/trainval.txt
Then I manually created four .txt files:
test.txt; train.txt; trainval.txt; val.txt;
and I copied the image IDs from default.txt and pasted them into all four new files. But still, I got the same error. Can you please help me out? Thank you so much in advance!
EDIT: I found my mistake, the path to the data was wrong. There is another directory between food_bin1 and ImageSets. But I still have a question: can I manually create the four .txt files? If yes, should I copy all 65 image IDs into each of those four .txt files, or do I have to split them, maybe 20+20+20+5, across the different .txt files? Thanks!
Hi @yuwanzi123, glad you got it working - yes, you can create the 4 separate files and split the dataset between them (well, except that trainval.txt is just the train+val splits combined). Normally the split is something like 70% train, 15% val, 15% test. However your dataset is very small, so I probably wouldn't split it up until you collect more data.
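If you do split it later, a short Python sketch along these lines would work (assuming ImageSets/Main/default.txt already lists all your imageIDs; the 70/15/15 ratios are the ones mentioned above):
import random

# read all imageIDs from the combined list
with open("ImageSets/Main/default.txt") as f:
    ids = [line.strip() for line in f if line.strip()]

random.shuffle(ids)
n_train = int(0.70 * len(ids))
n_val = int(0.85 * len(ids))
train, val, test = ids[:n_train], ids[n_train:n_val], ids[n_val:]

# trainval.txt is just the train+val splits combined
for name, split in {"train": train, "val": val,
                    "trainval": train + val, "test": test}.items():
    with open(f"ImageSets/Main/{name}.txt", "w") as out:
        out.write("\n".join(split) + "\n")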
Got it, thank you so much Dusty!
hi @dusty-nv, I want to train my own dataset to detect emotions using ssd-mobilenet. I am using labelImg to label my images (Pascal VOC) and I put my dataset in ssd/data. Command to run training: python3 train_ssd.py --dataset-type=voc --data=data/14emotions --model-dir=models/14emotion --epochs=10
output:
2020-11-06 18:40:47 - Using CUDA...
2020-11-06 18:40:47 - Namespace(balance_data=False, base_net=None, base_net_lr=0.001, batch_size=4, checkpoint_folder='models/14emotion', dataset_type='voc', datasets=['data/14emotions'], debug_steps=10, extra_layers_lr=None, freeze_base_net=False, freeze_net=False, gamma=0.1, lr=0.01, mb2_width_mult=1.0, milestones='80,100', momentum=0.9, net='mb1-ssd', num_epochs=10, num_workers=2, pretrained_ssd='models/mobilenet-v1-ssd-mp-0_675.pth', resume=None, scheduler='cosine', t_max=100, use_cuda=True, validation_epochs=1, weight_decay=0.0005)
2020-11-06 18:40:47 - Prepare training datasets.
Traceback (most recent call last):
  File "train_ssd.py", line 214, in <module>
    target_transform=target_transform)
  File "/home/daoud/jetson-inference/python/training/detection/ssd/vision/datasets/voc_dataset.py", line 33, in __init__
    raise IOError("missing ImageSet file {:s}".format(image_sets_file))
TypeError: unsupported format string passed to PosixPath.__format__
What should I do to solve it? Thank you.