MhmdDimassi opened 5 years ago
From what I've read, SIMRDWN requires the input to be either 416x416 or 544x544. Are the images you are working with already labelled? I'm guessing they are. xView (a dataset consisting of satellite images labelled with bounding boxes) has published official scripts for chipping the large satellite images up. You can look at that script to see how they went about splitting the labels up with the images. It consists of logical_ands and logical_ors to see which label falls within which chip. I'll provide a link to their repo. https://github.com/DIUx-xView/data_utilities
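Roughly, the label-splitting logic in those scripts works like the sketch below: keep the boxes whose centers fall inside the chip window, then shift them into chip coordinates and clip to the chip boundary. Function and variable names here are illustrative, not the actual xView data_utilities API.

```python
import numpy as np

def boxes_in_chip(boxes, x0, y0, chip_size):
    """Return the boxes whose centers fall inside the chip window.

    boxes: (N, 4) array-like of [xmin, ymin, xmax, ymax] in full-image
    pixel coordinates; (x0, y0) is the chip's top-left corner.
    """
    boxes = np.asarray(boxes, dtype=float)
    cx = (boxes[:, 0] + boxes[:, 2]) / 2.0
    cy = (boxes[:, 1] + boxes[:, 3]) / 2.0
    # logical_and chains like these are what the xView scripts use to
    # decide which label falls within which chip
    inside = np.logical_and(
        np.logical_and(cx >= x0, cx < x0 + chip_size),
        np.logical_and(cy >= y0, cy < y0 + chip_size),
    )
    # shift kept boxes into the chip's coordinate frame
    kept = boxes[inside] - np.array([x0, y0, x0, y0])
    # clip boxes that extend past the chip boundary
    return np.clip(kept, 0, chip_size)
```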
I've been attempting to work with the xView dataset and get SIMRDWN able to train on it, but I haven't had much luck yet. I'm working on reordering the tfrecords created by the xView script into the format needed by SIMRDWN.
Best of luck with figuring out how to create the chips!
Thank you for your response! Could you point me to a reference that gives the recommended input size for SIMRDWN? Also, I have used the xView dataset for SIMRDWN training, and at the moment I have acceptable results. Note that I am working with yolt2 instead of SSD TensorFlow, and this is an example of testing.
So if you run into any problem I can help with, feel free to ask me any question. Thanks!
The parse_cowc.py script parses out large images into smaller training windows. This is easily adaptable to xView in combination with the scripts in the xView repository. I'll try to put up some of my xView parsing scripts next week. In the meantime, here is an example of using SIMRDWN on xView.
@avanetten Thank you, I think this is very useful. If you can help me, I have another clarifying question: the yolt2 config directory contains different cfg files (ave_dense, ave_dense32x32, ave_dense_darch, yolo.cfg, etc.). Can you propose a methodology for choosing between them, or suggest an appropriate one for my training (building detection)? Thanks!
@avanetten If you are able to put up your xView parsing scripts, that would help tremendously. I am very new to tensorflow. I decided to try TF since I have been working with xView, and SIMRDWN looked like the best model to use. And I am having trouble parsing and formatting the xView images in the format SIMRDWN needs.
@MhmdDimassi was there any special formatting you needed to do before being able to run SIMRDWN on the xView images? I have the tfrecords the xView scripts produced, but as I said, I'm very new to tf, so I've never used a tfrecord or a model written in tf before.
Also, @MhmdDimassi what command did you use to run SIMRDWN on the xView images? I'm using
python simrdwn.py --framework yolt2 --mode train --outname yolt2_xview --yolt_cfg_file ave_standard.cfg --weight_file yolo.weights --label_map_path ../../../data_utilities/xview_class_labels.pbtxt --train_tf_record ../../../data_utilities/xview_train_t1.record --max_batches 30000 --batch_size 16 --gpu 5
but I'm not sure whether I'm leaving out needed arguments. I used a mixture of the commands listed in the readme, but there are many more arguments I haven't used.
When I run using that command, it (seemingly) hangs at the start saying
Num images = 16, i= 0 N ims: 0 Num iters: 30000
Sorry for so many questions, I'm just eager to get SIMRDWN working.
Hello @mseals1, don't worry about asking questions; I hope I can help you. First, you should know that if you use yolt2 or yolt3, you don't need to use tfrecords. As listed in the readme, they are optional:
"Create .tfrecord (optional)
If the tensorflow object detection API models are being run, we must transform the training data into the .tfrecord format. This is accomplished via the simrdwn/core/preprocess_tfrecords.py script."
So transforming the data into tfrecords is not needed, at least for yolt. Accordingly, my training command looks like:
python /simrdwn/core/simrdwn.py \
  --framework yolt2 \
  --mode train \
  --outname xxxxx \
  --yolt_cfg_file ave_dense.cfg \
  --weight_dir /simrdwn/yolt2/input_weights \
  --weight_file yolo.weights \
  --yolt_train_images_list_file ImagesList.txt \
  --label_map_path /simrdwn/data/class_labels_building.pbtxt \
  --nbands 3 \
  --max_batches 60000 \
  --batch_size 4 \
  --subdivisions 1 \
  --gpu 0
Note that you should be careful about the paths; in particular, "yolt_train_images_list_file" should be in "/simrdwn/train_data/". You can also add various additional arguments.
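For anyone unsure what that images-list file contains: following the usual YOLO/YOLT convention, it is just one image path per line, with a matching label .txt file for each image. A minimal sketch of generating it (directory layout and file extensions here are assumptions, not SIMRDWN's documented API):

```python
from pathlib import Path

def write_images_list(image_dir, out_file):
    """Write one absolute image path per line, the format yolt expects
    for --yolt_train_images_list_file. Assumes .png chips; adjust the
    glob pattern for your own data.
    """
    paths = sorted(Path(image_dir).glob("*.png"))
    Path(out_file).write_text(
        "\n".join(str(p.resolve()) for p in paths) + "\n"
    )
    return len(paths)
```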
This is very helpful, thank you! How do you go about transforming xView into YOLO format, i.e. the ImagesList.txt and class_labels_building.pbtxt? That's the only thing I don't really understand at this point. The official xView scripts only parse the images and labels into a tfrecords file, unless I missed the script that actually splits images and labels into individual files.
I have split the xView data using my own code, so please show me an example of your labels .txt file (a few lines is enough).
Currently I just have the geojson, which contains all of the xView label info. So, should I split all the xView images into 416x416 or 544x544 chips, then write a script based on the official xView label-splitting code to split the labels according to the chip size and write them to .txt files named after the chipped images?
The geojson is the one that comes with the dataset when you download it.
The 416 and 544 image sizes are the ones listed in the README for SIMRDWN.
Yes, that's it! After you create all the label .txt files from the geojson, you can use the code convert.py to convert the coordinates into YOLO format. You can find the code at this link: https://github.com/ManivannanMurugavel/YOLO-Annotation-Tool/blob/master/convert.py
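The conversion itself is short: YOLO wants a normalized (x_center, y_center, width, height) tuple instead of pixel corner coordinates. A sketch along the lines of the convert() helper in that repo (treat the exact argument order as an assumption and check it against the linked file):

```python
def convert(size, box):
    """Convert a pixel box to YOLO's normalized format.

    size: (image_width, image_height)
    box:  (xmin, xmax, ymin, ymax) in pixels
    Returns (x_center, y_center, width, height), each in [0, 1].
    """
    dw = 1.0 / size[0]
    dh = 1.0 / size[1]
    x = (box[0] + box[1]) / 2.0   # box center, x
    y = (box[2] + box[3]) / 2.0   # box center, y
    w = box[1] - box[0]           # box width in pixels
    h = box[3] - box[2]           # box height in pixels
    return (x * dw, y * dh, w * dw, h * dh)
```

Each line of a label .txt then becomes "class_id x_center y_center width height" for one box.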
Thank you so much! This is extremely helpful. I realized that since I already have the chipped up images in tfrecords, I can use the tfrecords to write out the chipped images and labels to text files, then use the convert function in your code to convert the already-chipped-up labels to YOLO format and use that for SIMRDWN input. I'll try to get that working today, and I'll let you know how it goes.
Thank you again! You have been so helpful.
You're welcome, and good luck!
Just an update: I did not get xView working with SIMRDWN. I was trying to read the tfrecords I had already made and write them out to individual files (an image and a label file for each chip), but it took more than 48 hours, and that was on a 20-image subsample of the dataset. I don't know enough about TF to speed it up; it was only using part of the GPU.
Since that wasn't feasible at all, I switched to using the DOTA dataset. It already has a YOLOv2 model that works on the dataset. I'm currently preparing the dataset correctly to get it working with YOLOv2. Once I get that working, the formatted dataset should work with SIMRDWN, since it uses the YOLOv2 formatting.
I originally chipped DOTA into 1024x1024 chips, which gave me a seg fault running on a Tesla K80 GPU. So right now I'm chipping it into 416x416 images to test what batch size I can use with smaller image chips.
I know this is sort of off topic, but wanted to give an update of what I'm trying.
Hello, I am working on building detection from satellite images, and I think docker/yolt is the best fit for my project. However, I work with large images (9351x9351 pixels). Should these images be used directly as input for the training phase, or should I segment them into smaller ones? If the latter, can you propose any annotation tools (for labeling) that can handle such large images? Thank you!