AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Script to convert annotation xml files to .txt file #864

Open kmsravindra opened 6 years ago

kmsravindra commented 6 years ago

@AlexeyAB,

I have used darkflow before and want to switch to your library, as it performs very well and offers a lot of options. I already have several annotations in .xml format; the content looks like the sample below. Is there a script that can convert them to .txt files without having to use Yolo_mark? Otherwise I can write one myself, if you can tell me what I need to take care of. Your help is much appreciated! (Attached image: sample_xml)

dexception commented 6 years ago

I am not sure how comfortable you are with Java. But here is the code:

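    import java.io.File;
    import java.io.IOException;
    import java.util.ArrayList;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.ParserConfigurationException;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;
    import org.xml.sax.SAXException;

    // Both methods below are meant to sit inside a class of your choice.
    public static ArrayList<String> parseAndReturnData(String path) throws ParserConfigurationException, SAXException, IOException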
    {
        ArrayList<String> boxList= new ArrayList<String>();

        File file = new File(path);
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document doc = db.parse(file);
        doc.getDocumentElement().normalize();

        System.out.println("Root element :" + doc.getDocumentElement().getNodeName());

        NodeList sizeList = doc.getElementsByTagName("size");
        int width = -1;
        int height = -1;

        System.out.println("Size tags:"+sizeList.getLength());
        NodeList sizeChildNodes = sizeList.item(0).getChildNodes();

        for(int j=0;j< sizeChildNodes.getLength();j++)
        {
            if(sizeChildNodes.item(j).getNodeName().equals("width"))
                width = Integer.parseInt(sizeChildNodes.item(j).getTextContent());

            if(sizeChildNodes.item(j).getNodeName().equals("height"))
                height = Integer.parseInt(sizeChildNodes.item(j).getTextContent());         
        }

        System.out.println("width:"+width);
        System.out.println("height:"+height);

        NodeList objectList = doc.getElementsByTagName("object");
        System.out.println("Object Length:"+objectList.getLength());

        for (int temp = 0; temp< objectList.getLength(); temp++)
        {
         Node objectNode = objectList.item(temp);
         if (objectNode.getNodeType() == Node.ELEMENT_NODE)
         {
          Element objectElement = (Element) objectNode;
          System.out.println("Current Node: "+objectElement.getNodeName());
          NodeList childObjectNodes = objectElement.getChildNodes();

          for(int z=0;z<childObjectNodes.getLength();z++)
          {
              Node childObjectNode = childObjectNodes.item(z);
              System.out.println(childObjectNode.getNodeName());

          if(childObjectNode.getNodeName().equals("bndbox"))
          {           
            NodeList bndboxChildNodes = childObjectNode.getChildNodes();

            int xmin=-1,ymin=-1,xmax=-1,ymax=-1;

            for(int temp2 = 0; temp2< bndboxChildNodes.getLength();temp2++)
            {
                Node bndboxChildNode = bndboxChildNodes.item(temp2);
                 if (bndboxChildNode.getNodeType() == Node.ELEMENT_NODE )
                 {
                     Element e2 = (Element) bndboxChildNode;
                     if(e2.getNodeName().equals("xmin"))
                         xmin = Integer.parseInt(e2.getTextContent());

                     else if(e2.getNodeName().equals("ymin"))
                         ymin = Integer.parseInt(e2.getTextContent());

                     else if(e2.getNodeName().equals("xmax"))
                         xmax = Integer.parseInt(e2.getTextContent());

                     else if(e2.getNodeName().equals("ymax"))
                         ymax = Integer.parseInt(e2.getTextContent());
                 }
            }

             if(xmin!=-1 && ymin!=-1 && xmax!=-1 && ymax!=-1)
             {
                 // The class id is hard-coded to 0 (single-class dataset).
                 String yoloLine = "0 "+convert(width, height, xmin, ymin, xmax, ymax);
                 boxList.add(yoloLine);
             }
             else
             {
                 System.out.println("Could not find proper values");
                 System.out.println(path);
                 System.exit(1);
             }
          }
         }
        }
        }

        return boxList;
    }

    // Converts a pixel-space box into YOLO's relative "x_center y_center width height".
    public static String convert(int width, int height, int xmin, int ymin, int xmax, int ymax)
    {
        double dw = 1.0/width;
        double dh = 1.0/height;
        double x = (xmin+xmax)/2.0;
        double y = (ymin+ymax)/2.0;
        double w = xmax - xmin;
        double h = ymax - ymin;
        x = x * dw;
        w = w * dw;
        y = y * dh;
        h = h * dh;
        return x+" "+y+" "+w+" "+h; 
    }

AlexeyAB commented 6 years ago

Your xml file looks like the default Pascal VOC format, so you should try: https://github.com/AlexeyAB/darknet/blob/master/scripts/voc_label.py

YOLO format: class_id relative_x_center relative_y_center relative_width relative_height

The txt file should look like this:

1 0.716797 0.395833 0.216406 0.147222
0 0.687109 0.379167 0.255469 0.158333
1 0.420312 0.395833 0.140625 0.166667

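Conversion (each value is divided by the image size in pixels so that it is relative):

yolo_x = (xmin + xmax) / 2.0 / image_width
yolo_y = (ymin + ymax) / 2.0 / image_height
yolo_width = (xmax - xmin) / image_width
yolo_height = (ymax - ymin) / image_height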
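The conversion can be sketched in Python as follows. This is a minimal illustration rather than the code from voc_label.py: the function name `voc_box_to_yolo` and its parameters are hypothetical, and it assumes the box corners and the image size are given in pixels.

```python
def voc_box_to_yolo(img_w, img_h, xmin, ymin, xmax, ymax, class_id=0):
    """Return one YOLO label line for a Pascal VOC box given in pixels.

    Hypothetical helper for illustration; class_id defaults to 0.
    """
    # Box center and size in pixels, then divided by the image size
    # so that every value is relative (between 0 and 1).
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / float(img_w)
    height = (ymax - ymin) / float(img_h)
    return "%d %.6f %.6f %.6f %.6f" % (class_id, x_center, y_center, width, height)


if __name__ == "__main__":
    # A 100x40 px box centered at (100, 40) in a 200x100 px image.
    print(voc_box_to_yolo(200, 100, 50, 20, 150, 60))
```

Each returned string is one line of the image's .txt label file.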

kmsravindra commented 6 years ago

Here is the Python code I wrote, in case anyone is interested: https://github.com/kmsravindra/useful_scripts/blob/master/Convert_PascalVOC_XML_to_darknet_text.py

Note: I have not found a 'class' attribute for an object in the Pascal VOC xml file (please let me know if I am missing something). So for now I have hardcoded the class as 0, since I only have to deal with a single class. If you have multiple classes, one workaround is to modify the script above to create a new class for each unique object_name read from the xml file. If you have any better suggestions, let me know. Thanks!
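In Pascal VOC files the class is usually stored as text in each object's `<name>` element rather than as an attribute, so the multi-class workaround can be sketched as below. This is an assumption-laden illustration, not the linked script: the function `voc_xml_to_yolo_lines` and the `classes` list are hypothetical, and new names are given ids in order of first appearance. For real training, the ids must match the order of the names file used by darknet, so passing in a fixed list is safer.

```python
import xml.etree.ElementTree as ET


def voc_xml_to_yolo_lines(xml_text, classes):
    """Convert one Pascal VOC annotation (as a string) to YOLO label lines.

    `classes` is a list of class names; unseen names are appended to it,
    so the list index doubles as the class id (hypothetical convention).
    """
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)

    lines = []
    for obj in root.iter("object"):
        name = obj.find("name").text.strip()
        if name not in classes:
            classes.append(name)  # first appearance defines the id
        class_id = classes.index(name)

        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        lines.append("%d %.6f %.6f %.6f %.6f" % (
            class_id,
            (xmin + xmax) / 2.0 / img_w,
            (ymin + ymax) / 2.0 / img_h,
            (xmax - xmin) / img_w,
            (ymax - ymin) / img_h,
        ))
    return lines
```

Reusing the same `classes` list across all annotation files keeps the ids consistent from one file to the next.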

kmsravindra commented 6 years ago

@AlexeyAB, a quick question: if I zero-pad my images with a black border, can I keep the same relative width and relative height for the labelled object when recalculating the annotations (on the assumption that zero-padding the borders does not change them)? Then the only thing I would need to take care of is replacing "yolo_x" and "yolo_y" with the recomputed coordinates. Am I correct?

AlexeyAB commented 6 years ago

@kmsravindra If the image size in pixels changes, then you should change (x, y, w, h) of the objects.

Can you provide example images, before and after?

kmsravindra commented 6 years ago

Please find the images below.

  1. The red box is the bounding box.
  2. The green circle is the object.
  3. The black border is the padding added to all input images (of different aspect ratios and sizes) to bring them to a uniform size and aspect ratio.

So I need to transform the annotations made on the input images accordingly when the images are converted to a uniform size and aspect ratio.

Previous image - previous_image

Padded image with black border (notice that the aspect ratio and the size of the overall image changed, but the image contents and the object are not distorted). So, as you mentioned, maybe I need to recompute all of x_yolo, y_yolo, yolo_width and yolo_height with the new dimensions of the transformed image? padded_image

AlexeyAB commented 6 years ago

Yes, in this case you should recompute all of them: x_yolo, y_yolo, yolo_width and yolo_height.

Because the image size has changed, the relative sizes change too.
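A small Python sketch of that recomputation, assuming the image is only padded (not resized) and that the padding in pixels on each side is known. The function name `pad_yolo_box` and its argument layout are hypothetical.

```python
def pad_yolo_box(box, old_size, pad):
    """Recompute a relative YOLO box after padding the image.

    box      -- (x_center, y_center, width, height), relative to the old image
    old_size -- (width, height) of the old image in pixels
    pad      -- (left, top, right, bottom) padding in pixels
    Assumes padding only, with no scaling of the image contents.
    """
    x_c, y_c, w, h = box
    old_w, old_h = old_size
    left, top, right, bottom = pad

    new_w = old_w + left + right
    new_h = old_h + top + bottom

    # Go back to pixels, shift the center by the left/top padding,
    # then normalize by the new image size.
    new_x_c = (x_c * old_w + left) / float(new_w)
    new_y_c = (y_c * old_h + top) / float(new_h)
    new_box_w = (w * old_w) / float(new_w)
    new_box_h = (h * old_h) / float(new_h)
    return new_x_c, new_y_c, new_box_w, new_box_h


if __name__ == "__main__":
    # 100x100 image padded with 50 px on the left and right.
    print(pad_yolo_box((0.5, 0.5, 0.5, 0.5), (100, 100), (50, 0, 50, 0)))
```

In the example, the box keeps its pixel size, but its relative width drops from 0.5 to 0.25 because the image became twice as wide.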

kmsravindra commented 6 years ago

Thank you very much

sinaikh commented 6 years ago

Hi,

I downloaded a dataset that contains only .xml files (VOC format) and JPEG images. I am trying to convert it to YOLO format using this script: https://github.com/AlexeyAB/darknet/blob/master/scripts/voc_label.py

but I get this error:

    C:\Users\Mascia\Desktop\deep_learning\VOCdevkit>python.exe voc_label.py
    Traceback (most recent call last):
      File "voc_label.py", line 50, in <module>
        image_ids = open('VOCdevkit/VOC%s/ImageSets/Main/%s.txt'%(year, image_set)).read().strip().split()
    IOError: [Errno 2] No such file or directory: 'VOCdevkit/VOC2012/ImageSets/Main/train.txt'

What should I do? Thank you for your help.

sharoseali commented 6 years ago

@sinaikh It seems you have downloaded the test set rather than the training set. Download it from the link below and try again: http://pjreddie.com/media/files/VOCtrainval_06-Nov-2007.tar
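The traceback shows that the script reads a list of image ids from ImageSets/Main/train.txt. For a custom dataset that has the annotations but no such list, one possible workaround is to generate the file from the annotation file names. The sketch below is an assumption, not part of darknet: the helper `write_image_set` is hypothetical, and it assumes the xml files sit in an `Annotations` folder under the VOC root, as in the standard VOCdevkit layout.

```python
from pathlib import Path


def write_image_set(voc_root, image_set="train"):
    """Write ImageSets/Main/<image_set>.txt listing every annotated image id.

    voc_root is a folder such as VOCdevkit/VOC2012 that contains an
    Annotations/ subfolder with one .xml file per image (assumed layout).
    """
    voc_root = Path(voc_root)
    # An image id is the annotation file name without its extension.
    ids = sorted(p.stem for p in (voc_root / "Annotations").glob("*.xml"))

    out_dir = voc_root / "ImageSets" / "Main"
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / (image_set + ".txt")).write_text("\n".join(ids) + "\n")
    return ids
```

This puts every annotated image into a single set; splitting into train and validation lists is left out of the sketch.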

MuhammadAsadJaved commented 5 years ago

@kmsravindra @dexception Have you written a script for this task? I have a similar problem. I have cropped my original images, and now I want to update the original annotation files to match the cropped regions. I convert one image into four new images by cropping it into equal-sized parts. How can I convert one original annotation file into four new annotation files according to the cropped regions? I have both XML and txt annotation files.
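One way to approach the cropping case is sketched below in Python, under stated assumptions: the image is cut into a 2x2 grid of equal tiles, boxes are given in pixels (as in the XML files), and a box that crosses a tile border is clipped and kept in every tile it touches. The function `split_boxes_into_quadrants` is hypothetical, and whether small clipped fragments should be kept is a choice left to the dataset owner.

```python
def split_boxes_into_quadrants(boxes, img_w, img_h):
    """Split pixel boxes of one image into YOLO lines for four equal crops.

    boxes -- list of (class_id, xmin, ymin, xmax, ymax) in pixels
    Returns a dict mapping tile index (0 = top-left, 1 = top-right,
    2 = bottom-left, 3 = bottom-right) to a list of YOLO label lines.
    """
    tile_w = img_w / 2.0
    tile_h = img_h / 2.0
    tiles = {}
    for row in range(2):
        for col in range(2):
            x0, y0 = col * tile_w, row * tile_h  # tile origin in the original image
            lines = []
            for class_id, xmin, ymin, xmax, ymax in boxes:
                # Clip the box to the tile and shift it into tile coordinates.
                cx0 = max(xmin, x0) - x0
                cy0 = max(ymin, y0) - y0
                cx1 = min(xmax, x0 + tile_w) - x0
                cy1 = min(ymax, y0 + tile_h) - y0
                if cx1 <= cx0 or cy1 <= cy0:
                    continue  # the box does not overlap this tile
                lines.append("%d %.6f %.6f %.6f %.6f" % (
                    class_id,
                    (cx0 + cx1) / 2.0 / tile_w,
                    (cy0 + cy1) / 2.0 / tile_h,
                    (cx1 - cx0) / tile_w,
                    (cy1 - cy0) / tile_h,
                ))
            tiles[row * 2 + col] = lines
    return tiles
```

Each list in the result corresponds to the .txt label file of one cropped image.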