weecology / DeepForest

Python Package for Airborne RGB machine learning
https://deepforest.readthedocs.io/
MIT License

Rename the xml_to_annotations to read_pascal_voc #699

Closed bw4sz closed 3 days ago

bw4sz commented 1 month ago

A key goal of DeepForest 2.0 is to better connect with the large universe of computer vision packages and reduce things that are particular to DeepForest. The deepforest.utilities.xml_to_annotations function looks like it handles a custom data format, but really it's just the common Pascal VOC XML format.

https://roboflow.com/formats/pascal-voc-xml

The format is described at the link above. The current function reads it like this:

>>> from deepforest import get_data
>>> from deepforest.utilities import xml_to_annotations
>>> get_data("OSBS_029.xml")
'/orange/ewhite/b.weinstein/miniconda3/envs/MillionTrees/lib/python3.10/site-packages/deepforest/data/OSBS_029.xml'
>>> xml_to_annotations(get_data("OSBS_029.xml"))
      image_path  xmin  ymin  xmax  ymax label
0   OSBS_029.tif   203    67   227    90  Tree
1   OSBS_029.tif   256    99   288   140  Tree
2   OSBS_029.tif   166   253   225   304  Tree
3   OSBS_029.tif   365     2   400    27  Tree
4   OSBS_029.tif   312    13   349    47  Tree
..           ...   ...   ...   ...   ...   ...
56  OSBS_029.tif    60   292    96   332  Tree
57  OSBS_029.tif    89   362   114   390  Tree
58  OSBS_029.tif   236   132   253   152  Tree
59  OSBS_029.tif   316   174   346   214  Tree
60  OSBS_029.tif   220   208   251   244  Tree

globox is a nice tool for converting between object detection annotation formats:

https://github.com/laclouis5/globox

It reads the XML just fine:

import globox
df2 = globox.Annotation.from_pascal_voc(get_data("OSBS_029.xml"))

To do

Mu-Magdy commented 3 weeks ago

Hi @bw4sz

Do you want to use globox instead of xml_to_annotations, or do you want to keep the current code in xml_to_annotations, rename it to read_pascal_voc, and add warnings?

bw4sz commented 3 weeks ago

I think we want to:

1. Keep the current code and add a 2.0 deprecation warning: https://github.com/weecology/DeepForest/blob/f1449053932104627f5b38785fb876d380b6cad9/deepforest/preprocess.py#L188
2. Copy the function and rename it.
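
A minimal sketch of that plan, under stated assumptions rather than the actual DeepForest code. The parser body is a dependency-free stand-in that returns a list of dicts (the real function returns a DataFrame), and the warning text is hypothetical. It also has the old name delegate to the new one instead of keeping a literal copy of the code, which is a small variation on step 2.

```python
import warnings
import xml.etree.ElementTree as ET


def read_pascal_voc(xml_path):
    """Parse a Pascal VOC XML file into a list of bounding-box records.

    Stand-in body for illustration only: it returns plain dicts, whereas
    the real DeepForest function returns a pandas DataFrame.
    """
    root = ET.parse(xml_path).getroot()
    image_path = root.findtext("filename")
    records = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        records.append(
            {
                "image_path": image_path,
                # Some annotation tools write coordinates as floats.
                "xmin": int(float(box.findtext("xmin"))),
                "ymin": int(float(box.findtext("ymin"))),
                "xmax": int(float(box.findtext("xmax"))),
                "ymax": int(float(box.findtext("ymax"))),
                "label": obj.findtext("name"),
            }
        )
    return records


def xml_to_annotations(xml_path):
    """Old name, kept so existing code keeps working while it migrates."""
    warnings.warn(
        # Hypothetical wording; the real message may differ.
        "xml_to_annotations is deprecated; use read_pascal_voc instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return read_pascal_voc(xml_path)
```

With this shape, existing calls to xml_to_annotations keep returning the same result and emit a DeprecationWarning that points at the caller, while new code can call read_pascal_voc directly.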