PRBonn / phenobench

PhenoBench Development Kit
https://www.phenobench.org

Phenobench-evaluator does not detect folders in the zip file #2

Open · niqbal996 opened this issue 8 months ago

niqbal996 commented 8 months ago

Hi, I am trying to upload a submission to the plant detection competition on Codalab. My submission is not being processed correctly. I have tried to validate it with the phenobench-evaluator using the following command:

```console
$ phenobench-validator --task plant_detection --phenobench_dir /mnt/d/datasets/phenobench/ --zipfile dummy.zip
Validating zip archive "DETA_submission_50.zip".

 ========== plant_detection ==========
  1. Checking filename.............. ✓
  2. Checking directory structure...

  >>> Error: Directory "plant_bboxes" missing inside zip file. Your zip file should contain following folders: plant_bboxes
```

I also reproduced the example with the Python source at phenobench/src/phenobench/tools/validator.py and noticed that the line https://github.com/PRBonn/phenobench/blob/0edc128ef7f67c8c6577554c7d1a2e382e2ea81f/src/phenobench/tools/validator.py#L67 returns an empty list for my zip file. I have attached a dummy zip file here for reproducibility: dummy.zip. If I understand the instructions at https://codalab.lisn.upsaclay.fr/competitions/14178#learn_the_details-evaluation correctly, the zip file has the correct folder structure, but the validator detects no folder named plant_bboxes. Maybe this is caused by a different zipfile version?

Phenobench version: 0.1.0

I could open a pull request if that is indeed not the expected behaviour. Thank you.
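
For reference, here is a minimal sketch (using only Python's standard zipfile module; the folder name is from my dummy.zip) of why two archives with the same layout can look different to a validator that checks for explicit directory entries:

```python
import zipfile

with zipfile.ZipFile("dummy.zip") as zf:
  names = zf.namelist()

# Some tools (e.g., `zip -r`) store explicit directory entries such as
# "plant_bboxes/"; others store only the file paths. A check that looks
# for names ending in "/" then finds no folders at all.
explicit_dirs = {n for n in names if n.endswith("/")}

# Directories implied by file paths are always present if files are nested.
implied_dirs = {n.rsplit("/", 1)[0] for n in names if "/" in n}

print("explicit:", explicit_dirs)  # may be empty, e.g. set()
print("implied: ", implied_dirs)   # e.g. {'plant_bboxes'}
```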

jbehley commented 8 months ago

Hi @niqbal996,

  1. Thanks for the information; I can confirm that the validator doesn't correctly recognize the folder. I have to investigate why this doesn't work (anymore?).
  2. Regarding your submission, I noticed that the text files in your plant_bboxes folder are just empty. Therefore, the evaluation script correctly reports an AP of 0.0. I can remove the "non-intentional" and obviously wrong submissions, but please make sure that your code actually writes the files (maybe you have to close the files to trigger the writing? See the sketch below).
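
As a minimal sketch of that second point (the filename and values here are made up for illustration), writing the predictions inside a `with` block guarantees that the file is closed and flushed before you zip it:

```python
# Hypothetical detections: (class_id, cx, cy, w, h, confidence), normalized to [0, 1].
detections = [(1, 0.9665, 0.8585, 0.0669, 0.0901, 0.8832)]

# The `with` block closes (and thereby flushes) the file even on exceptions;
# a file that is never closed may end up empty in the uploaded zip.
with open("phenoBench_00000.txt", "w") as f:
  for cid, cx, cy, w, h, conf in detections:
    f.write(f"{cid} {cx} {cy} {w} {h} {conf}\n")
```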
niqbal996 commented 8 months ago

Hi @jbehley, the dummy file I attached was just to reproduce the directory error. I have checked my submissions on Codalab and they do contain detections in the text files. But I can also send you the original submission zip file (by email?); if you could help me fix the error in the label files, that would be really helpful.

jbehley commented 8 months ago

No need to send me the files; I can access the submission via Codalab and have a look. I'm not sure what goes wrong there; I will have to visualize the detections and see whether it's a problem with our evaluation script.

I will probably have some time in the next few days to look into it, so expect a bit of delay.

jbehley commented 7 months ago

I finally managed to look into the issue. Sorry that it took a bit longer.

  1. I can confirm: there seems to be a bug in the validator, but I currently have no clue why your zip file does not contain a dedicated directory entry, which is what causes the error. This is something you don't have to worry about, though (a possible workaround is sketched at the end of this comment).

  2. I could identify two problems with the submission files that you provided to Codalab:

    a. Your class_id seems to be wrong (crop should be 1, weed should be 2); you have 0 as the class id, which corresponds to the background.

    b. Your bounding boxes are wrongly located; the centers are always close to zero. As a sanity check, you should always draw your predictions (this is what I get for a prediction file from your last submission):

[screenshot: the submitted bounding boxes drawn on an image]

Hope that helps to debug your code.
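
Regarding point 1, a possible workaround (just a sketch, not the actual validator code) is to derive the directories from the member paths instead of relying on explicit directory entries:

```python
import zipfile


def zip_directories(zip_path: str) -> set:
  """Collect all directories in a zip archive, whether or not the
  archive stores explicit directory entries."""
  dirs = set()
  with zipfile.ZipFile(zip_path) as zf:
    for name in zf.namelist():
      # Record explicit directory entries (names ending in "/") ...
      if name.endswith("/"):
        dirs.add(name.rstrip("/"))
      # ... and every parent directory implied by a file path.
      parts = name.rstrip("/").split("/")[:-1]
      for i in range(1, len(parts) + 1):
        dirs.add("/".join(parts[:i]))
  return dirs


# e.g., zip_directories("dummy.zip") should then contain "plant_bboxes".
```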

niqbal996 commented 6 months ago

Hi @jbehley, thanks for looking into the submission. I have tried it again with the correct class IDs as you mentioned: 1 for crops and 2 for weeds.

Regarding the second point, I am still a bit confused about the description on the challenge page (Plant counting challenge). According to the competition description, each text file (e.g., phenoBench_00000.txt) should contain one line per detection in the following YOLO-based file format: `class_id x y width height confidence`, where all values are separated by a space and the values for x, y, width, and height are normalized to the image size (i.e., values between 0 and 1), following the YOLO format from Ultralytics YOLOv8. Example entries in the text file would be:

```
1 0.9665313959121704 0.8584926128387451 0.06693726778030396 0.09014678001403809 0.8831634
2 0.006839825306087732 0.19019357860088348 0.013286247849464417 0.016970008611679077 0.8164061
```

and all values have to be multiplied by 1024, i.e., the image size (hence the values close to 0), to get the absolute values of the bounding boxes. I visualized it locally and I get the following boxes:

[screenshot: my locally visualized predictions, 14.03.2024]

Could you have a look and let me know if I misunderstood something? Thank you.

jbehley commented 5 months ago

Yes, all values need to be divided by the image width/height, which in our case is 1024; thus your file should look something like this:

```
1 0.77832 0.105988 0.34668 0.210388 0.98291
1 0.824219 0.880371 0.0839844 0.124023 0.975586
1 0.806641 0.429688 0.100586 0.0737305 0.973633
1 0.2453 0.237305 0.370728 0.365723 0.973633
1 0.808594 0.646973 0.0849609 0.0859375 0.97168
1 0.260254 0.10611 0.346191 0.209656 0.969727
1 0.795898 0.247803 0.0917969 0.081543 0.960938
1 0.268127 0.708496 0.268433 0.30957 0.95752
1 0.291809 0.476562 0.272827 0.237305 0.937988
1 0.285889 0.897949 0.191895 0.202148 0.932617
2 0.921875 0.472656 0.0195312 0.0161133 0.883789
2 0.529297 0.657715 0.0283203 0.0214844 0.823242
```

which are taken from the YOLOv7 results that we published at https://github.com/PRBonn/phenobench-baselines (only the high-confidence detections).

In your latest submission, the width/height values do not seem to be normalized to [0, 1] and therefore cannot be scored correctly.
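
For completeness, converting absolute pixel boxes to the expected normalized values is just a division per value; a small sketch (the box values here are made up for illustration, assuming 1024x1024 images):

```python
IMG_W = IMG_H = 1024  # PhenoBench images are 1024 x 1024 pixels


def to_normalized(cx_px, cy_px, w_px, h_px, img_w=IMG_W, img_h=IMG_H):
  """Convert an absolute pixel center/size to YOLO-style values in [0, 1]."""
  return cx_px / img_w, cy_px / img_h, w_px / img_w, h_px / img_h


# e.g., a box centered at pixel (797, 109) with size (355, 215):
print(to_normalized(797, 109, 355, 215))  # ~ (0.778, 0.106, 0.347, 0.210)
```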

I'm currently working on a viewer to show the predictions/results conveniently, but I currently have a lot of other duties with higher priority. In the meantime, I read the bounding boxes with the following code:

```python
from typing import Dict, List


def read_bboxes(filename: str, img_width=1024, img_height=1024) -> List[Dict]:
  bboxes = []

  with open(filename) as f:
    for line in f.readlines():
      # Each line: class_id cx cy width height confidence (all normalized).
      cid, cx, cy, w, h, conf = line.strip().split(" ")[:6]
      bboxes.append(
        {
          "label": int(cid),
          # De-normalize back to pixel coordinates.
          "center": (float(cx) * img_width, float(cy) * img_height),
          "width": float(w) * img_width,
          "height": float(h) * img_height,
          "confidence": float(conf),
        }
      )

  return bboxes
```

and then use the draw_bboxes function from visualization.py: https://github.com/PRBonn/phenobench/blob/0edc128ef7f67c8c6577554c7d1a2e382e2ea81f/src/phenobench/visualization.py#L77-L107.
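
If you want a quick self-contained sanity check without the PhenoBench helper, the boxes from read_bboxes above can also be drawn with matplotlib (a sketch; the image and prediction paths are placeholders):

```python
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
from PIL import Image

# Placeholder paths -- adjust to your local data.
img = Image.open("images/phenoBench_00000.png")
bboxes = read_bboxes("plant_bboxes/phenoBench_00000.txt")

fig, ax = plt.subplots()
ax.imshow(img)
for box in bboxes:
  cx, cy = box["center"]
  w, h = box["width"], box["height"]
  # Rectangle expects the top-left corner, not the center.
  ax.add_patch(Rectangle((cx - w / 2, cy - h / 2), w, h,
                         linewidth=1, edgecolor="r", facecolor="none"))
plt.show()
```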