Closed mikeyEcology closed 4 years ago
Hi Mikey,
Thank you very much for the questions! I will try and answer them below:
The main file in the data download ("aide_query") is a text file with user annotations; one per line and with tokens separated by a semicolon (basically like a CSV file, but as said ";"-separated). The first line is the header and contains the field names for all lines to follow. In your example, these are in order:
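For anyone reading the file programmatically, a minimal Python sketch of parsing such a semicolon-separated file could look like the following. The sample text is made up for illustration: only the "image" and "label" field names and the two UUIDs come from this thread, and a real export contains more fields. `read_annotations` is a hypothetical helper name.

```python
import csv
import io

# Made-up stand-in for the contents of an "aide_query" file. Only the
# "image" and "label" fields are taken from this thread; a real export
# has additional fields in its header.
SAMPLE = (
    "image;label\n"
    "ed79e933-a83a-4c40-9269-9f75c6cf0b03;1ab5e9a8-c08c-11ea-bad2-0242ac160002\n"
)


def read_annotations(handle):
    """Return one dict per annotation line, keyed by the header's field names."""
    return list(csv.DictReader(handle, delimiter=";"))


rows = read_annotations(io.StringIO(SAMPLE))
for row in rows:
    print(row["image"], row["label"])
```

For a file on disk, the same helper should work with `open(path, newline="")` in place of the `io.StringIO` object.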
The segmentation masks are stored as TIFF files with indexed color: for example, if the mask contains pixels annotated as class with index number 4, those pixels will contain the actual value 4, and not the color assigned to the class. I will add a flag that allows users to decide whether they want indexed or real colors for simplicity.
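To illustrate what "indexed" means in practice, here is a small Python sketch that counts how many pixels carry each class index. It assumes the TIFF has already been decoded into rows of integers (for instance with an imaging library such as Pillow); the nested list below is a made-up stand-in for a real mask, and `count_class_pixels` is a hypothetical helper name.

```python
from collections import Counter

# Made-up stand-in for a decoded mask: one list of integer class
# indices per pixel row. A real mask would come from the exported TIFF.
mask = [
    [0, 0, 4, 4],
    [0, 4, 4, 4],
    [0, 0, 0, 4],
]


def count_class_pixels(mask_rows):
    """Count how many pixels hold each class index."""
    return Counter(value for row in mask_rows for value in row)


counts = count_class_pixels(mask)
print(counts[4])  # number of pixels annotated as the class with index 4
```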
The segmentation mask name is a good suggestion, thank you! I might add a text field that allows users to add a default prefix (or suffix) to the file name. Regarding the file format, I advise against using JPEG. The reason is that JPEG uses lossy compression (i.e., it reduces the file size, but alters the pixel values to do so). The result is artifacts on contrasting edges (transitions between label classes), such as blurriness or even interpolated values between label class colors, which has detrimental effects on the quality of the segmentation mask. For this reason, AIDE exports all masks as TIFF files. If you still wish to convert them to JPEGs, the easiest solution at the moment is to employ a third-party batch conversion tool.
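The effect of such interpolated values can be sketched in a few lines of Python. This is not actual JPEG compression: the `box_blur` helper below is a hypothetical stand-in that simply averages neighboring pixels, to mimic the kind of smoothing a lossy format can introduce at a class boundary.

```python
def box_blur(row):
    """Average each value with its two neighbours (edge values repeat), then round."""
    padded = [row[0]] + list(row) + [row[-1]]
    return [round(sum(padded[i:i + 3]) / 3) for i in range(len(row))]


# One row of a mask: background (index 0) next to a class with index 4.
labels = [0, 0, 0, 4, 4, 4]
blurred = box_blur(labels)

# Values that appear after blurring but were never annotated.
spurious = sorted(set(blurred) - set(labels))
print(blurred, spurious)
```

The blurred row contains the values 1 and 3 at the boundary, which would be read as two entirely different label classes rather than as "almost 0" or "almost 4".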
Thanks a lot once more for your comments and suggestions! The data download function is still a work in progress and will improve in functionality and documentation over time.
Thank you so much for your detailed explanations! I have a few followup questions regarding the aide_query file. Sorry if the answers are obvious, but maybe others will have the same questions. I'll go back through the points by number:
Thank you for all of the details. These categories make sense, but there are still a couple I don't understand. For "image", how does this link up with an image in my dataset? For my example, how would I find what filename is associated with this image "ed79e933-a83a-4c40-9269-9f75c6cf0b03"? Likewise, how do I find what is meant by the label "1ab5e9a8-c08c-11ea-bad2-0242ac160002"? How is this label related to an image class (e.g., animal species)?
This makes perfect sense. I was at first concerned because I've never used TIF for this type of stuff and my image viewing software didn't pick anything up in the image. But I was able to open the segmentation masks in Python and they work well. I think it makes sense to use numbers for the masks. One utility that might help is getting to choose which number is used for each class. This could help users who are combining with another dataset.
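Remapping class numbers can also be done after export. A minimal Python sketch, assuming the mask is already decoded into rows of integers; the mapping values and the `remap_mask` name are made up for illustration:

```python
def remap_mask(mask_rows, index_map):
    """Replace every class index using index_map.

    Raises KeyError for an index missing from the map, so that an
    unexpected class cannot slip through unchanged.
    """
    return [[index_map[value] for value in row] for row in mask_rows]


# Hypothetical mapping from exported indices to another dataset's indices.
index_map = {0: 0, 4: 12, 5: 7}

mask = [
    [0, 4],
    [4, 5],
]
remapped = remap_mask(mask, index_map)
print(remapped)
```

Failing loudly on unmapped indices is a deliberate choice here: silently keeping an old number could collide with a class number in the other dataset.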
Thank you. If it's easier I'm less concerned about specifying the name of the mask as I am about ensuring that I understand what image it is associated with.
Hello Mikey,
The latest version of AIDE now features the following improvements and new functionalities:
More comprehensive querying: in addition to the bare image and label class UUIDs, AIDE now also appends the label class names to each exported annotation or prediction. In the case of segmentation masks, it now exports a dedicated comma-separated file ("labelclasses.csv"), which contains the label class IDs ("id"), names ("name"), assigned colors ("color"), parent group ("labelclassgroup") and, importantly, the index value in the segmentation mask TIFF files ("labelclass_index"). So if your label class "elephant" has a "labelclass_index" value of 5, all pixels marked as "elephant" will contain the value 5.
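As a rough sketch of how "labelclasses.csv" could be used, the Python snippet below builds a lookup from mask index to class name. The column names are the ones listed above; the sample row values other than "elephant" and 5 are placeholders, and `index_to_name` is a hypothetical helper name.

```python
import csv
import io

# Placeholder contents for "labelclasses.csv". Column names follow the
# description above; the id, color and group values are invented.
SAMPLE = (
    "id,name,color,labelclassgroup,labelclass_index\n"
    "placeholder-id-1,elephant,#ff0000,placeholder-group,5\n"
    "placeholder-id-2,zebra,#00ff00,placeholder-group,6\n"
)


def index_to_name(handle):
    """Map each integer "labelclass_index" to its label class "name"."""
    reader = csv.DictReader(handle)
    return {int(row["labelclass_index"]): row["name"] for row in reader}


names = index_to_name(io.StringIO(SAMPLE))
print(names[5])
```

With such a lookup, every pixel value found in a mask can be translated back to a readable class name.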
Customizable formatting: it is now possible to exclude bulky query fields (the browser metadata) by unticking a checkbox in the "Data Download" page. For segmentation masks, AIDE now offers an option to save the TIFF files with the same name (and folder structure) as the original images instead of their UUIDs. Segmentation mask names can further be customized with user-specified prefix and suffix strings.
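When masks are exported under the original image names, matching a mask to its image becomes a matter of path manipulation. The sketch below is only an assumption about the naming pattern (prefix and suffix wrapped around the file name without its extension, ".tif" as the extension, folder kept as-is); the exact scheme AIDE uses is not spelled out here, so check it against an actual export. `expected_mask_path` is a hypothetical helper name.

```python
from pathlib import PurePosixPath


def expected_mask_path(image_path, prefix="", suffix="", ext=".tif"):
    """Guess the mask path for an image.

    ASSUMPTION: same folder as the image, file name built as
    prefix + original name without extension + suffix + ext.
    """
    image = PurePosixPath(image_path)
    return image.parent / f"{prefix}{image.stem}{suffix}{ext}"


# Made-up image path for illustration.
mask_path = expected_mask_path("site1/cam3/IMG_0001.JPG", prefix="mask_")
print(mask_path)
```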
In sum, you now get the plain label class name as a token for every annotation/prediction, and you can now export segmentation masks with the same (or similar) name as their associated original images.
Thank you so much for making these modifications! It will make it a lot easier for me to work with it.
This software is great and I'm excited to get using it! I have a few questions about extracting data; I want to ensure that I get this sorted out before having the data labeled.
aide_query_[date/time] (pasted below). Contents of aide_query file: