Closed chrisrapson closed 1 year ago
Thank you @chrisrapson I will check it out.
@chrisrapson, how is this different from the segmentation support described here: https://github.com/pylabel-project/pylabel/issues/65
If it is something different from the segmentation support, could you provide a sample dataset so I can test it out?
It's definitely similar, but not quite the same. The COCO dataset is the canonical example. They explain the format for keypoints here: https://cocodataset.org/#format-data
Here's another dataset that has keypoint labels in the VOC format https://sites.google.com/view/animal-pose/
It looks like it would be challenging to include both segmentation and keypoint data in a YOLO-formatted file, because in a YOLO-formatted file data is interpreted based on its position within a list. Segmentations are an arbitrarily long list of pairs of floats. Keypoints are a list of triplets of floats. The number of keypoints should be the same for all images in a dataset, but won't be the same across datasets. It wouldn't be possible to know when the list of segmentations ended, and the list of keypoints began (or vice versa).
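To illustrate the ambiguity, here is a small sketch (with made-up coordinate values) showing that the same YOLO label line can be parsed either as a segmentation polygon or as a keypoint list, since the format carries no field names:

```python
# A YOLO label line is a class id followed by bare floats; meaning is positional.
# These hypothetical values parse equally well both ways:
line = "0 0.1 0.2 0.3 0.4 0.5 0.6"
values = [float(v) for v in line.split()[1:]]

# Read as a segmentation polygon: pairs of (x, y) points.
as_polygon = list(zip(values[0::2], values[1::2]))
print(as_polygon)    # [(0.1, 0.2), (0.3, 0.4), (0.5, 0.6)] -- 3 points

# Read as keypoints: triplets of (x, y, visibility).
as_keypoints = [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]
print(as_keypoints)  # [(0.1, 0.2, 0.3), (0.4, 0.5, 0.6)] -- 2 keypoints
```

Both interpretations consume the whole list, so a parser cannot tell where one annotation type would end and the other begin.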
I think it would be simplest to restrict users to converting only one of either keypoints or segmentations to YOLO. I can't think of a use case where somebody would train a network that needs both segmentation and keypoint data. That is, add a `keypoints` flag with functionality equivalent to your `segmentations` option, then enforce that at most one of `segmentations` or `keypoints` can be `True`.
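The proposed mutual exclusion could be sketched like this (function and parameter names here are illustrative, not the actual pylabel signature):

```python
def export_to_yolo(rows, segmentations=False, keypoints=False):
    """Hypothetical sketch: export rows to YOLO, allowing at most one
    of the two positional annotation formats."""
    if segmentations and keypoints:
        # A YOLO label line is positional, so the two lists could not
        # be told apart if both were written.
        raise ValueError(
            "At most one of segmentations or keypoints may be True"
        )
    if segmentations:
        mode = "segmentation"
    elif keypoints:
        mode = "keypoints"
    else:
        mode = "bbox"
    # ... write label files according to `mode` ...
    return mode

print(export_to_yolo([]))                 # bbox
print(export_to_yolo([], keypoints=True)) # keypoints
```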
Thank you. I had never used keypoints before but I am getting it. Could you help me with a few more things: 1) Can you recommend a COCO dataset that I can use to test it? Should I just get one of the segmentation ones here? https://cocodataset.org/#download 2) Could you add a docstring to the export function that explains this new functionality?
The keypoints task and the segmentation challenges use the same images. The annotations are saved in `person_keypoints_train2017.json` and `person_keypoints_val2017.json` instead of `instances_train2017.json`.
One possible place to download them is from huggingface: https://huggingface.co/datasets/merve/coco/tree/main/annotations
Good idea about the docstring. I'll add that and a boolean `keypoints` flag, and then update the PR.
See the two extra commits. The first adds a docstring and a boolean flag. For the second I added the capability to export keypoints to COCO.
=================================== FAILURES ===================================
_______________________________ test_export_coco _______________________________
coco_dataset = <pylabel.dataset.Dataset object at 0x7f5c2191fc10>
def test_export_coco(coco_dataset):
path_to_coco_export = coco_dataset.export.ExportToCoco()
tests/test_main.py:174:
self = <pylabel.exporter.Export object at 0x7f5c2191f310>, output_path = None cat_id_index = None
def ExportToCoco(self, output_path=None, cat_id_index=None):
"""
Writes COCO annotation files to disk (in JSON format) and returns the path to files.
Args:
output_path (str):
This is where the annotation files will be written. If not specified, then the path will be derived from the path_to_annotations and
name properties of the dataset object.
cat_id_index (int):
Reindex the cat_id values so that they start from an int (usually 0 or 1) and
then increment the cat_ids to index + number of categories continuously.
It's useful if the cat_ids are not continuous in the original dataset.
Some models like Yolo require starting from 0 and others like Detectron require starting from 1.
Returns:
A list with 1 or more paths (strings) to annotations files.
Example:
>>> dataset.exporter.ExportToCoco()
['data/labels/dataset.json']
"""
# Copy the dataframe in the dataset so the original dataset doesn't change when you apply the export transformations
df = self.dataset.df.copy(deep=True)
# Replace empty string values with NaN
df = df.replace(r"^\s*$", np.nan, regex=True)
pd.to_numeric(df["cat_id"])
df["ann_iscrowd"] = df["ann_iscrowd"].fillna(0)
if cat_id_index != None:
assert isinstance(cat_id_index, int), "cat_id_index must be an int."
_ReindexCatIds(df, cat_id_index)
df_outputI = []
df_outputA = []
df_outputC = []
list_i = []
list_c = []
json_list = []
pbar = tqdm(desc="Exporting to COCO file...", total=df.shape[0])
for i in range(0, df.shape[0]):
images = [
{
"id": df["img_id"][i],
"folder": df["img_folder"][i],
"file_name": df["img_filename"][i],
"path": df["img_path"][i],
"width": df["img_width"][i],
"height": df["img_height"][i],
"depth": df["img_depth"][i],
}
]
# Skip this if cat_id is na
if not pd.isna(df["cat_id"][i]):
annotations = [
{
"image_id": df["img_id"][i],
"id": df.index[i],
"segmented": df["ann_segmented"][i],
"bbox": [
df["ann_bbox_xmin"][i],
df["ann_bbox_ymin"][i],
df["ann_bbox_width"][i],
df["ann_bbox_height"][i],
],
"area": df["ann_area"][i],
"segmentation": df["ann_segmentation"][i],
"iscrowd": df["ann_iscrowd"][i],
"pose": df["ann_pose"][i],
"truncated": df["ann_truncated"][i],
"category_id": int(df["cat_id"][i]),
"difficult": df["ann_difficult"][i],
}
]
# include keypoints, if available
if "ann_keypoints" in df.keys():
n_keypoints = int(len(df["ann_keypoints"][i]) / 3) # 3 numbers per keypoint: x,y,visibility
E TypeError: object of type 'numpy.float64' has no len()
pylabel/exporter.py:821: TypeError
----------------------------- Captured stderr call -----------------------------
Exporting to COCO file...: 0%| | 0/4888 [00:00<?, ?it/s]
=============================== warnings summary ===============================
../../../../../opt/hostedtoolcache/Python/3.9.17/x64/lib/python3.9/site-packages/jupyter_bbox_widget/bbox.py:48
/opt/hostedtoolcache/Python/3.9.17/x64/lib/python3.9/site-packages/jupyter_bbox_widget/bbox.py:48: DeprecationWarning: Traits should be given as instances, not types (for example, Int(), not Int). Passing types is deprecated in traitlets 4.1.
classes = List(Unicode).tag(sync=True)
../../../../../opt/hostedtoolcache/Python/3.9.17/x64/lib/python3.9/site-packages/jupyter_bbox_widget/bbox.py:50
/opt/hostedtoolcache/Python/3.9.17/x64/lib/python3.9/site-packages/jupyter_bbox_widget/bbox.py:50: DeprecationWarning: Traits should be given as instances, not types (for example, Int(), not Int). Passing types is deprecated in traitlets 4.1.
colors = List(Unicode, [
I think this is the relevant error message:
DeprecationWarning: Traits should be given as instances, not types (for example, Int(), not Int). Passing types is deprecated in traitlets 4.1. colors = List(Unicode, [
You can find the instructions to run the tests manually here https://github.com/pylabel-project/pylabel/tree/dev/tests
@chrisrapson I have cherry-picked your commits for the YOLO output and released it in the latest package, v52. Thank you!
For the COCO export, the issue is in this part of the code:
~/Code/scratch/pylabel/pylabel/exporter.py in ExportToCoco(self, output_path, cat_id_index)
827 if "ann_keypoints" in df.keys():
828 n_keypoints = int(
--> 829 len(df["ann_keypoints"][i]) / 3
830 ) # 3 numbers per keypoint: x,y,visibility
831 annotations[0]["num_keypoints"] = n_keypoints
TypeError: object of type 'float' has no len()
Is the issue the `[i]` in `len(df["ann_keypoints"][i])`?
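A minimal reproduction (with assumed values in the `ann_keypoints` column) shows where the `TypeError` comes from: an empty cell becomes `np.nan`, which is a plain float, so `len()` fails on it:

```python
import numpy as np
import pandas as pd

# One row has a keypoint list; the other has "" (no keypoint label).
df = pd.DataFrame({"ann_keypoints": [[10, 20, 2, 30, 40, 2], ""]})

# The export code replaces empty strings with NaN before iterating.
df = df.replace(r"^\s*$", np.nan, regex=True)

cell = df["ann_keypoints"][1]
print(type(cell))      # <class 'float'> -- np.nan is a float
# len(cell)            # would raise: TypeError: object of type 'float' has no len()
```

So the problem is not the `[i]` indexing itself, but calling `len()` on a NaN cell for rows without keypoints.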
I think I've fixed it now. I only tested it on my dataset, which had keypoint labels for all images. I misunderstood the logic: I assumed a dataset (or image) with no keypoint labels wouldn't have `"ann_keypoints"` in its list of keys, but of course it has that column filled with `""` values, which are converted to `np.nan`.
Once I found the test and ran it, it wasn't too hard to implement the if statement properly. I've updated the PR.
There's still no automatic test that really verifies the new feature, but that would require adding a new dataset which had keypoint labels.
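The corrected guard could look roughly like this (a sketch with assumed helper and column semantics, not the exact code in the PR): only rows whose cell actually holds a list get keypoint fields, while NaN or empty cells are treated as "no keypoint annotation".

```python
import numpy as np

def keypoint_fields(cell):
    """Hypothetical helper: return the COCO keypoint fields for one
    annotation row, or an empty dict if the row has no keypoints."""
    # NaN cells are floats and "" cells are strings; neither is a list,
    # so both fall through to the empty dict.
    if isinstance(cell, (list, tuple)) and len(cell) > 0:
        n = len(cell) // 3  # 3 numbers per keypoint: x, y, visibility
        return {"num_keypoints": n, "keypoints": list(cell)}
    return {}

print(keypoint_fields([10, 20, 2, 30, 40, 2]))  # 2 keypoints
print(keypoint_fields(np.nan))                  # {}
```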
Thank you @chrisrapson. I merged it and published it in the latest release, v53.
It would be awesome to have a sample notebook demoing the functionality for others, to add to the sample library at https://github.com/pylabel-project/samples.
Is that something you would be able to do (someday)?
In case you're interested in converting keypoints as well as bounding boxes, I followed the format from here: https://github.com/WongKinYiu/yolov7/issues/1267
I did some manual tests on images with: