Converting annotation formats between training and evaluation metrics for a custom dataset

fourierer commented 10 months ago

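Hello, and thank you very much for open-sourcing this work! I have a small question. I have a batch of data from a specific scenario that I would like to use to finetune your model. My dataset is annotated in a four-point format (effectively an irregular 4-point polygon, but the starting point of each text instance is not fixed, and both clockwise and counter-clockwise directions occur), i.e. [x1,y1,x2,y2,x3,y3,x4,y4]. By sampling points uniformly I can generate the 16-point clockwise format used in datasets/ctw1500/test_poly.json, but I noticed that the final metrics are computed against the 14-point counter-clockwise format in datasets/evaluation/gt_ctw1500.zip. Is the gt_ctw1500.zip format mandatory, or is any polygon annotation format with a fixed direction acceptable? For example, would it work if, within one image, some text instances are annotated clockwise from the top-left corner and others counter-clockwise from the bottom-right corner?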

lerndeep commented 9 months ago

@fourierer Could you please tell me how you convert [x1,y1,x2,y2,x3,y3,x4,y4] into a 16-point polygon? Could you share your conversion code? I would also like to know how to prepare the data for training. Data example:

x1,y1,x2,y2,x3,y3,x4,y4,text
x1,y1,x2,y2,x3,y3,x4,y4,text
x1,y1,x2,y2,x3,y3,x4,y4,text

Each annotation text file contains bounding box coordinates with the corresponding text.
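For reference, here is a minimal sketch of reading one annotation line in the layout described above. The function name `parse_annotation_line` is hypothetical, and the sketch assumes that the eight coordinates come first and that everything after the eighth comma is the transcription (which may itself contain commas).

```python
def parse_annotation_line(line):
    """Split 'x1,y1,...,x4,y4,text' into four [x, y] points and the text."""
    # maxsplit=8 keeps any commas inside the transcription intact.
    parts = line.rstrip("\n").split(",", 8)
    if len(parts) < 9:
        raise ValueError(f"expected 8 coordinates and a text field: {line!r}")
    coords = [float(v) for v in parts[:8]]
    # Pair up consecutive values: (x1, y1), (x2, y2), ...
    polygon = [[coords[i], coords[i + 1]] for i in range(0, 8, 2)]
    return polygon, parts[8]


polygon, text = parse_annotation_line("10,20,110,20,110,60,10,60,Hello, world")
print(polygon)  # [[10.0, 20.0], [110.0, 20.0], [110.0, 60.0], [10.0, 60.0]]
print(text)     # Hello, world
```

The resulting 4-point polygon can then be passed to any of the conversion functions posted in this thread.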

oszn commented 9 months ago

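ABCNet handles this by splitting each long edge into thirds, i.e. inserting 2 points on each long edge, which gives 8 points. I generated some code with GPT that you can try: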

import numpy as np

def distance(p1, p2):
    """Return the Euclidean distance between two points."""
    return np.sqrt((p1[0] - p2[0])**2 + (p1[1] - p2[1])**2)

def order_rectangle_points(points):
    """Sort the four corners by angle around their centroid so they form a consistent loop."""
    center = np.mean(points, axis=0)
    sorted_points = sorted(points, key=lambda p: np.arctan2(p[1] - center[1], p[0] - center[0]))
    return np.array(sorted_points)

def find_long_edges_and_divide(points):
    """Split each long edge into thirds and return the 8 ordered points."""
    points = order_rectangle_points(points)
    distances = [distance(points[i], points[(i+1)%4]) for i in range(4)]
    # Edges 0 and 2 are opposite each other, as are edges 1 and 3.
    # Rotate the loop so that the longer opposite pair becomes edges 0 and 2.
    # Note: after rotating, the first point is no longer the first sorted corner.
    if distances[1] + distances[3] > distances[0] + distances[2]:
        points = np.roll(points, -1, axis=0)

    new_points = []
    for point_index in (0, 2):
        p1, p2 = points[point_index], points[(point_index+1)%4]
        division1 = [(2*p1[0] + p2[0]) / 3, (2*p1[1] + p2[1]) / 3]
        division2 = [(p1[0] + 2*p2[0]) / 3, (p1[1] + 2*p2[1]) / 3]
        new_points += [p1.tolist(), division1, division2, p2.tolist()]

    # The 8 points keep the same winding as the sorted corners.
    return new_points

# Example corner coordinates
points = np.array([
    [1, 1],  # corners in arbitrary order
    [1, 3],
    [4, 1],
    [4, 3]
])

# Find the long edges and insert 2 points at the one-third marks of each
ordered_points = find_long_edges_and_divide(points)

print("New points in order:")
for point in ordered_points:
    print(point)

Hardik1608 commented 9 months ago

@lerndeep Were you able to convert your data into 16-point polygons? If so, please share how. Thanks

fourierer commented 9 months ago

@lerndeep @Hardik1608 This is how I convert mine (it requires that each text box is annotated starting from some point and proceeding consistently clockwise or counter-clockwise): (1) first make the four-point annotation clockwise; (2) pick the most "top-left" point as the starting point; (3) then interpolate between points 1 and 2 and between points 3 and 4 to get the 16-point annotation. The functions are:

def is_clockwise(polygon):
    # Running sum of cross products
    cross_product_sum = 0
    # Sum the cross products over all pairs of adjacent edges
    for i in range(len(polygon)):
        p1 = polygon[i]
        p2 = polygon[(i + 1) % len(polygon)]
        p3 = polygon[(i + 2) % len(polygon)]
        # Cross product of the two vectors
        cross_product = (p2[0] - p1[0]) * (p3[1] - p1[1]) - (p2[1] - p1[1]) * (p3[0] - p1[0])
        cross_product_sum += cross_product
    # A positive sum means clockwise (in image coordinates, with the y axis pointing down)
    return cross_product_sum > 0

def make_clockwise(polygon):
    # If the polygon is counter-clockwise, reverse the vertex list (in place)
    if not is_clockwise(polygon):
        polygon.reverse()
    return polygon

def rearrange_polygon(polygon):
    # Find the index of the point with the smallest x+y; it becomes the new starting point
    start_index = min(range(len(polygon)), key=lambda i: polygon[i][0] + polygon[i][1])
    # Reorder the vertices from the starting point, wrapping around to the beginning of the list
    rearranged_polygon = polygon[start_index:] + polygon[:start_index]
    return rearranged_polygon

def sample_points(p1, p2, num_samples):
    # t values for linear interpolation (endpoints excluded)
    ts = [i / (num_samples + 1) for i in range(1, num_samples + 1)]
    # Compute and return the interpolated points, rounded to integers
    return [[round(p1[0] + t * (p2[0] - p1[0])), round(p1[1] + t * (p2[1] - p1[1]))] for t in ts]

def insert_sampled_points(polygon, num_samples):
    # Sample between points 1 and 2
    samples_between_1_and_2 = sample_points(polygon[0], polygon[1], num_samples)
    # Sample between points 3 and 4
    samples_between_3_and_4 = sample_points(polygon[2], polygon[3], num_samples)
    # Insert the sampled points into the original polygon
    new_polygon = (
        [polygon[0]] + samples_between_1_and_2 + [polygon[1], polygon[2]]
        + samples_between_3_and_4 + [polygon[3]]
    )
    return new_polygon

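You can use it directly once you adapt the input and output formats. One caveat: the edges between points 1 and 2 and between points 3 and 4 are not necessarily the two long edges. My scenario is almost entirely horizontal text detection, so this works fine for me, but if you have many vertical boxes it will need further adjustment.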
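One possible adjustment for vertical boxes, as a rough sketch rather than a tested recipe: before sampling, rotate the 4-point polygon so that the longer pair of opposite edges sits between points 1-2 and 3-4. The names `put_long_edges_first`, `sample_edge` and `to_16_points` are hypothetical. The sketch assumes the input is already a clockwise list of four [x, y] points, and it does not round the sampled coordinates.

```python
import math


def edge_length(p, q):
    """Euclidean length of the edge from p to q."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def put_long_edges_first(polygon):
    """Rotate a 4-point polygon so edges 0-1 and 2-3 are the longer opposite pair."""
    pair_a = edge_length(polygon[0], polygon[1]) + edge_length(polygon[2], polygon[3])
    pair_b = edge_length(polygon[1], polygon[2]) + edge_length(polygon[3], polygon[0])
    if pair_b > pair_a:
        # Shift the start by one vertex; the winding direction is unchanged.
        return polygon[1:] + polygon[:1]
    return polygon


def sample_edge(p, q, num_samples):
    """Return num_samples evenly spaced interior points between p and q."""
    n = num_samples + 1
    return [[p[0] + (q[0] - p[0]) * i / n, p[1] + (q[1] - p[1]) * i / n]
            for i in range(1, n)]


def to_16_points(polygon, num_samples=6):
    """4 corners + 6 samples on each long edge = 16 points."""
    p = put_long_edges_first(polygon)
    return ([p[0]] + sample_edge(p[0], p[1], num_samples) + [p[1], p[2]]
            + sample_edge(p[2], p[3], num_samples) + [p[3]])


# A tall box, clockwise from the top-left corner in image coordinates.
tall = [[0, 0], [10, 0], [10, 70], [0, 70]]
points = to_16_points(tall)
print(len(points))           # 16
print(points[0], points[7])  # [10, 0] [10, 70]
```

Note that the rotation moves the starting point away from the top-left corner for vertical boxes. Whether that starting point suits the model is an open question in this thread and should be checked against your own data.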

lerndeep commented 9 months ago

import numpy as np

def getEquidistantPoints(p1, p2, parts):
    # parts+1 evenly spaced points from p1 to p2, endpoints included
    return zip(np.linspace(p1[0], p2[0], parts+1), np.linspace(p1[1], p2[1], parts+1))

# boxes: the coordinates of one box (x1,y1,...,x4,y4), defined earlier
boxes_ori=boxes.copy()
ori_bbb=np.array(boxes).reshape(-1)
p_bboxx =[max(int(float(j)),1) for j in ori_bbb]  # integer coordinates, clamped to at least 1
ab=np.reshape(p_bboxx,(-1,2))
boxes_ori=ab.copy()
if len(ab)==4:
    box_x1=list(getEquidistantPoints(ab[0], ab[1], 5))  # 6 points on edge 1-2
    box_y1=list(getEquidistantPoints(ab[1], ab[2], 3))
    box_x2=list(getEquidistantPoints(ab[2], ab[3], 5))  # 6 points on edge 3-4
    box_y2=list(getEquidistantPoints(ab[3], ab[0], 3))
    # keep only the 2 interior points of the other two edges
    box_y1_filtered=[box_y1[1]]+[box_y1[2]]
    box_y2_filtered=[box_y2[1]]+[box_y2[2]]
    boxes=box_x1+box_y1_filtered+box_x2+box_y2_filtered  # 6+2+6+2 = 16 points
    boxes=np.array(boxes)

This is my solution

ymy-k / DPText-DETR

Converting annotation formats between training and evaluation metrics for a custom dataset #34