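When retraining, I get the error below. After debugging, I found that some values become NaN while the backbone extracts features. How should I handle this?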

junjiehe96 / FastInst

[CVPR2023] FastInst: A Simple Query-Based Model for Real-Time Instance Segmentation

MIT License

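When retraining, I get the error below. After debugging, I found that some values become NaN while the backbone extracts features. How should I handle this? #17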

Open fengchuibeixiang opened 10 months ago

fengchuibeixiang commented 10 months ago

junjiehe96 commented 10 months ago

I have occasionally run into the same problem. Empirically, replacing the NaN values in the cost matrix with a large number alleviates it, but does not completely solve it.

# Reference: https://github.com/SHI-Labs/OneFormer/blob/main/oneformer/modeling/matcher.py
import numpy as np
from scipy.optimize import linear_sum_assignment


def linear_sum_assignment_with_nan(cost_matrix):
    # Work on a float copy so the caller's matrix is not modified in place.
    cost_matrix = np.array(cost_matrix, dtype=float)
    nan_mask = np.isnan(cost_matrix)

    if cost_matrix.size > 0:
        if nan_mask.all():
            # No usable costs: fall back to an empty matrix, which yields no matches.
            print('Matrix contains all NaN values!')
            cost_matrix = np.empty(shape=(0, 0))
        elif nan_mask.any():
            # Make NaN entries expensive so the solver avoids them where it can.
            print('Matrix contains NaN values!')
            cost_matrix[nan_mask] = 100

    return linear_sum_assignment(cost_matrix)
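
To see what the replacement does, here is a minimal, dependency-free sketch. `replace_nan` and `brute_force_assignment` are hypothetical helpers written only for this illustration. The brute-force search is a toy stand-in for `scipy.optimize.linear_sum_assignment`, not its algorithm, and 100 is simply the value used in the function above.

```python
import itertools
import math

PENALTY = 100.0  # same replacement value as the function above; assumed, not tuned


def replace_nan(cost_matrix, penalty=PENALTY):
    """Return a copy of cost_matrix with every NaN replaced by penalty."""
    return [[penalty if math.isnan(c) else c for c in row] for row in cost_matrix]


def brute_force_assignment(cost_matrix):
    """Toy assignment solver for tiny square matrices.

    Tries every row-to-column permutation and keeps the cheapest one.
    """
    n = len(cost_matrix)
    best_cols, best_cost = None, math.inf
    for cols in itertools.permutations(range(n)):
        cost = sum(cost_matrix[r][c] for r, c in enumerate(cols))
        if cost < best_cost:
            best_cols, best_cost = cols, cost
    return best_cols, best_cost


nan = float("nan")

# Case 1: scattered NaNs. The penalty steers the match away from them.
scattered = [
    [0.2, nan, 0.9],
    [0.8, 0.1, nan],
    [0.7, 0.6, 0.3],
]
cols_a, cost_a = brute_force_assignment(replace_nan(scattered))

# Case 2: a whole row is NaN. That row still has to be matched to something,
# so the chosen assignment necessarily includes one penalized entry.
full_row = [
    [nan, nan],
    [0.1, 0.4],
]
cols_b, cost_b = brute_force_assignment(replace_nan(full_row))

print(cols_a, cols_b)
```

In the first case the NaN entries are avoided entirely. In the second, row 0 is still matched through a penalized entry, so the replacement keeps the matcher from crashing but does nothing about the NaNs produced upstream.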