Cambricon / mlu-ops

Efficient operation implementation based on the Cambricon Machine Learning Unit (MLU).
MIT License

[Feature](mluOpLgamma) add new operator lgamma #1012

Closed. Frankd35 closed this pull request 1 month ago.

Frankd35 commented 6 months ago

Thanks for your contribution and we appreciate it a lot. :rocket::rocket:

1. Motivation

Add the new operator `lgamma`.

2. Modification

Add the implementation of `lgamma`.
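
For context, `lgamma(x)` computes the natural logarithm of the absolute value of the gamma function, ln|Γ(x)|. A minimal sketch of the reference semantics using `torch.lgamma` (illustration only, not part of this PR's code):

```python
import torch

# lgamma(x) = ln|Gamma(x)|, defined for negative non-integers as well.
x = torch.tensor([0.5, 1.0, 2.0, 3.5, -0.5])
print(torch.lgamma(x))
# lgamma(1.0) = lgamma(2.0) = 0, since Gamma(1) = Gamma(2) = 1;
# lgamma(0.5) = ln(sqrt(pi)) ~= 0.5724.
```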

3. Test Report

Not yet.

3.1 Modification Details

3.1.1 Accuracy Acceptance Standard

For static threshold standard details, see: MLU-OPS™ Accuracy Acceptance Standard.
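
For readers without that document at hand: the thresholds reported below are of the diff1/diff2 kind. A minimal sketch, assuming the usual definitions (diff1 as mean relative error, diff2 as root-mean-square relative error; the authoritative formulas are in the linked standard):

```python
import numpy as np

def diff1(result, baseline):
    # Assumed definition: mean relative error.
    return np.sum(np.abs(result - baseline)) / np.sum(np.abs(baseline))

def diff2(result, baseline):
    # Assumed definition: root-mean-square relative error.
    return np.sqrt(np.sum((result - baseline) ** 2) / np.sum(baseline ** 2))
```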

3.1.2 Operator Scheme checklist

3.2 Accuracy Test

3.2.1 Accuracy Test

If you have checked the following items, please tick the relevant box.

```
[       OK ] lgamma/TestSuite.mluOp/70 (62 ms)
[----------] 71 tests from lgamma/TestSuite (3641 ms total)

[----------] Global test environment tear-down
[ SUMMARY  ] Total 71 cases of 1 op(s).
ALL PASSED.
[==========] 71 test cases from 1 test suite ran. (8300 ms total)
[  PASSED  ] 71 test cases.
```

3.2.2 Parameter Check

Test Point-1: When a new operator is submitted, the test points are given and the test results are stated. Acceptance Standard: Normal error.

Please fill in your test results (error message) here, ...

Test Point-2: Whether illegal parameters are passed. Acceptance Standard: Normal error.

Test results...

3.3 Performance Test

See MLU-OPS™ Performance Acceptance Standard for details.

Platform: MLU370

3.4 Summary Analysis

The v1.0 lgamma implementation is a SIMD operator and lacks stride support.

Please give a brief overview here, if you want to note and summarize the content.
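
To make the stride limitation concrete: a kernel without stride support treats its input as one dense, contiguous buffer, so strided (non-contiguous) views have to be densified before the operator is called. A hedged Python sketch of the concept (the actual kernel runs on the MLU, not in Python):

```python
import torch

x = torch.randn(4, 8)
view = x[:, ::2]              # strided, non-contiguous view
print(view.is_contiguous())   # False

# Without stride support the kernel only understands flat dense memory,
# so the caller must materialize the view first:
dense = view.contiguous()
print(dense.is_contiguous())  # True
```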

Frankd35 commented 1 month ago

compute.py

```python
import torch
import numpy as np
from nonmlu_ops.base import *

@registerTensorList("lgamma")
class LgammaTensorList(TensorList):
    pass

@registerOp("lgamma")
class LgammaOp(OpTest):
    def __init__(self, tensorlist, params):
        super().__init__(tensorlist, params)

    def compute(self):
        # Fetch the input/output tensors and the input data type.
        input_tensor = self.tensor_list_.getInputTensor(0)
        output_tensor = self.tensor_list_.getOutputTensor(0)
        datatype = input_tensor.getDataType().getNumpyStr()

        if datatype == 'float16':
            torch_input = torch.tensor(input_tensor.getDataNode().getData()).half().cuda()
        else:
            torch_input = torch.tensor(input_tensor.getDataNode().getData()).float().cuda()

        # Compute the baseline with torch.lgamma on the GPU.
        lgamma_result = torch.lgamma(torch_input)

        input_has_inf = torch.isinf(torch_input).any().item()
        input_has_nan = torch.isnan(torch_input).any().item()
        result_has_inf = torch.isinf(lgamma_result).any().item()
        result_has_nan = torch.isnan(lgamma_result).any().item()

        # Move the result to the CPU and convert it to a NumPy array.
        lgamma_result = lgamma_result.cpu().numpy()

        # Set the output tensor's shape and data.
        output_tensor.setShape(lgamma_result.shape)
        output_tensor.setData(lgamma_result)

        # Dynamic threshold: recompute the baseline in fp64 and derive
        # per-case diff1/diff2 thresholds from the fp32-vs-fp64 gap.
        if self.params_.get("if_dynamic_threshold", False):
            base_node = DataNode("double")
            torch_input_fp64 = torch_input.double()
            lgamma_result_fp64 = torch.lgamma(torch_input_fp64)
            base_node.setData(lgamma_result_fp64.cpu().numpy())
            eva = diff_utils.Evaluator(base_node, output_tensor.getDataNode())
            output_tensor.setData(lgamma_result_fp64.cpu().numpy())
            if input_has_inf or input_has_nan or result_has_inf or result_has_nan:
                # inf/nan would poison the relative diffs; fall back to a
                # fixed static threshold instead.
                output_tensor.setDiff(0.003, 0.003)
            else:
                output_tensor.setDiff(eva.computeDiff1(), eva.computeDiff2())

@registerProtoWriter("lgamma")
class LgammaProtoWriter(MluOpProtoWriter):
    pass
```
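
Design note on the generator above: with `if_dynamic_threshold` set, the baseline is recomputed in fp64 and the fp32-vs-fp64 gap drives the per-case diff1/diff2 thresholds; when the input or result contains inf/nan, a fixed 0.003 threshold is used instead, since relative diffs are not meaningful there.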
Frankd35 commented 1 month ago

lgamma_float.json lgamma_float_infnan.json lgamma_float_stride.json lgamma_half.json lgamma_half_infnan.json lgamma_half_stride.json

Frankd35 commented 1 month ago

Test points: the JSON cases above cover different data types (float, half), tensors of different dimensionality, and different input ranges; they also include in-place support tests, inf/nan input tests, and accuracy tests in extreme cases (input ranges close to 0). All passed.

Fool-proofing tests:
- empty input tensor -- passed
- input and output tensor shapes differ -- passed
- input and output tensor data types differ -- passed
- invalid input/output tensor -- passed
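
For illustration, the fool-proofing checks above roughly correspond to validation of the following shape (a hypothetical Python sketch; the real checks live in the operator's parameter checking, and `check_lgamma_params` is an invented name):

```python
import numpy as np

def check_lgamma_params(x: np.ndarray, y: np.ndarray):
    """Hypothetical mirror of the fool-proofing test points above."""
    if x.shape != y.shape:
        raise ValueError("input and output tensor shapes differ")
    if x.dtype != y.dtype:
        raise ValueError("input and output tensor data types differ")
    if x.dtype not in (np.float16, np.float32):
        raise ValueError("unsupported data type, expected float or half")
    # How an empty tensor is handled (error vs. no-op) is up to the
    # operator's spec; the test point only asserts it is handled sanely.
    if x.size == 0:
        return
```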

DanieeelLiu commented 6 days ago

lgammacase.zip: cases that need debugging.