deepmodeling / abacus-develop

An electronic structure package based on either plane wave basis or numerical atomic orbitals.
http://abacus.ustc.edu.cn
GNU Lesser General Public License v3.0
173 stars 134 forks source link

Lack of Error Checking of base_device memory operators #4959

Closed Cstandardlib closed 1 month ago

Cstandardlib commented 3 months ago

Describe the Code Quality Issue

// namespace base_device::memory
// in module_base/module_device/memory_op.cpp
template <typename FPTYPE>
struct resize_memory_op<FPTYPE, base_device::DEVICE_CPU>
{
    void operator()(const base_device::DEVICE_CPU* dev, FPTYPE*& arr, const size_t size, const char* record_in)
    {
        if (arr != nullptr)
        {
            free(arr);
        }
        arr = (FPTYPE*)malloc(sizeof(FPTYPE) * size);
        std::string record_string;
        if (record_in != nullptr)
        {
            record_string = record_in;
        }
        else
        {
            record_string = "no_record";
        }

        if (record_string != "no_record")
        {
            ModuleBase::Memory::record(record_string, sizeof(FPTYPE) * size);
        }
    }
};
template <typename FPTYPE>
struct delete_memory_op<FPTYPE, base_device::DEVICE_CPU>
{
    void operator()(const base_device::DEVICE_CPU* dev, FPTYPE* arr)
    {
        free(arr);
    }
};

Description

The provided code snippet contains two template specializations for memory management on a CPU device.

Error Checking After Allocation

There is no error checking after the malloc call. It's good practice to check if the allocation was successful and handle the case where malloc returns nullptr.

Error Checking Before Deallocation

The delete_memory_op struct does not have any logic to handle the case where arr is nullptr. Maybe we can add a null check to avoid potential deallocation errors.

By the way, is it safer to use new[] and delete[] for arrays in C++?

WHUweiqingzhou commented 2 months ago

@Cstandardlib could you want to pull a PR to make some changes?

Cstandardlib commented 2 months ago

@WHUweiqingzhou Changes will be made later, along with kernel ops to make them more robust and clear. I find that some redundant MPI operations are included in GPU code. These issues will be discussed in depth and I expect to establish a good development standard on these Heterogeneous Operators.

caic99 commented 2 months ago

The delete_memory_op struct does not have any logic to handle the case where arr is nullptr. Maybe we can add a null check to avoid potential deallocation errors.

There's no need to do so