PaddlePaddle / Paddle

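PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of 『飞桨』 PaddlePaddle: high-performance single-machine and distributed training and cross-platform deployment for deep learning & machine learning)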
http://www.paddlepaddle.org/
Apache License 2.0

Segmentation fault and Aborted in `paddle.index_fill_`. #65044

Open Zoeeeeey opened 5 months ago

Zoeeeeey commented 5 months ago

Describe the Bug

Hello, paddle.index_fill_ appears to have the following problem:

When the index passed to paddle.index_fill_ is a 2-D or 3-D Tensor, the process terminates with Aborted (core dumped) or Segmentation fault (core dumped).

Errors such as munmap_chunk(): invalid pointer may also appear.

TEST1:

import paddle

x=paddle.rand(shape=(1, 2), dtype=paddle.float64)
index=paddle.randint(low=0, high=100, shape=(2, 4, 5), dtype=paddle.int32)
axis = 0
value=0.38512939509930566

paddle.index_fill_(x=x,index=index,axis=axis,value=value)

Error output:

WARNING: OMP_NUM_THREADS set to 12, not 1. The computation speed will not be optimized if you use data parallel. It will fail if this PaddlePaddle binary is compiled with OpenBlas since OpenBlas does not support multi-threads.
PLEASE USE OMP_NUM_THREADS WISELY.
munmap_chunk(): invalid pointer
Segmentation fault (core dumped)

TEST2:

import paddle

x=paddle.rand(shape=(1, 2), dtype=paddle.float64)
index=paddle.randint(low=0, high=100, shape=(2, 4, 5), dtype=paddle.int32)
axis=1
value=0.38512939509930566

paddle.index_fill_(x=x,index=index,axis=axis,value=value)

Error output:

WARNING: OMP_NUM_THREADS set to 12, not 1. The computation speed will not be optimized if you use data parallel. It will fail if this PaddlePaddle binary is compiled with OpenBlas since OpenBlas does not support multi-threads.
PLEASE USE OMP_NUM_THREADS WISELY.
free(): invalid next size (fast)

--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0   phi::DenseTensor::~DenseTensor()
1   std::_Sp_counted_deleter<phi::Allocation*, std::function<void (phi::Allocation*)>, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
2   paddle::memory::allocation::StatAllocator::FreeImpl(phi::Allocation*)
3   paddle::memory::allocation::CPUAllocator::FreeImpl(phi::Allocation*)

----------------------
Error Message Summary:
----------------------
FatalError: `Process abort signal` is detected by the operating system.
  [TimeInfo: *** Aborted at 1715252434 (unix time) try "date -d @1715252434" if you are using GNU date ***]
  [SignalInfo: *** SIGABRT (@0x1779a) received by PID 96154 (TID 0x7f7efa9eb280) from PID 96154 ***]

Aborted (core dumped)

TEST3:

import paddle
input_tensor = paddle.to_tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype='int64')
index=paddle.randint(low=0, high=100, shape=(2, 4, 5), dtype=paddle.int32)
value = 0.38512939509930566
res = paddle.index_fill_(input_tensor, index, 1, value)

print(input_tensor)
print(res)

When TEST3 is run repeatedly, the error output differs from run to run:

WARNING: OMP_NUM_THREADS set to 12, not 1. The computation speed will not be optimized if you use data parallel. It will fail if this PaddlePaddle binary is compiled with OpenBlas since OpenBlas does not support multi-threads.
PLEASE USE OMP_NUM_THREADS WISELY.
terminate called after throwing an instance of 'std::bad_function_call'
  what():  bad_function_call

--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0   paddle::pybind::eager_api_index_put_(_object*, _object*, _object*)
1   index_put__ad_func(paddle::Tensor&, std::vector<paddle::Tensor, std::allocator<paddle::Tensor> > const&, paddle::Tensor const&, bool)
2   paddle::experimental::index_put_(paddle::Tensor&, std::vector<paddle::Tensor, std::allocator<paddle::Tensor> > const&, paddle::Tensor const&, bool)
3   void phi::IndexPutKernel<long, phi::CPUContext>(phi::CPUContext const&, phi::DenseTensor const&, std::vector<phi::DenseTensor const*, std::allocator<phi::DenseTensor const*> > const&, phi::DenseTensor const&, bool, phi::DenseTensor*)
4   std::vector<phi::DenseTensor, std::allocator<phi::DenseTensor> >::~vector()
5   phi::DenseTensor::~DenseTensor()

----------------------
Error Message Summary:
----------------------
FatalError: `Process abort signal` is detected by the operating system.
  [TimeInfo: *** Aborted at 1715252958 (unix time) try "date -d @1715252958" if you are using GNU date ***]
  [SignalInfo: *** SIGABRT (@0x1852c) received by PID 99628 (TID 0x7fa7bab36280) from PID 99628 ***]

Aborted (core dumped)
WARNING: OMP_NUM_THREADS set to 12, not 1. The computation speed will not be optimized if you use data parallel. It will fail if this PaddlePaddle binary is compiled with OpenBlas since OpenBlas does not support multi-threads.
PLEASE USE OMP_NUM_THREADS WISELY.
Tensor(shape=[3, 3], dtype=int64, place=Place(cpu), stop_gradient=True,
       [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
Tensor(shape=[3, 3], dtype=int64, place=Place(cpu), stop_gradient=True,
       [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
No stack trace in paddle, may be caused by external reasons.

----------------------
Error Message Summary:
----------------------
FatalError: `Segmentation fault` is detected by the operating system.
  [TimeInfo: *** Aborted at 1715252963 (unix time) try "date -d @1715252963" if you are using GNU date ***]
  [SignalInfo: *** SIGSEGV (@0x8) received by PID 99692 (TID 0x7f550e7ea280) from PID 8 ***]

Segmentation fault (core dumped)
WARNING: OMP_NUM_THREADS set to 12, not 1. The computation speed will not be optimized if you use data parallel. It will fail if this PaddlePaddle binary is compiled with OpenBlas since OpenBlas does not support multi-threads.
PLEASE USE OMP_NUM_THREADS WISELY.
free(): invalid pointer

--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0   paddle::pybind::eager_api_index_put_(_object*, _object*, _object*)
1   index_put__ad_func(paddle::Tensor&, std::vector<paddle::Tensor, std::allocator<paddle::Tensor> > const&, paddle::Tensor const&, bool)
2   paddle::experimental::index_put_(paddle::Tensor&, std::vector<paddle::Tensor, std::allocator<paddle::Tensor> > const&, paddle::Tensor const&, bool)
3   void phi::IndexPutKernel<long, phi::CPUContext>(phi::CPUContext const&, phi::DenseTensor const&, std::vector<phi::DenseTensor const*, std::allocator<phi::DenseTensor const*> > const&, phi::DenseTensor const&, bool, phi::DenseTensor*)

----------------------
Error Message Summary:
----------------------
FatalError: `Process abort signal` is detected by the operating system.
  [TimeInfo: *** Aborted at 1715252922 (unix time) try "date -d @1715252922" if you are using GNU date ***]
  [SignalInfo: *** SIGABRT (@0x18430) received by PID 99376 (TID 0x7f01b196e280) from PID 99376 ***]

Aborted (core dumped)
WARNING: OMP_NUM_THREADS set to 12, not 1. The computation speed will not be optimized if you use data parallel. It will fail if this PaddlePaddle binary is compiled with OpenBlas since OpenBlas does not support multi-threads.
PLEASE USE OMP_NUM_THREADS WISELY.
free(): invalid pointer
python: malloc.c:4036: _int_malloc: Assertion `(unsigned long) (size) >= (unsigned long) (nb)' failed.
Aborted (core dumped)
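
Because the failure mode varies between runs, it can help to run a reproducer several times in fresh interpreter processes and tally how each run ended. The sketch below uses only the Python standard library. `tally_exit_statuses` is a hypothetical helper name, and reading a negative return code as a terminating signal applies to POSIX systems only.

```python
import collections
import signal
import subprocess
import sys


def tally_exit_statuses(script_source, runs=5):
    """Run script_source in fresh interpreters and count how each run ended."""
    counts = collections.Counter()
    for _ in range(runs):
        proc = subprocess.run(
            [sys.executable, "-c", script_source],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        code = proc.returncode
        if code < 0:
            # On POSIX, a negative return code means the child was killed
            # by the signal with that number (for example SIGABRT or SIGSEGV).
            try:
                label = signal.Signals(-code).name
            except ValueError:
                label = f"signal {-code}"
        else:
            label = f"exit {code}"
        counts[label] += 1
    return counts


# Harmless demonstration script; a real reproducer would be passed the same way.
print(tally_exit_statuses("raise SystemExit(3)", runs=2))
```

Passing the source of TEST3 as `script_source` would give a per-signal count across runs, which is easier to compare than the interleaved console output.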

Version

paddlepaddle - 2.6.1

Additional Supplementary Information

This issue was reported by email on May 9 and again on May 21, but there appears to have been no reply so far.

Birdylx commented 5 months ago

@Zoeeeeey Please read the API documentation carefully: index must be 1-D. See https://github.com/PaddlePaddle/Paddle/blob/131999233ef997fc8d3f24b27830925b78cf17aa/python/paddle/tensor/manipulation.py#L5989 If you need a more flexible way to fill, you can try paddle.put_along_axis.
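
The 1-D requirement can be checked before the call, so that a malformed index produces a Python exception rather than a crash. The following is a minimal sketch and not part of Paddle: `check_index_fill_args` is a hypothetical helper, and its rule that every index value must lie in [0, size of the chosen axis) is a deliberately conservative assumption. The thread does not establish whether the crashes come from the rank of the index, from its out-of-range values, or from both.

```python
def check_index_fill_args(x_shape, index_shape, index_values, axis):
    """Hypothetical pre-check for paddle.index_fill_ arguments (not a Paddle API).

    Raises ValueError when the index is not 1-D or when an index value falls
    outside the chosen axis of x.
    """
    rank = len(x_shape)
    if not -rank <= axis < rank:
        raise ValueError(f"axis {axis} is out of range for a rank-{rank} tensor")
    if len(index_shape) != 1:
        raise ValueError(f"index must be 1-D, got shape {tuple(index_shape)}")
    size = x_shape[axis]  # Python's negative indexing handles a negative axis
    # Assumption: only non-negative positions are accepted (conservative).
    bad = [i for i in index_values if not 0 <= i < size]
    if bad:
        raise ValueError(f"index values {bad[:5]} are outside [0, {size})")


# The shapes from TEST1: x is (1, 2), index is (2, 4, 5), axis is 0.
try:
    check_index_fill_args((1, 2), (2, 4, 5), [], axis=0)
except ValueError as err:
    print(err)  # reports the non-1-D index

# A 1-D index whose values fit the axis passes silently.
check_index_fill_args((3, 3), (2,), [0, 2], axis=1)
```

With Paddle tensors, the arguments could be taken from `x.shape`, `index.shape`, and `index.numpy().tolist()` before calling `paddle.index_fill_`.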

luotao1 commented 3 months ago

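> This issue was reported by email on May 9 and again on May 21, but there appears to have been no reply so far.

Which email address did you send it to?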

Zoeeeeey commented 2 months ago

@luotao1 Hello! The address I sent it to is: paddle-security@baidu.com

luotao1 commented 2 months ago

Hello, could you please send it again?

Zoeeeeey commented 2 months ago

> Hello, could you please send it again?

Hello, it has now been sent.