pgcentralfoundation / pgrx

Build Postgres Extensions with Rust!

Possible memory leak #1330

Open ccleve opened 11 months ago

ccleve commented 11 months ago

I'm developing an index access method. In the build_callback invoked from ambuild, I call a pg_extern function to process a string column from the table being indexed.

If I hard-code the function and call it directly, there is no problem. If I store a reference to the function and call it dynamically, I get a leak. The function is a support function defined in an op class.

Here is the simplified code:

let func: &mut FmgrInfo = index_getprocinfo(indexrel, attr_num, MY_SUPPORT_PROC_NUM);
let result: Datum = FunctionCall1Coll(func, InvalidOid, val);
let tokens: Tokens = FromDatum::from_datum(result, false).unwrap();

This works, but there's a memory leak that grows to gigabytes and then goes away when indexing is done.

Large numbers of 8k blocks are getting allocated and not released. I used Instruments to track the allocations. Here's the call stack for one of the allocations:

AllocSetAlloc   
palloc  
initStringInfo  
makeStringInfo  
pgrx_pg_sys::include::pg16::makeStringInfo::_$u7b$$u7b$closure$u7d$$u7d$::ha67b83a70b8da36b 
pgrx_pg_sys::include::pg16::makeStringInfo::h52a9eb4e08c9e507   
pgrx::stringinfo::StringInfo::new::h14b298594cbdae1b    
pgrx::datum::varlena::cbor_encode::had528f965d5a8900    
pgrx::datum::varlena::_$LT$impl$u20$pgrx..datum..into..IntoDatum$u20$for$u20$T$GT$::into_datum::h8b9acf0b0f90092b   
relevantdb::pipeline::standard_pipeline::std_index_pipe_wrapper::std_index_pipe_wrapper_inner::h0b84d5a6b1c79aed    
relevantdb::pipeline::standard_pipeline::std_index_pipe_wrapper::_$u7b$$u7b$closure$u7d$$u7d$::hfe83f2098858fcdc    
std::panicking::try::do_call::hd4c8272fb472bd5d 
__rust_try  
std::panicking::try::he4259e6d44ebdbc5  
std::panic::catch_unwind::h29981eda532d524a 
pgrx_pg_sys::submodules::panic::run_guarded::h06df0a8400822769  
pgrx_pg_sys::submodules::panic::pgrx_extern_c_guard::h0e55d85a558e12fb  
std_index_pipe_wrapper  
FunctionCall1Coll   
pgrx_pg_sys::include::pg16::FunctionCall1Coll::_$u7b$$u7b$closure$u7d$$u7d$::hfd2b8f072ef60dde  
pgrx_pg_sys::include::pg16::FunctionCall1Coll::h132fdeac39f4a9fb    
relevantdb::access::build::build_callback_internal::h70bc350fd6c65d8e   
relevantdb::access::build::build_callback::build_callback_inner::h752fe11472b1aaa9  
relevantdb::access::build::build_callback::_$u7b$$u7b$closure$u7d$$u7d$::heed3800ce7d87215  
std::panicking::try::do_call::h235dc22f0d3bf6c1 
__rust_try  
std::panicking::try::h700609f12d2df316  
std::panic::catch_unwind::h416c407bf0546f94 
pgrx_pg_sys::submodules::panic::run_guarded::hee92e03a4a348986  
pgrx_pg_sys::submodules::panic::pgrx_extern_c_guard::h117fdc4c7f044bea  
relevantdb::access::build::build_callback::hcdff0de70bedcd5e    
heapam_index_build_range_scan   
relevantdb::access::build::ambuild::ambuild_inner::h1e08ba58ee209300    
relevantdb::access::build::ambuild::_$u7b$$u7b$closure$u7d$$u7d$::hf4e021fa8557db7a 
std::panicking::try::do_call::hf3cae0311d4b0801 
__rust_try  
std::panicking::try::h1485f64214f2a848  
std::panic::catch_unwind::h55b1bc33f0760e59 
pgrx_pg_sys::submodules::panic::run_guarded::hb6b6d34c504f148a  
pgrx_pg_sys::submodules::panic::pgrx_extern_c_guard::h1494dc59682cc685  
relevantdb::access::build::ambuild::hffa4ebd229238b3b   
index_build 
index_create    
DefineIndex 
ProcessUtilitySlow  
standard_ProcessUtility 
PortalRunUtility    
PortalRunMulti  
PortalRun   
exec_simple_query   
PostgresMain    
BackendRun  
BackendStartup  
PostmasterMain  
main    
start   

My support function is std_index_pipe(input: &str):

#[pg_extern(immutable, strict, parallel_safe)]
pub fn std_index_pipe(input: &str) -> Tokens {
  // make some tokens here
}

Just before I call the function I switch to a custom memory context, call the func, switch back, and reset() the context. I get the leak whether I do that or not.
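
As a rough illustration of the switch / call / switch back / reset() pattern described above, here is a toy model in plain Rust. It is not the pgrx or Postgres API: `ToyContext`, `peak_bytes`, and the one-8k-chunk-per-row workload are assumptions made up for this sketch. It only shows what the pattern is expected to achieve, namely that the memory held stays bounded when the context is reset after every row.

```rust
/// Toy stand-in for a Postgres memory context (hypothetical type, not pgrx):
/// it owns every chunk allocated in it and frees them all on reset().
struct ToyContext {
    chunks: Vec<Vec<u8>>,
}

impl ToyContext {
    fn new() -> Self {
        ToyContext { chunks: Vec::new() }
    }

    /// Rough analogue of palloc(size) while this context is current.
    fn alloc(&mut self, size: usize) {
        self.chunks.push(vec![0u8; size]);
    }

    /// Bytes currently held by this context.
    fn total_bytes(&self) -> usize {
        self.chunks.iter().map(|c| c.len()).sum()
    }

    /// Rough analogue of resetting the context: drop every chunk at once.
    fn reset(&mut self) {
        self.chunks.clear();
    }
}

/// Assumed workload: one 8k chunk per indexed row.
const CHUNK: usize = 8 * 1024;

/// Simulate `rows` callback invocations and return the peak number of
/// bytes held by the context at any point during the simulated build.
fn peak_bytes(rows: usize, reset_per_row: bool) -> usize {
    let mut ctx = ToyContext::new();
    let mut peak = 0;
    for _ in 0..rows {
        ctx.alloc(CHUNK);
        peak = peak.max(ctx.total_bytes());
        if reset_per_row {
            ctx.reset();
        }
    }
    peak
}

fn main() {
    println!("peak with per-row reset:    {} bytes", peak_bytes(1000, true));
    println!("peak without per-row reset: {} bytes", peak_bytes(1000, false));
}
```

With the reset, the peak stays at a single chunk no matter how many rows are processed; without it, the peak grows linearly with the row count.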

I ran select * from pg_backend_memory_contexts; to see if there was a context getting filled up. The sum of total_bytes across all contexts was far less than the amount of data piling up in memory. This I don't understand; palloc() should do the allocation in some memory context somewhere, right? All this memory does get cleared when the build is complete, which means that it is in a context; I just can't see it. Odd.
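
The reasoning that every palloc() lands in some context can be sketched with a second toy model. This is not a diagnosis of the leak above, and nothing here is real Postgres or pgrx API: `Contexts`, `BUILD`, `PER_TUPLE`, and both callee functions are hypothetical. It shows only a general property: resetting a context frees what was allocated in that context, so a callee that allocates somewhere else is unaffected by the reset.

```rust
// Indices of the two toy contexts (hypothetical names).
const BUILD: usize = 0; // long-lived, torn down when the build ends
const PER_TUPLE: usize = 1; // short-lived, reset after every row
const CHUNK: usize = 8 * 1024;

/// Toy bookkeeping: bytes held per context plus a "current" pointer,
/// loosely mimicking the idea of a current memory context.
struct Contexts {
    bytes: [usize; 2],
    current: usize,
}

impl Contexts {
    fn new() -> Self {
        Contexts { bytes: [0; 2], current: BUILD }
    }

    /// Make `ctx` current and return the previously current context.
    fn switch_to(&mut self, ctx: usize) -> usize {
        std::mem::replace(&mut self.current, ctx)
    }

    /// Charge an allocation to whichever context is current.
    fn palloc(&mut self, size: usize) {
        self.bytes[self.current] += size;
    }

    /// Free everything charged to `ctx`, and only that.
    fn reset(&mut self, ctx: usize) {
        self.bytes[ctx] = 0;
    }

    fn total_bytes(&self) -> usize {
        self.bytes.iter().sum()
    }
}

/// Callee that allocates in whatever context the caller made current.
fn callee_uses_current(ctxs: &mut Contexts) {
    ctxs.palloc(CHUNK);
}

/// Callee that (hypothetically) switches to the long-lived context first.
fn callee_uses_build(ctxs: &mut Contexts) {
    let old = ctxs.switch_to(BUILD);
    ctxs.palloc(CHUNK);
    ctxs.switch_to(old);
}

/// Run the switch / call / switch back / reset loop for `rows` rows and
/// return the bytes still held across all contexts at the end of the loop.
fn bytes_after_build_loop(rows: usize, callee: fn(&mut Contexts)) -> usize {
    let mut ctxs = Contexts::new();
    for _ in 0..rows {
        let old = ctxs.switch_to(PER_TUPLE);
        callee(&mut ctxs);
        ctxs.switch_to(old);
        ctxs.reset(PER_TUPLE);
    }
    ctxs.total_bytes()
}

fn main() {
    println!("{}", bytes_after_build_loop(1000, callee_uses_current));
    println!("{}", bytes_after_build_loop(1000, callee_uses_build));
}
```

In this model the first callee leaves nothing behind, while the second accumulates one chunk per row in the long-lived context even though the per-tuple context is reset every time.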

Any idea how to track this problem down?

eeeebbbbrrrr commented 11 months ago

> Just before I call the function I switch to a custom memory context, call the func, switch back, and reset() the context. I get the leak whether I do that or not.

Can you show us this code?

ZDB takes a similar approach in the build callback function for the same general reason and it's fine.

ccleve commented 11 months ago

I used code that is almost identical to that in ZDB. Literally identical.

eeeebbbbrrrr commented 11 months ago

> I used code that is almost identical to that in ZDB. Literally identical.

I'm that guy that tends to blindly believe what he reads on the internet, but in this case, can you please show us exactly what your code is doing? The code I see when I close my eyes seems to work just fine.

ccleve commented 11 months ago

There's a lot of intervening code. Let me make a small example that reproduces the problem.
