pgcentralfoundation / pgrx

Build Postgres Extensions with Rust!

Could not start server; shmem too large #1351

Open kczimm opened 10 months ago

kczimm commented 10 months ago

I'm having trouble starting the server when a shared memory allocation exceeds a certain size (not a particularly large one). If I take the shmem example and simply increase the size of the Deque to ~8MB, the server fails to start.

static DEQUE: PgLwLock<heapless::Deque<Pgtest, 1_000_000>> = PgLwLock::new();
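As a rough check of the "~8MB" figure, the payload size of such a container can be estimated with plain `std::mem::size_of`. This is only a sketch: the `Pgtest` below is a hypothetical stand-in that assumes two `i32` fields, and `payload_bytes` is an illustrative helper, not part of pgrx or heapless. The real `heapless::Deque` also stores a few index fields, so its actual size is slightly larger.

```rust
use std::mem::size_of;

// Hypothetical stand-in for the example's `Pgtest`.
// ASSUMPTION: two i32 fields, i.e. 8 bytes per element.
#[derive(Copy, Clone)]
#[allow(dead_code)]
pub struct Pgtest {
    value1: i32,
    value2: i32,
}

// Size in bytes of the element storage for a fixed-capacity container
// holding N values of type T (bookkeeping fields are ignored).
pub fn payload_bytes<T, const N: usize>() -> usize {
    size_of::<[T; N]>()
}

fn main() {
    let bytes = payload_bytes::<Pgtest, 1_000_000>();
    let mib = bytes as f64 / (1024.0 * 1024.0);
    println!("{bytes} bytes (~{mib:.2} MiB)");
}
```

Under that assumption, the element storage alone is 8,000,000 bytes, which matches the size quoted above.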

If I set the log level to debug5, the log says the following:

2023-10-24 16:14:10.031 UTC [3966800] DEBUG:  find_in_dynamic_libpath: trying "/home/ubuntu/.pgrx/15.4/pgrx-install/lib/postgresql/shmem"
2023-10-24 16:14:10.031 UTC [3966800] DEBUG:  find_in_dynamic_libpath: trying "/home/ubuntu/.pgrx/15.4/pgrx-install/lib/postgresql/shmem.so"
2023-10-24 16:14:10.031 UTC [3966800] DEBUG:  loaded library "shmem"
2023-10-24 16:14:10.031 UTC [3966800] DEBUG:  invoking IpcMemoryCreate(size=1129136128)
2023-10-24 16:14:10.031 UTC [3966800] DEBUG:  mmap(1130364928) with MAP_HUGETLB failed, huge pages disabled: Cannot allocate memory
2023-10-24 16:14:10.052 UTC [3966800] DEBUG:  dynamic shared memory system will support 224 segments
2023-10-24 16:14:10.052 UTC [3966800] DEBUG:  created dynamic shared memory control segment 2541292408 (8976 bytes)

Note

I used the feature flag pg15 simply because I already had my postgresql.conf set up for it. I've tried experimenting with the shared_buffers size but, like I said, this isn't even that large an amount of shared memory. My kernel.shmmax and kernel.shmall settings are also much larger than what I'm requesting.

If anyone knows what is going on here, it would be much appreciated.

kczimm commented 10 months ago

After further experimentation, I enabled hugepages with sysctl which made that debug print disappear from the log, but the server still fails to start.

kczimm commented 10 months ago

https://kaiwantech.wordpress.com/2011/08/17/kmalloc-and-vmalloc-linux-kernel-memory-allocation-api-limits/

It's possible that the maximum single shared memory allocation is 4MB. This is consistent with the sizes that have and have not worked. I also tested splitting a single allocation that failed into 10 equally sized allocations, and that worked, which indicates that the limit is on the size of a single allocation rather than on the total size.
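The arithmetic behind that experiment can be sketched as follows. Everything here is illustrative: `HYPOTHETICAL_LIMIT` encodes the unconfirmed 4MB guess from this comment, the 8,000,000-byte total is an assumed size for the failing allocation, and `split_evenly` and `all_fit` are made-up helper names. The sketch does not touch Postgres or pgrx at all.

```rust
// ASSUMPTION: the per-allocation limit guessed at in this thread (4 MiB).
// It has not been confirmed to exist.
pub const HYPOTHETICAL_LIMIT: usize = 4 * 1024 * 1024;

// Split `total` bytes into `parts` chunks that differ by at most one byte.
pub fn split_evenly(total: usize, parts: usize) -> Vec<usize> {
    assert!(parts > 0, "need at least one part");
    let base = total / parts;
    let extra = total % parts;
    // The first `extra` chunks absorb the remainder, one byte each.
    (0..parts).map(|i| base + usize::from(i < extra)).collect()
}

// True if every chunk is at or below the limit.
pub fn all_fit(chunks: &[usize], limit: usize) -> bool {
    chunks.iter().all(|&c| c <= limit)
}

fn main() {
    let total = 8_000_000; // assumed size of the failing allocation
    let single = [total];
    let split = split_evenly(total, 10);
    println!("single fits: {}", all_fit(&single, HYPOTHETICAL_LIMIT));
    println!("split fits:  {}", all_fit(&split, HYPOTHETICAL_LIMIT));
}
```

With these numbers the single allocation exceeds the guessed limit while each of the 10 chunks stays well under it, which is the pattern described above. It does not show that the limit is the actual cause.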

workingjubilee commented 10 months ago

That is simultaneously surprisingly small and also not very surprising at all! As I understand things, shmem is meant to be used for IPC, which typically uses a few KiB. It's not really for holding large quantities of data.

Unfortunately it would be erroneous, probably, to issue a compile-time warning, even if we arranged for detecting this, because this depends on page size which isn't uniform between systems, not even for a given architecture.

jamessewell commented 10 months ago

shared_buffers are 100% for holding large amounts of data (if that's what you want to do).

Interestingly it works with --release, so I guess this is a Rust thing?


workingjubilee commented 10 months ago

I mean I would have expected using something else for transferring bigger allocations, but:

Really, it works with --release? What? Even after actually doing some writes so the compiler can't optimize things away? That's weird.
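One way to make sure a test actually performs writes that the optimizer cannot remove is `std::hint::black_box`. The sketch below is plain Rust, not pgrx code: a heap-allocated `Vec` stands in for the shared-memory deque, and `write_and_sum` is a hypothetical helper name.

```rust
use std::hint::black_box;

// Write a distinct value into every slot, then read them all back.
// `black_box` hides the values from the optimizer so the stores and loads
// are not elided, even in a --release build.
pub fn write_and_sum(buf: &mut [u32]) -> u64 {
    for (i, slot) in buf.iter_mut().enumerate() {
        *slot = black_box(i as u32);
    }
    black_box(&*buf).iter().map(|&v| u64::from(v)).sum()
}

fn main() {
    // Heap allocation, so this sketch itself needs no large stack frame.
    let mut buf = vec![0u32; 1_000_000];
    let sum = write_and_sum(&mut buf);
    println!("sum = {sum}");
}
```

Running the same kind of loop against the real shared structure in both debug and --release builds would show whether the release build still starts once the memory is genuinely used.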