diku-dk / futhark

:boom::computer::boom: A data-parallel functional programming language
http://futhark-lang.org
ISC License

Limit context memory use #2112

Open FluxusMagna opened 4 months ago

FluxusMagna commented 4 months ago

Futhark programs can eat a lot of memory, sometimes more than they need. When a program repeatedly calls Futhark functions that allocate dynamically, the Futhark caches can fill the device memory even though the program doesn't actually need that much. This is perhaps not a problem for the context itself, since it can reuse memory it has already allocated, but it is a problem when the Futhark program is not the only program running on the device, because no free device memory is left for new processes. Perhaps it would make sense to have a context configuration option that limits total memory use for the context. I don't know if the limit should be 'soft' (allocations that genuinely need that much memory are still allowed, and only the cache is cleared) or 'hard' (exceeding the limit results in out-of-memory errors). There may be better ways of limiting memory use, but assuming that reusing preallocated memory is beneficial, I think this might be a decent compromise between performance and a small memory footprint.

Manual clearing of caches is of course possible, but that has to be done by the program using the Futhark context, which presumably doesn't know how much memory is being used.

athas commented 4 months ago

I'm not completely sure what you are asking for. There are two ways you can limit memory:

  1. Put an upper bound on how much free memory the Futhark runtime system is allowed to keep in the free list.
  2. Put an upper bound on how much memory Futhark is allowed to allocate in total.

Option (1) means a Futhark program can still eat up all the system memory, but option (2) means programs might unexpectedly fail.
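To make the difference between the two options concrete, here is a minimal accounting sketch. It is not how the Futhark runtime is implemented: the `mem_model` struct and the `model_*` functions are hypothetical names, and cached bytes are treated as freely reusable for any request, which a real size-matched free list would not guarantee.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy accounting model. None of these names are part of the Futhark runtime. */
struct mem_model {
  int64_t in_use;     /* bytes currently handed out to the program */
  int64_t unused;     /* bytes kept in the free list for later reuse */
  int64_t max_unused; /* option (1): cap on the free list, -1 means no cap */
  int64_t max_total;  /* option (2): cap on in_use + unused, -1 means no cap */
};

/* Total bytes the model currently holds from the system. */
int64_t model_footprint(const struct mem_model *m) {
  return m->in_use + m->unused;
}

/* Release cached bytes until the free list holds at most `keep` bytes. */
void model_trim(struct mem_model *m, int64_t keep) {
  if (m->unused > keep) m->unused = keep;
}

/* Allocate `bytes`, reusing cached memory first.
   Returns false to signal an out-of-memory error under option (2). */
bool model_alloc(struct mem_model *m, int64_t bytes) {
  assert(bytes >= 0);
  int64_t reused = bytes < m->unused ? bytes : m->unused;
  int64_t fresh = bytes - reused; /* bytes that must come from the system */
  if (m->max_total >= 0 && model_footprint(m) + fresh > m->max_total)
    return false;
  m->unused -= reused;
  m->in_use += bytes;
  return true;
}

/* Free `bytes`: they move to the free list, which option (1) then trims. */
void model_free(struct mem_model *m, int64_t bytes) {
  assert(bytes >= 0 && bytes <= m->in_use);
  m->in_use -= bytes;
  m->unused += bytes;
  if (m->max_unused >= 0) model_trim(m, m->max_unused);
}
```

With only `max_unused` set, `model_alloc` never fails, but the footprint can still grow as large as the program demands. With only `max_total` set, the footprint is bounded, but a large enough request returns `false`.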

I guess both are useful in different circumstances.

FluxusMagna commented 4 months ago

I was primarily thinking of total memory, but for my purposes either would be sufficient. If I could only choose one of them, option 1 seems more flexible, in the sense that the same setting could be used for any program without causing any crashes. Option 2 does seem attractive for sharing hardware, though.

I'm also a bit curious how limiting memory use could affect performance if there is enough L3 cache to store all the necessary data. Could limiting total memory use to below the size of the cache actually improve performance by reducing cache misses?

athas commented 4 months ago

If you limit the total memory usage, it just means more programs will fail with OOM errors. It will not affect the performance of programs that actually run. The only point where Futhark makes decisions based on memory availability is when it manages the free list.

athas commented 4 months ago

I propose two new C API functions:

void futhark_context_config_set_max_memory_unused(const char *space, int64_t bytes);
void futhark_context_config_set_max_memory_usage(const char *space, int64_t bytes);

The former function limits how many unused bytes are allowed to be in the free list(s), and the latter how many bytes may be allocated in total (there will be some fine print regarding CPU memory in particular, as that is difficult to count when stack-allocating). The space argument indicates which memory space the limit applies to (e.g. you can limit GPU memory without limiting CPU memory). We could drop that argument for simplicity, in which case the limit would apply to all memory spaces.
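As a rough illustration of the trade-off around the space argument, the sketch below stores one limit per named memory space, with a default for spaces that have no entry of their own. Everything here is an assumption made for illustration: the `limit_table` type, the `limits_*` functions, the space names `"device"` and `"host"`, and the use of `NULL` to mean "all spaces" are not part of the proposal or of the Futhark API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_SPACES 4
#define NO_LIMIT ((int64_t)-1)

/* Hypothetical per-space limit entry. The name is not copied, so the
   caller must keep the string alive (string literals are fine). */
struct space_limit {
  const char *space;
  int64_t max_usage;
};

struct limit_table {
  int64_t default_max_usage; /* applies to spaces without their own entry */
  int num_spaces;
  struct space_limit spaces[MAX_SPACES];
};

void limits_init(struct limit_table *t) {
  t->default_max_usage = NO_LIMIT;
  t->num_spaces = 0;
}

/* Set the limit for one space, or for all unlisted spaces if space is NULL.
   Returns 0 on success and 1 if the table is full. */
int limits_set(struct limit_table *t, const char *space, int64_t bytes) {
  assert(bytes >= 0 || bytes == NO_LIMIT);
  if (space == NULL) {
    t->default_max_usage = bytes;
    return 0;
  }
  for (int i = 0; i < t->num_spaces; i++) {
    if (strcmp(t->spaces[i].space, space) == 0) {
      t->spaces[i].max_usage = bytes;
      return 0;
    }
  }
  if (t->num_spaces == MAX_SPACES) return 1;
  t->spaces[t->num_spaces].space = space;
  t->spaces[t->num_spaces].max_usage = bytes;
  t->num_spaces++;
  return 0;
}

/* Look up the limit that applies to a space, falling back to the default. */
int64_t limits_get(const struct limit_table *t, const char *space) {
  for (int i = 0; i < t->num_spaces; i++) {
    if (strcmp(t->spaces[i].space, space) == 0) return t->spaces[i].max_usage;
  }
  return t->default_max_usage;
}
```

Dropping the space argument would correspond to keeping only `default_max_usage`, which is simpler but cannot limit GPU memory without also limiting CPU memory.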