Sandia-OpenSHMEM / SOS

Sandia OpenSHMEM is an implementation of the OpenSHMEM specification over multiple Networking APIs, including Portals 4, the Open Fabric Interface (OFI), and UCX. Please click on the Wiki tab for help with building and using SOS.
Other
61 stars 53 forks source link

Malloc with varsize (option 1) implementation #1082

Open wrrobin opened 1 year ago

wrrobin commented 1 year ago

This is a test PR for malloc varsize implementation with the following syntax:

void *shmem_malloc_varsize (size_t size, size_t global_max_size);
void *shmem_calloc_varsize (size_t count, size_t global_size, size_t global_max_count);
wrrobin commented 1 year ago

@davidozog @stewartl318 - Does this implementation look correct to you? I still need to check if a subsequent symmetric malloc works without any issues.

stewartl318 commented 1 year ago

The implementation looks right, but I don't like calloc_varsize. I though we decided that size should be the same for all PEs, and global_max_count is the same for all PEs, but that my_count could be different for each PE. I can't imagine any program that would have a different data size on different PEs.

wrrobin commented 1 year ago

@stewartl318 - thanks for your comments. You do have to pass size argument even though it is fixed for all PEs. Otherwise, for example, how would you know if you are allocating for int or for long? This actually brings up a question as well.. we do not have checks for regular malloc/calloc that the sizes for all PEs are same. But, I think that is up to the user anyway. @davidozog - thoughts?

One update is, I do see issues with this code for the subsequent symmetric allocations. I think the reason is dlmalloc internally uses padding depending on the allocation size requested based on their algorithm. I will see how to fix it.

stewartl318 commented 1 year ago

I was imprecise. The proposed function is SHMEM_FUNCTION_ATTRIBUTES void* SHPRE()shmemx_calloc_varsize(size_t my_count, size_t my_size, size_t global_max_count, size_t global_max_size);

My point is that all PEs should have my_size == global_max_size. I cannot think of a counterexample. Therefore the function does not need both.

Does shmem_calloc have a rule that size and count have to be the same on all PEs? Or is the rule that size*count must be the same on all PEs?

wrrobin commented 1 year ago

@stewartl318 - By any chance, are you looking at the first commit only? Please check the latest commit or overall changes. The first commit was done some time back, but I have updated that before opening this PR.

stewartl318 commented 1 year ago

Blush. Yes Wasi, I was not looking at the latest version. I am ok with the newest one!