Closed: JieRen98 closed this issue 3 months ago.
StarPU does not support it so far. (e.g., https://gitlab.inria.fr/starpu/starpu/-/blob/master/src/drivers/mpi/driver_mpi_common.c?ref_type=heads#L300)
I guess you are using starpu_mpi_task_insert etc., not the MPI master-slave driver support, so it's rather the MPI datatype definition from mpi/src/starpu_mpi_datatype.c that needs fixing.
I understand that you have an urgent deadline. Which StarPU data type are you using in your application?
Greetings,
I am actually using Chameleon. I am not completely sure how Chameleon interacts with StarPU. I am going to use multiple precisions (FP64, FP32, FP16, and FP8).
Best, Jie
I am using Chameleon actually. I am not completely sure how Chameleon interacts with StarPU
Ok, then I guess you are using a matrix descriptor from Chameleon?
Yes, specifically, my customized descriptor.
How is it customized? Essentially, the question is which starpu_data_something_register function is getting called in your case.
Put another way, do you have any data_register call beyond starpu_vector_data_register and starpu_mpi_data_register? Is it using starpu_data_register directly?
Does it use starpu_mpi_interface_datatype_register?
Sorry, I am having lunch; I will answer your question in about 20 minutes.
Is MPI_Type_vector_c supported by your MPI implementation?
(Basically, we would just want to use the _c variants of the MPI calls that we currently make: MPI_Irecv, MPI_Isend, MPI_Issend, MPI_Type_vector, MPI_Type_contiguous, MPI_Type_size.)
Chameleon uses both starpu_data_register and starpu_mpi_data_register to register tiles. I believe StarPU does not know what the type is, only the size in bytes.
Then it must also be using starpu_mpi_interface_datatype_register to register the MPI type to be used? Otherwise StarPU does not even know the size in bytes. (starpu_data_register alone does not tell StarPU the size in bytes.)
Yes, you are right. Here is the type registration, although I do not understand it completely:
void
starpu_cham_tile_interface_init()
{
    if ( starpu_interface_cham_tile_ops.interfaceid == STARPU_UNKNOWN_INTERFACE_ID )
    {
        starpu_interface_cham_tile_ops.interfaceid = starpu_data_interface_get_next_id();
#if defined(CHAMELEON_USE_MPI_DATATYPES)
#if defined(HAVE_STARPU_MPI_INTERFACE_DATATYPE_NODE_REGISTER)
        starpu_mpi_interface_datatype_node_register( starpu_interface_cham_tile_ops.interfaceid,
                                                     cti_allocate_datatype_node,
                                                     cti_free_datatype );
#else
        starpu_mpi_interface_datatype_register( starpu_interface_cham_tile_ops.interfaceid,
                                                cti_allocate_datatype,
                                                cti_free_datatype );
#endif
#endif
    }
}
This shows how Chameleon registers the tile. I thought the attributes .allocsize and .tilesize would tell StarPU the size.
void
starpu_cham_tile_register( starpu_data_handle_t *handleptr,
                           int                   home_node,
                           CHAM_tile_t          *tile,
                           cham_flttype_t        flttype )
{
    size_t elemsize = CHAMELEON_Element_Size( flttype );
    starpu_cham_tile_interface_t cham_tile_interface =
    {
        .id         = STARPU_CHAM_TILE_INTERFACE_ID,
        .flttype    = flttype,
        .dev_handle = (intptr_t)(tile->mat),
        .allocsize  = -1,
        .tilesize   = tile->m * tile->n * elemsize,
    };

    memcpy( &(cham_tile_interface.tile), tile, sizeof( CHAM_tile_t ) );
    /* Overwrite the flttype in case it comes from a data conversion */
    cham_tile_interface.tile.flttype = flttype;

    if ( tile->format & CHAMELEON_TILE_FULLRANK ) {
        cham_tile_interface.allocsize = tile->m * tile->n * elemsize;
    }
    else if ( tile->format & CHAMELEON_TILE_DESC ) { /* Needed in case starpu asks for it */
        cham_tile_interface.allocsize = tile->m * tile->n * elemsize;
    }
    else if ( tile->format & CHAMELEON_TILE_HMAT ) {
        /* For hmat, allocated data will be handled by the hmat library. StarPU cannot allocate it for the library */
        cham_tile_interface.allocsize = 0;
    }

    starpu_data_register( handleptr, home_node, &cham_tile_interface, &starpu_interface_cham_tile_ops );
}
Please also show cti_allocate_datatype_node and cti_allocate_datatype; that is most probably where you need a fix.
Here you go:
#if defined(CHAMELEON_USE_MPI_DATATYPES)
int
cti_allocate_datatype_node( starpu_data_handle_t handle,
                            unsigned             node,
                            MPI_Datatype        *datatype )
{
    int ret;
    starpu_cham_tile_interface_t *cham_tile_interface = (starpu_cham_tile_interface_t *)
        starpu_data_get_interface_on_node( handle, node );

    size_t m        = cham_tile_interface->tile.m;
    size_t n        = cham_tile_interface->tile.n;
    size_t ld       = cham_tile_interface->tile.ld;
    size_t elemsize = CHAMELEON_Element_Size( cham_tile_interface->flttype );

    ret = MPI_Type_vector( n, m * elemsize, ld * elemsize, MPI_BYTE, datatype );
    STARPU_ASSERT_MSG( ret == MPI_SUCCESS, "MPI_Type_vector failed" );

    ret = MPI_Type_commit( datatype );
    STARPU_ASSERT_MSG( ret == MPI_SUCCESS, "MPI_Type_commit failed" );

    return 0;
}

int
cti_allocate_datatype( starpu_data_handle_t handle,
                       MPI_Datatype        *datatype )
{
    return cti_allocate_datatype_node( handle, STARPU_MAIN_RAM, datatype );
}

void
cti_free_datatype( MPI_Datatype *datatype )
{
    MPI_Type_free( datatype );
}
#endif
Also, again: is MPI_Type_vector_c supported by your MPI implementation?
I didn't find any line that includes MPI_Type_vector_c. Here you go:
ret = MPI_Type_vector( n, m * elemsize, ld * elemsize, MPI_BYTE, datatype );
That's it: you want to use MPI_Type_vector_c instead. StarPU just MPI_Sends one element of this type.
I didn't find any line that includes MPI_Type_vector_c

Where did you not find it? Put another way: which MPI implementation are you using?
I see that, notably, openmpi doesn't seem to provide the _c variants, so in that case you need to use a for loop to make a series of MPI_Type_vector calls.
Ok, Chameleon uses MPI_BYTE and the leading dimension becomes ld * sizeof(t); that's fine, I guess. So do you mean I should change this to MPI_Type_vector_c to support large counts?
If your MPI implementation supports MPI_Type_vector_c, that's the simplest, yes. If not, you need to use a for loop to describe the data type piece by piece.
I am using mpich; I mean that Chameleon does not use MPI_Type_vector_c, not that my MPI does not have it.
So in the end it's the Chameleon code that needs fixing. StarPU will, however, want to do the same for its predefined vector/matrix/etc. types, so I am keeping this issue open for that.
I am using mpich
mpich does have MPI_Type_vector_c, so you can simply add a _c to the MPI_Type_vector call, and that should work. (mpich has apparently provided it since version 4.)
Thanks a lot! I was reading StarPU, but it seems that I did not do it well. So starpu_mpi_interface_datatype_node_register is the most important thing, while the starpu_data_register and starpu_mpi_data_register calls have no effect on the integer overflow, provided allocate_datatype_func is correctly defined (using MPI_Type_vector_c) when calling starpu_mpi_interface_datatype_node_register?
Yes, that's the idea. For application-defined interfaces, it's starpu_mpi_interface_datatype_node_register that tells StarPU how to send the data over MPI.
Thanks a lot, you helped a lot!
Is your feature request related to a problem? Please describe. MPI uses a 32-bit int as the data count. When we want to send a larger buffer (count > INT_MAX), we need to split the buffer into several chunks and send them one by one. However, StarPU does not support this so far. (e.g., https://gitlab.inria.fr/starpu/starpu/-/blob/master/src/drivers/mpi/driver_mpi_common.c?ref_type=heads#L300)
Describe the solution you'd like Split the buffer into several chunks. Functions' signatures (e.g., __starpu_mpi_common_send_to_device, __starpu_mpi_common_send, etc.) should be changed correspondingly.
Describe alternatives you've considered N/A
Additional context N/A