HSAFoundation / HSA-Runtime-AMD

The HSA-Runtime

concurrent pkt submission failed #7

Closed humasama closed 9 years ago

humasama commented 9 years ago

I followed the example 'concurrent packet submissions' from the file on page 29, but it didn't work. I can only submit one packet to the queue using the function 'void enqueue(hsa_queue_t* queue)'.

How do I submit concurrent packets to a queue?

The code is:

hsa_signal_t sig;
hsa_signal_create(2, 0, NULL, &sig);
hsa_dispatch_packet_t* base_addr = (hsa_dispatch_packet_t*)queue->base_address;

int i = 0;
for(; i < 2; i++){
    uint64_t pkt_id =hsa_queue_add_write_index_relaxed(queue, 1);
    printf("pkt_id = %lu\n", pkt_id);
    while (pkt_id - hsa_queue_load_read_index_acquire(queue) >= queue->size){}

    /* base_addr is a typed pointer, so the index must not be scaled by sizeof */
    hsa_dispatch_packet_t* pkt = base_addr + (pkt_id % queue->size);

    memset(pkt, 0, sizeof(hsa_dispatch_packet_t));
    pkt->header.barrier = 1;
    pkt->header.acquire_fence_scope = 2;
    pkt->header.release_fence_scope = 2;
    pkt->dimensions = 1;
    pkt->workgroup_size_x = WORKGROUP_X;
    pkt->workgroup_size_y = 1;
    pkt->workgroup_size_z = 1;
    pkt->grid_size_x = GRID_X;
    pkt->grid_size_y = 1;
    pkt->grid_size_z = 1;
    pkt->group_segment_size = 0;
    pkt->private_segment_size = 0;
    pkt->completion_signal = sig;

    pkt->kernel_object_address = hsaCodeDescriptor->code.handle;
    pkt->kernarg_address = 0;       

    packet_type_store_release(&pkt->header, HSA_PACKET_TYPE_DISPATCH);  
    hsa_signal_store_relaxed(queue->doorbell_signal, pkt_id);

}
hsa_signal_wait_acquire(sig, HSA_EQ, 0, (uint64_t) -1, HSA_WAIT_EXPECTANCY_UNKNOWN);
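
A detail worth checking in any loop like this is how the packet slot is derived from the packet id. In C, adding an integer to a typed pointer already advances by whole elements, so the index should not also be multiplied by the element size. The sketch below illustrates this with a hypothetical 64-byte stand-in struct (`mock_packet_t`) and a hypothetical helper (`slot_for`); neither is part of the HSA runtime, and the power-of-two ring size is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-byte stand-in for a queue packet; not the real HSA type. */
typedef struct { uint8_t bytes[64]; } mock_packet_t;

/* Return the slot for a monotonically increasing packet id in a ring of
 * `size` packets. `base + n` already advances by n whole packets, so the
 * index is NOT multiplied by sizeof(mock_packet_t). */
static mock_packet_t *slot_for(mock_packet_t *base, uint32_t size,
                               uint64_t pkt_id)
{
    /* Assumption of this sketch: the ring size is a power of two. */
    assert(size != 0 && (size & (size - 1)) == 0);
    return base + (pkt_id & (size - 1));
}
```

With a ring of 4 packets, ids 1 and 5 both map to the second slot, 64 bytes past the base. Scaling the index by the struct size as well would instead place id 1 at element 64, outside the ring.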

jedwards-AMD commented 9 years ago

I suggest using one of the hsa_queue_add_write_index_(relaxed|acquire|release|acq_rel)(queue, num_packets) APIs to atomically acquire a range of packets. Each of these APIs returns the current write index and atomically increments the write index by num_packets. The packets from the returned index up to, but not including, index + num_packets are then available for writing by that thread. However, the queue has to be of type HSA_QUEUE_TYPE_MULTI to support out-of-order doorbell rings.
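
The reservation behaviour described above can be modelled with a plain atomic fetch-and-add. The following is a minimal sketch using C11 atomics, not the runtime's implementation: `mock_queue_t` and `mock_add_write_index` are hypothetical names introduced for illustration, and only the relaxed variant is modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical mock of a queue's write index; not the HSA runtime's type. */
typedef struct { _Atomic uint64_t write_index; } mock_queue_t;

/* Reserve num_packets consecutive packet ids. The return value is the
 * write index before the increment, so the caller owns the ids
 * [first, first + num_packets). */
static uint64_t mock_add_write_index(mock_queue_t *q, uint64_t num_packets)
{
    return atomic_fetch_add_explicit(&q->write_index, num_packets,
                                     memory_order_relaxed);
}
```

Because the add is atomic, two threads calling this at the same time receive disjoint ranges. Which thread finishes writing its packets and rings the doorbell first is not determined, which is why the reply points to a queue type that tolerates out-of-order doorbell rings.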

From: humasama [mailto:notifications@github.com] Sent: Tuesday, October 07, 2014 2:52 AM To: HSAFoundation/HSA-Runtime-AMD Subject: [HSA-Runtime-AMD] concurrent pkt submission failed (#7)
