rapidsai / rmm

RAPIDS Memory Manager
https://docs.rapids.ai/api/rmm/stable/
Apache License 2.0
452 stars 190 forks source link

[FEA] Add a argument or flag to control whether the constructor and operator= should be async #557

Open JustPlay opened 3 years ago

JustPlay commented 3 years ago

Add a argument or flag to control whether the constructor and operator= should be async

https://github.com/rapidsai/rmm/blob/6a3a62fce9c002c8390561566a71f544d7a017c7/include/rmm/device_buffer.hpp#L118

https://github.com/rapidsai/rmm/blob/6a3a62fce9c002c8390561566a71f544d7a017c7/include/rmm/device_buffer.hpp#L198

https://github.com/rapidsai/rmm/blob/6a3a62fce9c002c8390561566a71f544d7a017c7/include/rmm/device_buffer.hpp#L438

the copy() is async and the above ctor and operator= do not wait for the copy to be done, so the caller may need to sync t he stream explicitly (e.g. if the caller operate on more than two different stream(s))

I think it's better to add a extra parameter or flag to indicate whether we need a sync patten,and we can give it a default argument so no code need to be changed for RMM user (e.g. cuDF and rapids)

JustPlay commented 3 years ago

https://github.com/rapidsai/rmm/blob/6a3a62fce9c002c8390561566a71f544d7a017c7/include/rmm/device_scalar.hpp#L127

https://github.com/rapidsai/rmm/blob/6a3a62fce9c002c8390561566a71f544d7a017c7/include/rmm/device_scalar.hpp#L154

https://github.com/rapidsai/rmm/blob/6a3a62fce9c002c8390561566a71f544d7a017c7/include/rmm/device_scalar.hpp#L190

Why the set operator is async by default? (i thinks this put too much work onto the caller side)

harrism commented 3 years ago

I think the intended design is that everything is async and stream-ordered. A user is free to add a synchronous wrapper if needed, but in general in RAPIDS we need asynchronicity.

You could use a thrust::device_vector but you may find, like we did, that the convenience it provides comes at a performance cost.

jrhemstad commented 3 years ago

I think the intended design is that everything is async and stream-ordered. A user is free to add a synchronous wrapper if needed, but in general in RAPIDS we need asynchronicity.

You could use a thrust::device_vector but you may find, like we did, that the convenience it provides comes at a performance cost.

I think the original issue brings up valid points. There's a lot of things I don't like about device_buffer that I fixed in device_uvector:

I would like to make similar changes to device_buffer.

JustPlay commented 3 years ago

https://github.com/rapidsai/rmm/issues/570

github-actions[bot] commented 3 years ago

This issue has been marked rotten due to no recent activity in the past 90d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

github-actions[bot] commented 3 years ago

This issue has been marked stale due to no recent activity in the past 30d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be marked rotten if there is no activity in the next 60d.

harrism commented 3 years ago

I think the original issue brings up valid points. There's a lot of things I don't like about device_buffer that I fixed in device_uvector:

1. There is no constructor to copy from arbitrary host/device data (only from another `device_uvector`)

  * I don't like controlling the (a)synchrony of a constructor with a parameter. Instead, I would make the constructor synchronous and use a factory for the async version, e.g., `auto buff = device_buffer::make_async(n, stream)`.

2. `operator=` is deleted

3. Modifiers are split based on asynchrony, e.g., [`set_element`](https://github.com/rapidsai/rmm/blob/6a3a62fce9c002c8390561566a71f544d7a017c7/include/rmm/device_uvector.hpp#L193) and [`set_element_async`](https://github.com/rapidsai/rmm/blob/6a3a62fce9c002c8390561566a71f544d7a017c7/include/rmm/device_uvector.hpp#L230)

4. No default stream arguments

I would like to make similar changes to device_buffer.

775 addresses #2 and #4 in Jake's list. It deletes the copy assignment operator and copy constructor, and requires an explicit stream everywhere possible (e.g. resize() and shrink_to_fit()).

@jrhemstad are you suggesting we remove this constructor? https://github.com/rapidsai/rmm/blob/0cc1380523d31b2f044ecc74b457b228f8aea0c8/include/rmm/device_buffer.hpp#L124-L132

And when you say "I would make the constructor synchronous ", you don't mean to synchronize the specified stream, do you?

harrism commented 3 years ago

@jrhemstad are you suggesting we remove this constructor?

@jrhemstad can you comment? I'd like to make progress on closing or implementing this FEA.

jrhemstad commented 3 years ago

@jrhemstad are you suggesting we remove this constructor?

@jrhemstad can you comment? I'd like to make progress on closing or implementing this FEA.

I think I meant making that constructor synchronous (sync the specified stream) and have an async factory.

harrism commented 3 years ago

Wouldn't an sync factory need to call an async constructor?

github-actions[bot] commented 2 years ago

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

github-actions[bot] commented 2 years ago

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.