Open methylDragon opened 2 years ago
Is there a case where a user might want to have multiple sub-nodes of a similar architecture? (e.g. two GPUs, swapping between them.)
It doesn't look like this adaptive component is able to handle that case, since it manages swapping of architectures only. Is this intended?
Only a subset of the compute substrate possibilities was considered when prototyping this out, but I'd say we definitely want extensibility down the road with other devices including new vector processors, etc.
Right now this is restricted to https://github.com/ros-acceleration/adaptive_component/blob/main/include/adaptive_component.hpp#L55-L58. I can't come up right now with a use case involving two different GPU implementations that you'd want within the same adaptive Node (let me know if you do), but I believe considering future extensions and/or alternatives makes sense (e.g. adding an Others
option to the enum).
Another way to do it is to have the hardware specific node variables be a vector of pointers instead.
I think you're right that there will be very few situations where you'd need two different GPU implementations. (The only one I can think of is if you happen to know that one of the GPU architectures performs better than the other for a specific task, and vice versa, then depending on the task the node is focused on, the user might want to be swapping between them.)
Adding the Others option to the enum makes sense though.
Another way to do it is to have the hardware specific node variables be a vector of pointers instead.
I like this idea, and it would be a nice contribution that would scale across future (not currently considered) compute architectures.
For now I'd suggest minimally adding the Other option so people can potentially use it for another GPU/etc. if they need to. Do you think that'll work?
That will do for now, yeah.
Edit: I'll actually give it a further thought and try improving the implementation in future releases with a more synthetic and compute-agnostic implementation. Something along the lines of:
// TODO: More synthetic implementation
// re-engage if compute options start exploding,
// needs to properly consider bounds and avoid nullptr issues.
//
// Remove Nodes in current hardware
if (compute_resources_[adaptive_value_])
exec_.remove_node(compute_resources_[adaptive_value_]);
// Add new nodes
if (compute_resources_[new_adaptive_value_])
exec_.add_node(compute_resources_[new_adaptive_value_]);
else
RCLCPP_ERROR(this->get_logger(),
"No Node available for computational resource: %d (%s)",
new_adaptive_value_,
compute_resources_names_[new_adaptive_value_]
);
Leaving this open as TODO for future releases.
Is there a case where a user might want to have multiple sub-nodes of a similar architecture? (e.g. two GPUs, swapping between them.)
It doesn't look like this adaptive component is able to handle that case, since it manages swapping of architectures only. Is this intended?