pc2 / Aurora-HLS

Ready-to-link, packaged Aurora IP on four QSFP28 lanes, providing 100Gb/s throughput
Apache License 2.0
7 stars 3 forks source link

add AuroraDuo class #14

Closed papeg closed 5 months ago

papeg commented 5 months ago

Fixes #11

Mellich commented 5 months ago

I must say, I am not that happy with the direction the interface is going. It seems very specialized for our exact use case. In many scenarios, the AuroraDuo class could mostly be replaced by two Aurora constructors. What about 3 or 4 QSFP ports? What if you don't want to use MPI in your host code or don't even have it installed on the machine? Actually, I would be in favor of a more general constructor instead, that allows to directly pass a xrt::kernel instance instead of an index. With that, the user can also specify the kernel name, since it highly depends on the name given in the link script of the user bitstream. I created an issue for that (#18).

papeg commented 5 months ago

The idea was to have one function for checking the status of all cores. But maybe this could also be made with a more generalized version of the check_status_global of the Aurora class. I am not sure if its possible right now with using 2 Aurora constructors. And i would like to support this use case too, as right now most FPGAs will have 2 QSFP ports.

Not being able to build this without MPI is another issue, i think it could be solved by checking for the MPI_VERSION define or similar.

Mellich commented 5 months ago

Okay, I see the MPI dependency was already there before this PR, so maybe this is a different topic. I guess additional functionality does not hurt, but we should also keep the interface simple for maintainability. Wouldn't this code do the trick?

std::vector<Aurora> a;
a.emplace_back(0, dev, uuid);
a.emplace_back(1, dev, uuid);
int errors = 0;
for (auto& c : a) errors += c.check_core_status(3000);
return errors;

For MPI support, add an MPI_Allreduce after the for loop.

papeg commented 5 months ago

I think the best option would be to remove the _global function completely. Right now it also prints the information which core has an error, because it only returns the status itself. Then it would be up to the user to handle this information, which is the cleaner approach. I will prepare a new merge request with the new constructor for solving #18 and where this change in, together with the adapted test code.