Open wenkaier opened 9 months ago
Thank you very much. I endorse your suggestion: I'll try the Python version first to see whether the performance is up to snuff, and then decide whether it's necessary to develop a CUDA version.
Yes, that makes sense. I'd also start with Python. It would also be nice to benchmark performance in tests. I suspect there is some threshold U such that any environment with 2N < U people will see a performance downgrade because of the overhead, but for environments well above that threshold, multiprocessing the proposals and responses should pay off. I appreciate your effort, thank you very much. I am assigning you to the issue and re-labeling it as an enhancement.
As of now, my implementation of the algorithm does not incorporate CUDA acceleration. My decision was guided by the well-known advice from Donald Knuth: "Premature optimization is the root of all evil." However, it is entirely feasible to introduce such acceleration later by parallelizing the proposal and rejection steps and handling them in batches.
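To make the batching idea concrete, here is a minimal pure-Python sketch of how the rounds could be structured. It is not the implementation discussed in this issue. It assumes a Gale-Shapley-style deferred acceptance with two sides of equal size N and complete preference lists, and the function name `batched_deferred_acceptance` and the list-of-lists layout are hypothetical. Each round first collects every free proposer's proposal, then resolves all responses. Those two loops are the parts that could later be mapped to worker processes or GPU kernels.

```python
def batched_deferred_acceptance(proposer_prefs, receiver_prefs):
    """Round-based deferred acceptance (hypothetical sketch).

    proposer_prefs[i] lists receivers in proposer i's order of preference.
    receiver_prefs[j] lists proposers in receiver j's order of preference.
    Returns holder, where holder[j] is the proposer matched to receiver j.
    Assumes both sides have the same size and complete preference lists.
    """
    n = len(proposer_prefs)

    # rank[j][i] = position of proposer i in receiver j's list (lower is better).
    rank = [[0] * n for _ in range(n)]
    for j, prefs in enumerate(receiver_prefs):
        for position, i in enumerate(prefs):
            rank[j][i] = position

    next_choice = [0] * n      # next index each proposer will try
    holder = [None] * n        # proposer currently held by each receiver
    free = set(range(n))       # proposers without a tentative match

    while free:
        # Proposal batch: every free proposer proposes once. The iterations
        # are independent of each other, so they could run in parallel.
        proposals = {}
        for i in free:
            j = proposer_prefs[i][next_choice[i]]
            next_choice[i] += 1
            proposals.setdefault(j, []).append(i)

        # Response batch: each receiver keeps the best candidate among the
        # new proposers and its current holder, and rejects the rest.
        # Receivers are independent of each other within a round.
        for j, candidates in proposals.items():
            if holder[j] is not None:
                candidates.append(holder[j])
            winner = min(candidates, key=lambda i: rank[j][i])
            for i in candidates:
                if i != winner:
                    free.add(i)
            free.discard(winner)
            holder[j] = winner

    return holder


matching = batched_deferred_acceptance(
    [[0, 1, 2], [0, 2, 1], [1, 0, 2]],
    [[1, 0, 2], [0, 2, 1], [2, 1, 0]],
)
print(matching)
```

For small N the per-round bookkeeping is likely to dominate, which is consistent with the overhead concern raised above, so any batched or parallel variant would need to be benchmarked against the plain sequential version before it is adopted.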