Open benkj opened 2 years ago
In fact we discussed this just a few hours ago with @AntoineRestivo. We can certainly take inspiration from https://arxiv.org/abs/1904.06229
Otherwise the case of the tensor permanent may give even more useful results.
Great, that's exactly what I had in mind.
I'm starting to play with multi-threading, with the long-term goal of using the GPU to compute the permanent. So far I've only written a "naive" Ryser algorithm without Gray ordering.
This code has complexity O(2^n n^2) rather than O(2^n n), but on my machine, with a 30x30 matrix, it is "only" 8 times slower (rather than the 30 times that the extra factor of n would suggest). This shows the advantage of multi-threading.
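For readers following along, here is a minimal Python sketch of what a "naive" Ryser sum without Gray ordering could look like. It is not the code from this thread or from Permanents.jl: the names `ryser_naive_chunk` and `permanent_naive` and the `n_chunks` parameter are hypothetical. Every subset recomputes its row sums from scratch, which is where the O(2^n n^2) cost comes from, and because the terms are independent the sum splits trivially across workers. CPython threads are used only to show the partitioning; because of the GIL they should not be expected to give the speedup reported above.

```python
from concurrent.futures import ThreadPoolExecutor


def ryser_naive_chunk(A, start, stop):
    """Partial Ryser sum over column subsets encoded by bitmasks in [start, stop)."""
    n = len(A)
    total = 0
    for mask in range(start, stop):
        # Recompute every row sum from scratch: O(n^2) work per subset.
        product = 1
        for i in range(n):
            row_sum = 0
            for j in range(n):
                if (mask >> j) & 1:
                    row_sum += A[i][j]
            product *= row_sum
        # Ryser sign (-1)^(n - |S|), where |S| is the number of selected columns.
        sign = -1 if (n - bin(mask).count("1")) % 2 else 1
        total += sign * product
    return total


def permanent_naive(A, n_chunks=4):
    """Permanent of a square matrix A (list of lists), summed in independent chunks."""
    n_masks = 1 << len(A)
    # Split the range of bitmasks into n_chunks contiguous pieces.
    bounds = [n_masks * k // n_chunks for k in range(n_chunks + 1)]
    with ThreadPoolExecutor() as pool:
        parts = pool.map(
            lambda k: ryser_naive_chunk(A, bounds[k], bounds[k + 1]),
            range(n_chunks),
        )
        return sum(parts)
```

For example, `permanent_naive([[1, 2], [3, 4]])` evaluates 1*4 + 2*3.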
To get code with the proper scaling for computing the permanent, we should use Gray ordering. The difficulty is parallelizing the sum over Gray sequences. One possibility would be to divide the sum over Gray patterns into a fixed number of batches, with each batch summed by a different thread. In each thread we can reuse the `ryser` code, but with the appropriate initial value of `v`. However, computing the initial value of `v` might be complex and kill all the benefits of multi-threading. Other ideas?
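One possible way to get the initial value for each batch, sketched in Python under some assumptions: the k-th Gray pattern can be written directly as `k ^ (k >> 1)`, so the row-sum vector at the start of a batch can be built from that pattern in O(n^2) operations, once per batch. That is small next to the roughly 2^n / (number of batches) O(n) updates the batch then performs. The names `gray`, `ryser_gray_batch`, `permanent_gray` and `n_batches` are hypothetical, `v` is assumed to be the vector of row sums over the currently selected columns, and the CPython threads only illustrate the batching (the GIL prevents a real speedup here).

```python
from concurrent.futures import ThreadPoolExecutor
from math import prod


def gray(k):
    """Gray pattern at position k of the reflected binary Gray sequence."""
    return k ^ (k >> 1)


def ryser_gray_batch(A, start, stop):
    """Partial Ryser sum over Gray-sequence positions k in [start, stop)."""
    n = len(A)
    g = gray(start)
    # Initial v for this batch: row sums over the columns selected by g.
    # This costs O(n^2), but only once per batch.
    v = [sum(A[i][j] for j in range(n) if (g >> j) & 1) for i in range(n)]
    size = bin(g).count("1")  # number of selected columns
    total = 0
    for k in range(start, stop):
        sign = -1 if (n - size) % 2 else 1
        total += sign * prod(v)
        nxt = k + 1
        if nxt == stop:
            break
        # gray(k) and gray(k + 1) differ in exactly one bit: the lowest set bit of k + 1.
        j = (nxt & -nxt).bit_length() - 1
        if (g >> j) & 1:
            # Column j leaves the subset: O(n) update of v.
            for i in range(n):
                v[i] -= A[i][j]
            size -= 1
        else:
            # Column j enters the subset: O(n) update of v.
            for i in range(n):
                v[i] += A[i][j]
            size += 1
        g ^= 1 << j
    return total


def permanent_gray(A, n_batches=4):
    """Permanent of a square matrix A, with the Gray sequence split into batches."""
    n_masks = 1 << len(A)
    bounds = [n_masks * k // n_batches for k in range(n_batches + 1)]
    with ThreadPoolExecutor() as pool:
        parts = pool.map(
            lambda k: ryser_gray_batch(A, bounds[k], bounds[k + 1]),
            range(n_batches),
        )
        return sum(parts)
```

Whether the per-batch setup is negligible in practice depends on the number of batches relative to 2^n; with a fixed, small number of batches it should be, but this has not been benchmarked here.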