fplll / g6k

The General Sieve Kernel
GNU General Public License v2.0
99 stars 30 forks source link

Switch the sign around on compare_QEntry. #117

Open joerowell opened 1 year ago

joerowell commented 1 year ago

What's this for?

At the moment, the BDGL sieve's compare_QEntry code looks like this:

https://github.com/fplll/g6k/blob/72a5c04998f52885700fbe551f3f11dce2b0c7ef/kernel/bdgl_sieve.cpp#L51

This seems to be the wrong way around: as written, it'll sort the queue into descending order, rather than ascending. For example, cppreference has this to say about the comparator function:

comp -- comparison function object (i.e. an object that satisfies the requirements of Compare) 
which returns ​true if the first argument is less than (i.e. is ordered before) the second.

I think this is the wrong way around for what the BDGL code is supposed to do. Indeed, it looks like the assumption is that the queue is sorted in ascending order. For example, the bdgl_queue_create_task function looks like it wants to prioritise inserting shorter vectors, rather than longer ones.

This PR also seems to improve performance (admittedly in low dimensions) compared to the old version. For example, in dimension 90 I get an average reduction in wall time of around 15%, and the same is true for the CPU time. Whilst this might not hold in larger dimensions, it still seems to be a worthwhile change.

New timings:

root: 'threads': 20, 'default_sieve': bdgl,              :: n: 90, cputime 986.2254s, walltime: 55.9656s, |db|: 2^20.35
root: 'threads': 20, 'default_sieve': bdgl,              :: n: 90, cputime 980.4346s, walltime: 55.3039s, |db|: 2^20.35
root: 'threads': 20, 'default_sieve': bdgl,              :: n: 90, cputime 1017.6930s, walltime: 59.3404s, |db|: 2^20.35

Old timings:

root: 'threads': 20, 'default_sieve': bdgl,              :: n: 90, cputime 1149.4413s, walltime: 65.5039s, |db|: 2^20.35
root: 'threads': 20, 'default_sieve': bdgl,              :: n: 90, cputime 1120.3177s, walltime: 64.6482s, |db|: 2^20.35
root: 'threads': 20, 'default_sieve': bdgl,              :: n: 90, cputime 1236.4488s, walltime: 70.0644s, |db|: 2^20.35

Of course, all of this assumes that my interpretation of what the code should do is correct.

malb commented 1 year ago

FWIW: I'm the process of "fixing" the main branch so that PR failures are meaningful again.

malb commented 1 year ago

If you rebase on master CI should be meaningful again

WvanWoerden commented 1 year ago

The change looks good to me (maybe after some more benchmarks).

To give some context: the descending sort was introduced on purpose (but is certainly suboptimal). The thought was to insert as many new vectors as possible, preferably all, to keep the database fresh and reduce collisions. So we start inserting the longest vectors from the queue at the back of the database and work our way forward with shorter vectors, assuming that these vectors are still short enough to replace the vectors earlier in the database.

Swapping the sort around will insert less new vectors, but will keep the best ones from the database and the queue. With an optimal lenbound both strategies should essentially be the same (inserting all queue vectors), but the latter is probably the better strategy once the lenbound deviates from optimal. We might want to run a few more benchmarks to confirm this in higher dimensions.

joerowell commented 1 year ago

Thanks @WvanWoerden ! I'll run some more benchmarks (say, up to dimension 110 or so) and see if the pattern still holds. And thanks for the explanation: I agree that makes sense.

lducas commented 9 months ago

@joerowell What is the status here ?

joerowell commented 9 months ago

I completely forgot about this: I'll upload the benchmark results this afternoon.

On Sat, 2 Dec 2023, 06:14 Léo Ducas, @.***> wrote:

@joerowell https://github.com/joerowell What is the status here ?

— Reply to this email directly, view it on GitHub https://github.com/fplll/g6k/pull/117#issuecomment-1837056607, or unsubscribe https://github.com/notifications/unsubscribe-auth/AF7CASE5W4QLSOMXJWT2FPTYHLBKXAVCNFSM6AAAAAA3LQO4SGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQMZXGA2TMNRQG4 . You are receiving this because you were mentioned.Message ID: @.***>