yosupo06 / library-checker-problems

The problem data (Test case generator, judge's solution, task, ...) of Library Checker
https://judge.yosupo.jp/
Apache License 2.0
533 stars 123 forks source link

[Problem proposal] Integer Sort #1280

Open adamant-pwn opened 2 weeks ago

adamant-pwn commented 2 weeks ago

Problem name: Integer Sort (Optional) Problem ID: sort_integers

Problem

Given $n$ integers $A_1,\dots,A_n$, sort them.

Constraint

Solution / Reference

(Optional) Input

n
A1 ... An

(Optional) Output

B1 ... Bn

where $B_1 \leq \dots \leq B_n$.

(Optional) Note / Disucussion

adamant-pwn commented 2 weeks ago

For generic comparison-based sort, we also already have Sort Points by Argument, but the constraints seem not large enough for it to be a nice sorting benchmark :thinking:

adamant-pwn commented 2 weeks ago

I also wonder whether it's best to just give one array, or a lot of arrays and guarantee their total size is up to $10^7$? So that both the case of one large array, and a lot of small arrays are covered.

EarthMessenger commented 2 weeks ago

In my benchmarks, a small constant radix sort can sort a random array of length $10^7$ in just 0.08s. However, when I add input and output (using library_checker's fastio), the total time increases to 0.27s. This shows that reading n integers becomes a bottleneck.

Theoretically, The complexity of reading $N$ integers is $O(n \log V)$.

Code (the radix sort implementation is adapted from platelet's code):

#include <algorithm>
#include <boost/timer/timer.hpp> // for accurate cpu time
#include <iostream>

#include "fastio.hpp" // library checker's 

constexpr int N = 10'000'000;

unsigned int a[N], b[N];

void sort(unsigned int* a, int __n) {
    unsigned int n = __n;
    unsigned int buc[4][256] = {};
    for (unsigned int i = 0; i < n; i++) {
        buc[0][a[i] & 255]++;
        buc[1][a[i] >> 8 & 255]++;
        buc[2][a[i] >> 16 & 255]++;
        buc[3][a[i] >> 24]++;
    }
    for (unsigned int k = 0; k < 4; k++) {
        unsigned int offset = 0;
        for (unsigned int i = 0; i < 256; i++)
            std::swap(buc[k][i], offset), offset += buc[k][i];
    }
    for (unsigned int i = 0; i < n; i++) b[buc[0][a[i] & 255]++] = a[i];
    for (unsigned int i = 0; i < n; i++) a[buc[1][b[i] >> 8 & 255]++] = b[i];
    for (unsigned int i = 0; i < n; i++) b[buc[2][a[i] >> 16 & 255]++] = a[i];
    for (unsigned int i = 0; i < n; i++) a[buc[3][b[i] >> 24 & 255]++] = b[i];
}

int main()
{
  library_checker::Scanner sc(stdin);
  library_checker::Printer pr(stdout);

  int n = 0;
  sc.read(n);

  for (int i = 0; i < n; i++) sc.read(a[i]);

  {
    boost::timer::auto_cpu_timer t(std::cerr);
    sort(a, n);
  }

  for (int i = 0; i < n; i++) {
    if (i) pr.write(' ');
    pr.write(a[i]);
  }
  pr.writeln();

  return 0;
}

result:

/tmp/tmp.XN9SL063Zy via 🐍 v3.12.7 
❯ python gen.py > in.txt 

/tmp/tmp.XN9SL063Zy via 🐍 v3.12.7 took 4s 
❯ ls -lh in.txt 
-rw-r--r-- 1 emsger emsger 103M 11月 15 16:06 in.txt

/tmp/tmp.XN9SL063Zy via 🐍 v3.12.7 
❯ time ./test < in.txt > /dev/null 
 0.090000s wall, 0.080000s user + 0.000000s system = 0.080000s CPU (88.9%)

real    0m0.313s
user    0m0.269s
sys     0m0.042s
adamant-pwn commented 2 weeks ago

Good point. Might be worthwhile to give input as a generator, similar to KSMALL? But then it might be difficult to add special cases, like breaking certain quicksorts, or giving already sorted array...

maspypy commented 1 week ago

Regarding output, if necessary, it is possible to output an appropriate hash. As for input, aside from randomly generated methods, it is unavoidable that near-fastest submissions will require handling of input using SIMD.

As an alternative, we can offer C++ (Function) style submissions (e.g., https://judge.yosupo.jp/problem/convolution_mod). In this case, it is possible to compare execution times excluding input and output among submissions of this language.