davidsd / sdpb

A semidefinite program solver for the conformal bootstrap.
MIT License

pmp2sdp fails for 2 input files with objectives when running on 1 core #251

Closed alepiazza closed 2 months ago

alepiazza commented 2 months ago

I am experiencing errors when using pmp2sdp with multiple input files (specified in the .nsv format) while running on a single core. I am using version 3.0.0 in Docker.

For example, I split the PMP in mathematica/Test.m into two files as follows:

testSDPMatrixMulti[xmlFile1_,xmlFile2_] := Module[
    {
        pols = {
            PositiveMatrixWithPrefactor[
                DampedRational[1, {}, 1/E, x],
                {{{1 + x^4, 1 + x^4/12 + x^2}, {x^2,     x/5}},
                 {{x^2,     x/5},              {2 + x^4, x^4/3 + 2*x^2}}}],
            PositiveMatrixWithPrefactor[
                DampedRational[1, {}, 1/E, x],
                {{{1 + 3x^4/4, 1 + x^4/12 + x^2}, {x^2,     1/2 + x/5}},
                 {{x^2,     1/2 + x/5},        {2 + 3x^4/5, x^4/3 + 2*x^2}}}]
        },
        norm = {1, 0},
        obj  = {0, -1}
    },

    WritePmpXml[xmlFile1, SDP[obj, norm, pols[[{2}]]]];
    WritePmpXml[xmlFile2, SDP[obj, norm, pols[[{1}]]]];
];
testSDPMatrixMulti["test1.xml","test2.xml"] 

and create the file file_list.nsv containing the above file names separated by a null character.
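For reference, one way to produce such a null-separated file list is with printf, whose `\0` escape expands to a NUL byte (a sketch; the file names match the example above):

```shell
# Write the two file names separated by a NUL byte; '\0' expands to a null character
# and there is no trailing newline.
printf 'test1.xml\0test2.xml' > file_list.nsv

# Optional: inspect the raw bytes to confirm the NUL separator.
od -c file_list.nsv
```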

I then obtain the following error:

~ $ mpirun -n 1 pmp2sdp --precision=200 -v 3 --input file_list.nsv -f json  --output multi.sdp
pmp2sdp --precision=200 -v 3 --input file_list.nsv -f json --output multi.sdp 
start pmp2sdp --- MemTotal: 7.34351 GB
start pmp2sdp --- MemUsed: 4.52727 GB
start pmp2sdp.read_pmp --- MemUsed: 4.52727 GB
start pmp2sdp.read_pmp.parse --- MemUsed: 4.52727 GB
start pmp2sdp.read_pmp.parse.file_1=test1.xml --- MemUsed: 4.52727 GB
start pmp2sdp.read_pmp.parse.file_0=test2.xml --- MemUsed: 4.52727 GB
start pmp2sdp.read_pmp.sync_num_matrices --- MemUsed: 4.52727 GB
Error: in read_polynomial_matrix_program() at ../src/pmp_read/read_polynomial_matrix_program.cxx:193: 
  Assertion 'objective.empty()' failed:
    objective already read from another file: duplicate found at "/root/test1.xml"
Stacktrace:
 0# 0x000055DBB8C9E169 in /usr/local/bin/pmp2sdp
 1# 0x000055DBB8CD1EE3 in /usr/local/bin/pmp2sdp
 2# 0x000055DBB8CB4D9A in /usr/local/bin/pmp2sdp
 3# 0x00007FA002D896D1 in /lib/ld-musl-x86_64.so.1

max MemUsed: 4.52727 GB at "pmp2sdp"
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------

Deleting the <objective>...</objective> section by hand in e.g. test2.xml works around the issue, but this is not the behaviour documented in https://github.com/davidsd/sdpb/blob/master/docs/Usage.md#converting-pmp-to-sdp.

The issue does not occur, however, when running on multiple cores:

~ $ mpirun -n 2 pmp2sdp --precision=200 -v 3 --input file_list.nsv -f json  --output multi.sdp
pmp2sdp --precision=200 -v 3 --input file_list.nsv -f json --output multi.sdp 
start pmp2sdp --- MemTotal: 7.34351 GB
start pmp2sdp --- MemUsed: 4.57627 GB
start pmp2sdp.read_pmp --- MemUsed: 4.57627 GB
start pmp2sdp.read_pmp.parse --- MemUsed: 4.57627 GB
start pmp2sdp.read_pmp.parse.file_1=test1.xml --- MemUsed: 4.57627 GB
start pmp2sdp.read_pmp.sync_num_matrices --- MemUsed: 4.57627 GB
start pmp2sdp.read_pmp.sync_objective_normalization --- MemUsed: 4.57627 GB
start pmp2sdp.convert --- MemUsed: 4.57627 GB
start pmp2sdp.convert.matrices --- MemUsed: 4.57627 GB
start pmp2sdp.write_sdp --- MemUsed: 4.57627 GB
start pmp2sdp.write_sdp.clear_output_paths --- MemUsed: 4.57627 GB
Warning: Output path "multi.sdp" exists and will be overwritten.
start pmp2sdp.write_sdp.clear_output_paths.mpi_barrier --- MemUsed: 4.57627 GB
start pmp2sdp.write_sdp.block_files --- MemUsed: 4.57627 GB
start pmp2sdp.write_sdp.block_files.block_info_1 --- MemUsed: 4.57627 GB
start pmp2sdp.write_sdp.block_files.block_data_1 --- MemUsed: 4.57627 GB
start pmp2sdp.write_sdp.mpi_reduce_block_sizes --- MemUsed: 4.57627 GB
---------------------
Matrix sizes and RAM estimates:
---------------------
BigFloat, bytes: 72
P (primal objective): 30 elements, 2.1 KB (2160 bytes)
N (dual objective): 1 elements, 72 B
B matrix (PxN) - free_var_matrix, schur_off_diagonal: 30 elements, 2.1 KB (2160 bytes)
Q matrix (NxN): 1 elements, 72 B
Bilinear bases: 50 elements, 3.5 KB (3600 bytes)
Bilinear pairing blocks - A_x_inv, A_y: 400 elements, 28.1 KB (28800 bytes)
PSD blocks - X, Y, primal_residues, X_chol, Y_chol, dX, dY, XY, R, Z: 104 elements, 7.3 KB (7488 bytes)
Schur (PxP block diagonal) - schur_complement, schur_complement_cholesky: 450 elements, 31.6 KB (32400 bytes)
Total (without shared windows) = 2#(B) + 10#(PSD) + 2#(S) + 2#(Bilinear pairing) + #(Q): 2697 elements, 189.6 KB (194184 bytes)
NB: in addition to that, SDPB will allocate shared memory windows (for calculating Q), within --maxSharedMemory limit.
---------------------
start pmp2sdp.write_sdp.check_sdp_dir --- MemUsed: 4.57627 GB
start pmp2sdp.write_sdp.check_sdp_dir.file_count --- MemUsed: 4.57627 GB
start pmp2sdp.write_sdp.check_sdp_dir.file_sizes --- MemUsed: 4.57627 GB
Processed 2 SDP blocks in 0.019 seconds, output: multi.sdp
max MemUsed: 4.57627 GB at "pmp2sdp"

I have checked that in this case the .sdp output is identical to the one obtained from a single XML file produced with the original testSDPMatrix function.

Moreover, in this second case the debug output is somewhat misleading: it suggests that only one of the two XML files is parsed, whereas the output .sdp in fact contains the conditions from both files (the files in the .sdp folder are identical to those produced by running pmp2sdp on a single XML file generated with the original testSDPMatrix function).

A similar behaviour seems to occur more generally whenever the number of XML files is strictly greater than the number of cores, but I have not tested this systematically.

vasdommes commented 2 months ago

@alepiazza thanks for reporting! I've fixed the bug in #252, the fix is available in the latest Docker image sdpb:master.

P.S. We now recommend using the JSON format instead of XML (which is kept only for backward compatibility). Replacing WritePmpXml -> WritePmpJson, test1.xml -> test1.json, etc. should be enough to switch to JSON.