HISKP-LQCD / sLapH-projection-NG

2 stars 0 forks source link

Analytic projection for three pions (I = 3) #22

Open martin-ueding opened 5 years ago

martin-ueding commented 5 years ago

For the three pion project we need to have the projections for three pions in I = 3 and also two pions in I = 2. In this issue I will document the progress towards getting them.

There is a second Wolfram Language driver script, some common things are refactored into the library. The job script generator is extended to support multiple channels. Right now the directory structure in the working directory only supports one channel at a time. I think that this is okay, this way one can have multiple working directories and just symlink the configurations into there. For my project I would need two directories per ensemble, which I find okay.

The job script generator also generates a job submission script that is grouped by total momentum and irrep. This way one can just copy a few stanzas and do only a few jobs.

Using this I have started analytic projection jobs with P² = 0 and 1. The results are either that they don't couple (somewhat empty JSON file) or a $Failed. I need to look at this in the Mathematica notebook again to figure it out.

Additional we want other momentum cutoffs here, so I need to expose that in the library and not (only) use the ones hard-coded into the contraction code as a base.

martin-ueding commented 5 years ago

The analytic projection code requests the following correlator:

C4cC_uuuu_p-100.d000.g5_p-100.d000.g5_p000.d000.g5_p200.d000.g5

The momenta are -100, -100, 000, 200; so it fulfills all the rules set out for the rho, but not for the three pion project. There the largest single momentum maxnitude must be 1. We therefore need a different cutoff for this.

For the time being I am going to cut the corner and just replace the previous hard 4 with a hard 1 and re-run the contractions. In the longer run one must be able to provide a cutoff function in the driver scripts which allows for this individually.

martin-ueding commented 5 years ago

I am not sure whether this is properly complete now, but I have the impression that the 2pi I=2 now works: corr_matrix_2pi_I2.pdf

martin-ueding commented 5 years ago

I have just done it for three pions. Two irreps were not finished yet, but a bunch of them were finished. This is what I get: corr_matrix_3pi_I3.pdf

It does looks pretty similar to the rho case, so this might actually be it?

What still irritates me is that the correlator matrix is not symmetric, but it has not been that for Markus's data either. But it should be symmetric, right?

martin-ueding commented 5 years ago

The 000 Eu is now finished as well, it looks like all the other ones:

Bildschirmfoto_002

The numeric projections only take few seconds per irrep and per configuration. Therefore I will change the job script generator such that it does all the irreps for each configuration in one job. For testing it would be nice to still have the possibility to run them individually, but that should be rather simple to do. Grouping by irreps and doing all configs would lead to very unequal loads in the jobs, that does not sound good for scheduling.

martin-ueding commented 5 years ago

I have produced the numerical projections on the whole of A40.24, merged the resulting JSON files and converted it into one handy Rdata file: corr_matrix_3pi_I3_A40.24.zip

It is a tibble (like data frame, just no unasked conversions) with total momentum, irrep and the correlator matrix, which is a two-level list with a matrix containing statistics (rows) and time slices (columns).

One can just take these matrices, create cf_orig objects from them, symmetrize, bootstrap, plot. I have done this ad-hoc for one of them, it looks reasonable at first sight. The correlator comes from the negative and then shows a cosh-like behavior:

Bildschirmfoto_003

I believe that this is a good point to stop further processing them here in this code base and hand over to the Lüscher Analysis code base.

martin-ueding commented 5 years ago

With Fernando we have discussed that we should rather do P² ≤ 4 in contrast to my timid P² ≤ 1 which I have done so far. So I just increased the cutoffs and found that the analytic projections would not finish within three days. Also the contractions ran into the walltime limit while setting up the kinematics! So clearly we have an explosion in the number of momentum combinations.

For the previous cutoff in the P² = 0 frame we only have these sets of relative momenta:

{{{0, 0, 0}, {0, 0, 0}}, {{0, 0, 0}, {0, 0, 1}}, {{0, 0, 1}, {1, 0, 0}}}

So those are three combinations that potentially go into the correlator matrix given that there is coupling. The actual contractions will have lots more momenta because we of all the lattice rotations applied afterwards.

We can compare the number of momentum combinations in the analytic code for the old and new cutoffs:

Old New
0 2 4
1 3 11
2 0 17
3 0 14
4 0 4

These number get multiplied by global rotations of the total momentum, these multipliers m are 0, 6, 12, 8, 6 for the various moving frames. And then we have the sum over all stabilizer elements, which should give factors of 48/m. Either way there will be many correlators to compute.

Unless the Mathematica code gets much faster, we only do the momentum combination set-up once and figure out a way to make the contractions much faster, I do not see a way of doing P² ≤ 4. But I also do not see which subset makes the most sense. For the analytic projection code I do the various moving frames individually. This means that in order to get that down I need to lower the cutoffs a bit. The contractions could be done in a similar way, though that would waste some intermediate results.

Although the contraction code does not support it right now I would think of doing a bit more for the two pion correlation functions (perhaps even same cutoffs as the rho) and then having a slightly extended set for the three pions.

What would be a good thing to try out, @urbach and @kostrzewa?

kostrzewa commented 5 years ago

Also the contractions ran into the walltime limit while setting up the kinematics!

So, for a single three-particle operator, all possible momentum combinations including all p_i^2 <= 4 are around 36k. This should be instantaneous and should only be done once. These easily split into different total momentum configurations which should then be truncated (as of course for p_i^2 <= 4 you will have lots of P^2 = \sum_j [\sum_i p^j_i]^2 >> 4). Finally, the sink and source combination of this final filtered set should again just be an O(100k) problem at most!

Unless the Mathematica code gets much faster

I also don't really understand why this is so slow, although the problem is of course worse than for the contraction code (many irreps)

kostrzewa commented 5 years ago

Also the contractions ran into the walltime limit while setting up the kinematics!

So, for a single three-particle operator, all possible momentum combinations including all p_i^2 <= 4 are around 36k. This should be instantaneous and should only be done once. These easily split into different total momentum configurations which should then be truncated (as of course for p_i^2 <= 4 you will have lots of P^2 = \sum_j [\sum_i p^j_i]^2 >> 4). Finally, the sink and source combination of this final filtered set should again just be an O(100k) problem at most!

Just to add to this. Do I understand correctly that in the run of the contraction code including C6cC with p_i^2 <= x, P^2 <= 4, the 1.5 days after which I killed the job (thinking that it had simply locked up), all this time was spent in init_lookup_tables and most likely in build_quantum_numbers_from_correlator_list ?

I think I'm beginning to understand why this is so slow as it appears to me that the combinatoric problem is solved over and over and over again...

martin-ueding commented 5 years ago

In my analytic projection code I indeed construct and filter the momenta for one operator only and then build the cross product with itself. There is no need for final filtering. I construct all the momenta that give a certain total momentum. And I just use this set on both source and sink. Momentum conservation is automatically conserved.

The job that you killed indeed was stuck doing the kinematic combinatorics.

I have not touched the algorithm of that part in the code. I have refactored it, but given all the checks that skip things, it seems like the outer product of all single particle momenta involved in a diagram and then throwing them out. Just as inefficient as my tr(Q₂ Q₀ Q₂ Q₀ Q₂ Q₀) which did the whole product even if only one of the terms had changed.

For the contraction code I sense that this is a decision point. We could rework the combinatorics to make them more efficient. Alternatively we could just read a list of needed correlators from the analytic projection code. I think that it belongs there because the logic is redundant and I tried to incorporate the cutoffs from this code into the analytic projection code, which feels a bit backwards.

So I would think that the Mathematica code should become faster and the change to the contraction code should just be that it populates the “lookup tables” from the prescriptions.

I also don't really understand why this is so slow, although the problem is of course worse than for the contraction code (many irreps)

I have the hunch that the ReplaceAll takes ages. But it does not get worse because of all the irreps. Each irrep runs in an independent job on a single core. But even with that it takes more than 3 days for a shocking number or irreps. There are some which don't couple and therefore are done in time.

So how much do we feel possible to compute? I would like to avoid spending a lot of effort optimizing some parts of the code only to realize that we cannot afford the needed contractions in the end anyway.

marcuspetschlies commented 5 years ago

Hi Martin, what exactly do you mean by a "shocking number or irreps" ?

martin-ueding commented 5 years ago

With two pions at I = 2, the P² = 0, T1u irrep takes quite a while to process due to the stabilizer group having 48 elements. This means that there are more summands to process and therefore this analytic projection takes the longest. Other irreps are much faster, usually taking half an hour and not days.

But for three pions at I = 3 I see that virtually all interesting irreps run into the walltime limit of three days. If it it were just the P² = 0 irreps I could understand, but seeing that even the higher moving frames fail was shocking to me because I did not except it to become that bad.

marcuspetschlies commented 5 years ago

Okay, so I assume / understood, that the projection for some individual 3pi interpolator itself is fast and not an issue, right? So the formula Lambda ( pi pi pi ) = ... can easily be produced for any choice of momentum config, any irrep, any row ( if non-zero ). What takes time is the translation into the prescription for which data to read for a 2pt. I assume you are eliminating potential redundancies ? I have not done it on paper yet ( I'll do, out of interest ); but e.g. for some pi(p) pi(-p) pi(0) the outcome will be the same, irrespective of which pi carries p or -p ?

marcuspetschlies commented 5 years ago

(dep. on diagram type)

Also rotations/reflection may leave momentum configs unchanged; maybe that could be reduced?

marcuspetschlies commented 5 years ago

Would it be possible to have the output step-by-step ? for given momentum config p_1,2,3 (1) Lambda,r ( 3pi ) operator projection plus reduction of operator ( without redundancies; that would amount to isospin + Bose symmetry ) Would already be interesting to have a listing of how many terms finally in each case. That would be a good point for a cross-check.

And secondly for choice of two mom configs , there will always be the same set of 36 diagrams (2) list of diags plus reduction of diagram list ( again without redundancies )

(In case that already exists, could you point me to it?)

Thanks.