N3PDF / pycompressor

Compression code for PDF replicas.
https://n3pdf.github.io/pycompressor/
GNU General Public License v3.0

GA performing bad #28

Closed Radonirinaunimi closed 3 years ago

Radonirinaunimi commented 3 years ago

The GA (or the way in which the GA has been parametrized) performs badly when the samples are drawn from a larger enhanced set. This was not noticed previously, when the prior contained 75 MC replicas and the GANs generated 25 synthetic replicas. This is a combinatorial issue: the number of possible subsets the GA has to explore grows very quickly with the size of the enhanced set.

As an illustration, I took a prior with 100 MC replicas and generated 900 synthetic ones (so the enhanced set contains 1000 replicas in total). The final ERF values when the compressed set (80 replicas) is drawn from either the prior or the enhanced set are given in the table below:

| Compressed Samples | Prior | Enhanced |
| --- | --- | --- |
| TOT ERF | 0.172 | 0.453 |

The total ERF value from the enhanced should be at most equal to 0.172. This indicates the GA is trapped in some local minimum.
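To give a sense of the scale of the combinatorial issue mentioned above, the following minimal sketch counts how many distinct 80-replica subsets exist for the two pool sizes quoted in this comment (100 and 1000). The helper name `n_subsets` is hypothetical and is not part of pycompressor; the snippet only uses the standard library.

```python
import math


def n_subsets(pool_size: int, compressed_size: int) -> int:
    """Number of distinct ways to pick `compressed_size` replicas out of `pool_size`."""
    return math.comb(pool_size, compressed_size)


# Pool sizes and compressed size as quoted in this issue.
prior = n_subsets(100, 80)
enhanced = n_subsets(1000, 80)

# Print the order of magnitude of each search space.
print(f"prior:    about 10^{math.log10(prior):.0f} candidate subsets")
print(f"enhanced: about 10^{math.log10(enhanced):.0f} candidate subsets")
```

The search space for the enhanced set is many orders of magnitude larger, so a GA with the same number of generations and the same mutation settings samples a far smaller fraction of it.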

scarrazza commented 3 years ago

> The total ERF value from the enhanced should be at most equal to 0.172. This indicates the GA is trapped in some local minimum.

Or that the GAN quality is insufficient, right?

Radonirinaunimi commented 3 years ago

> Or that the GAN quality is insufficient, right?

I don't think this is the case here, since I repeated the exercise with a real fit instead of synthetic replicas and similar problems arise. Also, even if that were the case, the GA should still be able to find the same combination of replicas as the one it selects from the prior, because every subset of the prior is also a subset of the enhanced set.
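The argument that the enhanced set can never do worse than the prior, provided the search is exhaustive, can be illustrated with a toy example. This is only a sketch under stated assumptions: `toy_erf` is a hypothetical stand-in for the real ERF (here simply the distance between the subset mean and the prior mean), the replicas are plain random numbers, and the pools are kept tiny so that a brute-force search is feasible.

```python
from itertools import combinations
import random


def toy_erf(subset, reference_mean):
    """Hypothetical stand-in for the ERF: |mean(subset) - mean(prior)|."""
    return abs(sum(subset) / len(subset) - reference_mean)


def best_subset(pool, size, reference_mean):
    """Exhaustive search over all subsets; only feasible for tiny pools."""
    return min(combinations(pool, size), key=lambda s: toy_erf(s, reference_mean))


rng = random.Random(0)
prior = [rng.gauss(0.0, 1.0) for _ in range(8)]      # toy "MC replicas"
synthetic = [rng.gauss(0.0, 1.0) for _ in range(6)]  # toy "GAN replicas"
enhanced = prior + synthetic                         # the prior is a subset of the enhanced set
reference_mean = sum(prior) / len(prior)

erf_prior = toy_erf(best_subset(prior, 4, reference_mean), reference_mean)
erf_enhanced = toy_erf(best_subset(enhanced, 4, reference_mean), reference_mean)

# Every subset of the prior is also a candidate in the enhanced search,
# so the exhaustive optimum over the enhanced set cannot be worse.
assert erf_enhanced <= erf_prior
```

With an exhaustive search the inequality always holds, regardless of the quality of the synthetic replicas. A heuristic search such as a GA gives no such guarantee, which is consistent with the larger ERF reported for the enhanced set.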

Radonirinaunimi commented 3 years ago

@scarrazza Since I guess that we will be using an N3FIT (or potential candidate NNPDF4.0) fit with 1000 MC replicas for the paper, should I start generating the fit? If so, which runcard should I use? I am currently using the one that Lambri generated, but that is not the one we are going to use, right?

scarlehoff commented 3 years ago

I think it is fine for now to use the Lambri one, partly because Galileo is down... once it is back (and the candidate 4.0 is completed) we can try to run a 1000rep one.

Radonirinaunimi commented 3 years ago

> I think it is fine for now to use the Lambri one, partly because Galileo is down... once it is back (and the candidate 4.0 is completed) we can try to run a 1000rep one.

Thanks! I was hoping that this could be run in the meantime. But you are definitely correct: since Galileo is still down, we have no option other than to wait.

scarlehoff commented 3 years ago

Just came back! I'll put 1000 replicas to run today so they might be done by Monday, since right now there is nothing from NNPDF to run.

Radonirinaunimi commented 3 years ago

> Just came back! I'll put 1000 replicas to run today so they might be done by Monday, since right now there is nothing from NNPDF to run.

Thanks! That would be awesome :100: