tanguy-marchand / wph_quijote

BSD 3-Clause "New" or "Revised" License

Advice for syntheses as in paper? #1

Open neyrinck opened 3 years ago

neyrinck commented 3 years ago

Hello, sorry to send this on the weekend; I am assuming you will look at it during the week if you want! I could use some guidance using the code.

I have run run_syntheses.py, which produced one batch of 2 syntheses. Unfortunately, unless I am missing or misunderstanding something, they do not look nearly as good as the syntheses in the paper. The run created a folder result/firstresult/1/ with some synthesis***.png files, but those look only slightly non-Gaussian, not at all like LSS fields. There are some fields in result/first_result/1/original/original-?.png that do look like LSS fields, but I am guessing those are training data.

I did try to increase the number of samples used for training(?) from 3 to 6, changing the relevant line in run_syntheses.py to

'filepaths': [os.path.join('data', 'quijote_fiducial_log256', str(k), 'df\ z=0.npy') for k in range(6)],

(by default, the last argument is 3), without a substantial improvement. Does that need to go all the way to 30? My GPU ran out of memory at 9.

If you have any advice on how to get fields somewhat like the ones in the paper, e.g. which parameters to set, I would very much appreciate it. Alternatively, it could be that my CUDA environment is inadequately set up. I have attached the output of run_syntheses.py, in case it helps at all.

Thank you!!

output.txt

Ttantto commented 3 years ago

Dear Mark,

Thank you for your message and your interest in the paper and the GitHub repo.

The GitHub repo was configured to perform only 3 gradient-descent steps (the "nb_iter" parameter in run_syntheses.py), which is probably why your syntheses did not converge.

I edited the GitHub repo and set the number of gradient-descent steps to 100, which is what we used in the paper. During the gradient descent, the loss is reported at each step. In your case, the loss went from 1.8207e+14 at the first step to 1.0929e+14 at the last step. For numerical reasons, the loss is multiplied by the "factr" parameter (1e7), which is set in run_syntheses.py.
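For illustration, the settings discussed here look roughly like the following in run_syntheses.py (a simplified sketch: the parameter names are the ones mentioned above, but the exact surrounding structure of the script may differ):

```python
import os

# Simplified sketch of the relevant parameters, not the exact layout of run_syntheses.py.
params = {
    # Original Quijote maps used as training data.
    'filepaths': [os.path.join('data', 'quijote_fiducial_log_256', str(k), 'df_z=0.npy')
                  for k in range(3)],
    'nb_iter': 100,   # number of gradient-descent steps (100 was used for the paper)
    'factr': 1e7,     # numerical scaling factor applied to the loss
}
```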

If you want to increase the number of maps in the training dataset and run into memory issues, you need to increase the number of chunks. Each chunk computes a small subset of the WPH coefficients and their gradients. The larger the number of chunks, the fewer WPH coefficients are computed per chunk. Make sure there is at least one coefficient per chunk, otherwise the program might fail.

When you run run_syntheses.py, the output shows the total number of coefficients and the number of coefficients per chunk. For reference, I ran all the syntheses for the paper with 16 GB of GPU memory.
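To make the chunking trade-off concrete, here is a small illustrative snippet (not code from the repo; nb_coeffs and nb_chunks are placeholder names and example values):

```python
# Illustration of the memory trade-off: more chunks => fewer coefficients per chunk.
nb_coeffs = 1200   # example total number of WPH coefficients (the script prints the actual value)
nb_chunks = 20     # increase this value if the GPU runs out of memory

assert nb_chunks <= nb_coeffs, "Too many chunks: some chunks would contain no coefficient."
coeffs_per_chunk = -(-nb_coeffs // nb_chunks)  # ceiling division

print(f"{nb_coeffs} coefficients in total, at most {coeffs_per_chunk} per chunk")
```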

Please let me know if you are still facing issues making the syntheses.

Best,

neyrinck commented 3 years ago

Thanks, Tanguy! It does work as I expected now; impressive!

I don't think this is really an issue any longer, but I have one question. I'm unsure how the code uses the three "original" images; does it really base the syntheses on only 3 images? If so, even more impressive! If I wanted to increase the number of images it uses for synthesis beyond 3, would I just increase the number 3 in the following line of run_syntheses.py?

'filepaths': [os.path.join('data', 'quijote_fiducial_log_256', str(k), 'df_z=0.npy') for k in range(3)],

Ttantto commented 3 years ago

Dear Mark,

Indeed, in this case the syntheses are based on 3 images. You can increase this number by modifying the line you mentioned in your message.
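For example, to use 6 maps instead of 3 (assuming they are stored under data/quijote_fiducial_log_256/0 through .../5, as in the line you quoted), the line would become:

```python
'filepaths': [os.path.join('data', 'quijote_fiducial_log_256', str(k), 'df_z=0.npy') for k in range(6)],
```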

Please be aware that, due to the way the code is written, if you base your syntheses on k initial maps, it will generate batches of k syntheses. If you want to change this behavior you need to dig into the code a bit.

We took k=30 in the paper. In that case, you need to increase the number of chunks in order to reduce the GPU memory consumption.

Tanguy