IvanNikolic21 closed 6 months ago
This pull request is still ongoing. There are issues with HDF5 saving that I'm currently investigating.
The culprit is definitely Lyman-alpha luminosity!
Another problem occurred: saving cannot be done in parallel to the same file. One possible solution, which might take some coding time, is to go back to separate files and only access the shared file in read mode. I can copy the file under a name that includes the main bubble information, and then add whatever is particular to that file.
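A minimal sketch of the copy step described above, assuming the shared file already exists on disk. The function name `make_worker_copy` and the `bubble_tag` string are hypothetical and not part of the code in this pull request. Opening the copy with an HDF5 library and appending to it is left out.

```python
import shutil
from pathlib import Path


def make_worker_copy(base_file: Path, bubble_tag: str) -> Path:
    """Copy the shared file to a per-worker file named after the main bubble.

    `bubble_tag` is a hypothetical identifier for the main bubble,
    for example a string built from its position and radius.
    """
    target = base_file.with_name(
        f"{base_file.stem}_{bubble_tag}{base_file.suffix}"
    )
    # copyfile only reads base_file, so several workers can do this at once
    shutil.copyfile(base_file, target)
    return target
```

Each worker would then open only its own copy in append mode, so no two processes ever write to the same file.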
I'm still battling with memory issues. I've decided to set some parameters to more manageable values. I'm checking how this will go.
This bug is taking forever to solve, mainly because my solving strategy is completely wrong. I'm waiting for the whole code to finish before I can detect a given bug. This leads to one debug cycle per 6 hours, which is definitely inefficient. I believe I need to simplify the problem, by reducing the number of iterations, the number of galaxies, or something else.
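One way to shorten the debug cycle is to keep a scaled-down copy of the run settings. This is only a sketch: `RunConfig`, its fields, and the default caps are assumptions for illustration, not the parameters actually used in this code.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RunConfig:
    # Hypothetical settings; the real code may name these differently.
    n_iterations: int
    n_galaxies: int


def debug_version(cfg: RunConfig, max_iterations: int = 2,
                  max_galaxies: int = 10) -> RunConfig:
    """Return a copy of cfg capped to a small problem size for quick runs."""
    return replace(
        cfg,
        n_iterations=min(cfg.n_iterations, max_iterations),
        n_galaxies=min(cfg.n_galaxies, max_galaxies),
    )
```

Running the full pipeline on `debug_version(cfg)` first should surface most crashes in minutes rather than hours, at the cost of missing bugs that only appear at full size (memory issues in particular).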
The current strategy is to use only the forward models that are behaving well. Beyond that, I've added more informative print statements that will help me find the culprit.
I should also improve my variable names. For example, 'like_on_flux' is a boolean set to False if the likelihood is not computed on the flux, but in the other case it holds the flux itself.
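A possible way to untangle this is to split the variable into a flag and a separate, optional flux value. The names below (`LikelihoodSettings`, `use_flux_likelihood`, `observed_flux`) are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class LikelihoodSettings:
    # True if the likelihood is computed on the flux (hypothetical name).
    use_flux_likelihood: bool = False
    # The flux values, only needed when use_flux_likelihood is True.
    observed_flux: Optional[Sequence[float]] = None

    def __post_init__(self) -> None:
        # Catch the inconsistent combination early instead of deep in the run.
        if self.use_flux_likelihood and self.observed_flux is None:
            raise ValueError("observed_flux is required when use_flux_likelihood is True")
```

With this split, a check like `if settings.use_flux_likelihood:` always tests a boolean, and the flux is never mistaken for a flag.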
Memory usage is fine now. I still need to run all of the samples, but the code is no longer at fault, so I can merge this pull request.
More details about the commit: there was a problem with memory usage which ended one of the jobs. The main culprit is (I believe) the memory usage of the container, so instead of filling the container with everything, I decided to save what needs to be saved in the cache beforehand and to save only the final products afterwards.
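The pattern can be sketched roughly as follows, under the assumption that each sample produces a large intermediate result and a small final product. The names `run_samples`, `forward_model`, and `summarise` are placeholders, and `pickle` stands in for whatever cache format the code actually uses.

```python
import pickle
from pathlib import Path


def run_samples(samples, forward_model, summarise, cache_dir: Path):
    """Write large intermediates to disk; keep only final products in memory."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    final_products = []
    for i, sample in enumerate(samples):
        intermediate = forward_model(sample)
        # Save the large result to the cache before anything else.
        with open(cache_dir / f"sample_{i}.pkl", "wb") as fh:
            pickle.dump(intermediate, fh)
        # Only the small final product is kept in the in-memory container.
        final_products.append(summarise(intermediate))
    return final_products
```

Because `intermediate` is rebound on every iteration, at most one large result is held in memory at a time, while the container grows only by the size of the final products.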