21cmfast / 21cmFAST

Official repository for 21cmFAST: a code for generating fast simulations of the cosmological 21cm signal
MIT License

[Feature Req.] Non-cubic coeval boxes #275

Closed: steven-murray closed this issue 1 year ago

steven-murray commented 2 years ago

Is your feature request related to a problem? Please describe.

When making lightcones, almost always the resulting lightcone is longer along the LoS than the other two dimensions. This results in having to re-use the (evolved) coeval box multiple times, which then has periodicity and screws up some statistics. It would be good to try to avoid these spurious statistics as much as possible, or at least push them to larger and larger scales.

Describe the solution you'd like

I don't know how possible/feasible this is, but it might be useful to allow the coeval boxes to be non-cubic. That way, they could be longer along one axis (the LoS), which should give more smoothly evolving lightcones.

My worry is that the speed of FFTs might be decreased for non-cubic boxes. My hope is that even so, doing N*N*4N might still be faster than 4N*4N*4N.
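As a rough back-of-the-envelope check of that hope, the sketch below compares the textbook FFT cost scaling, M log2(M) for M total cells, for an N*N*4N box and a 4N*4N*4N box. The helper name `fft_cost` and the value N = 128 are illustrative assumptions; real timings depend on the FFT library, threading and memory bandwidth, so this is only an order-of-magnitude guide.

```python
import math

def fft_cost(shape):
    """Textbook FFT cost estimate: M * log2(M), with M the total number of cells."""
    n_cells = math.prod(shape)
    return n_cells * math.log2(n_cells)

N = 128  # illustrative transverse dimension (assumption, not from the issue)
elongated = fft_cost((N, N, 4 * N))
full_cube = fft_cost((4 * N, 4 * N, 4 * N))
ratio = elongated / full_cube
print(f"N*N*4N / 4N*4N*4N cost ratio: {ratio:.3f}")
```

Under this scaling the elongated box has 16 times fewer cells, so it stays far cheaper than the full cube even if non-cubic transforms carry some per-cell overhead.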

Describe alternatives you've considered

As far as I can tell, the only real alternative, if we wish to preserve continuity across the lightcone, is what we are currently doing. We can advise users just to make BOX_LEN larger if they need to, but of course this comes at the cost of either performance or accuracy on small scales.

BradGreig commented 2 years ago

I probably have the basic framework to develop this following my extremely large-volume simulations, since that requires breaking up the FFTs into successive 1D and 2D transforms. That is not necessarily how you would tackle this; it is more that the logic involved carries over.
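To illustrate the general idea of splitting a 3D FFT into successive 2D and 1D transforms, here is a minimal NumPy sketch on a small non-cubic array. It is not the large-volume implementation mentioned above; the array shape and variable names are arbitrary assumptions chosen for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Small non-cubic test field: two transverse axes and a longer LoS axis (sizes are arbitrary).
field = rng.standard_normal((8, 8, 32))

# Step 1: 2D transform over the two transverse axes.
partial = np.fft.fft2(field, axes=(0, 1))
# Step 2: 1D transform along the LoS axis.
split = np.fft.fft(partial, axis=2)

# Reference: a single 3D transform over all axes.
direct = np.fft.fftn(field)

print(np.allclose(split, direct))
```

Because the multi-dimensional DFT is separable, the two routes agree to floating-point precision regardless of whether the axes have equal lengths.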

However, I am not sure how useful this might be. I guess it depends on the specific question at hand. I can see cases where this might be useful, and certainly cases where this isn't helpful.

My initial thought would be that the FFTs will likely be slower (but not too bad). However, in most cases the FFTs are not the dominant time sink anyway.

I guess at some level if this is done, we are trusting the user to analyse the statistics of the signal properly. For example, one needs to be careful when spherically averaging if the LoS dimension is larger than the transverse dimensions.

BradGreig commented 2 years ago

Hey @steven-murray, I had a little bit more of a think about this.

Firstly, I'm pretty sure this is fairly trivial to implement. Though you'd need to come up with a clean way to minimise the number of variables visible to the user (otherwise you have two DIMs, HII_DIMs etc.).

However, I'm not sure doing this actually makes sense, mostly because the statistics of the fields will not be correct. Extending the line-of-sight direction means adding larger-scale modes along the line-of-sight direction, but those scales will be absent from the transverse directions. In effect this means that the fields can only be used on scales where all modes are present, that is, up to the smallest dimension (where the 3D modes are correct). Thus you would be extending along the line-of-sight, but the extra scales cannot really be used for anything anyway.
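The mode-counting argument can be made concrete with a small NumPy sketch. It builds the wavenumber grid of a periodic box that is four times longer along the LoS and lists the non-zero modes that lie below the transverse fundamental mode. The box sizes, cell counts and the helper `k_axes` are illustrative assumptions, not 21cmFAST code.

```python
import numpy as np

def k_axes(n_cells, box_len):
    """Angular wavenumbers sampled along each axis of a periodic box."""
    return [2 * np.pi * np.fft.fftfreq(n, d=length / n)
            for n, length in zip(n_cells, box_len)]

# Illustrative box: 100 length units transverse, 400 along the LoS, same cell size.
n_cells = (50, 50, 200)
box_len = (100.0, 100.0, 400.0)

kx, ky, kz = k_axes(n_cells, box_len)
k_fund = [2 * np.pi / length for length in box_len]  # fundamental mode per axis

# |k| on the full 3D grid.
kmag = np.sqrt(kx[:, None, None] ** 2 + ky[None, :, None] ** 2 + kz[None, None, :] ** 2)

# Non-zero modes strictly below the transverse fundamental (small tolerance for rounding).
below = (kmag > 0) & (kmag < k_fund[0] * (1 - 1e-9))
ix, iy, iz = np.nonzero(below)
n_below = int(below.sum())
only_los = bool(np.all((ix == 0) & (iy == 0)))
print("modes below transverse fundamental:", n_below)
print("all purely line-of-sight:", only_los)
```

Every mode on those extra large scales has zero transverse wavenumber, which is the sense in which the added scales are sampled along one direction only.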

Does that make sense?

steven-murray commented 2 years ago

Thanks @BradGreig, yes I get your point. However, I think there would be some good use-cases. E.g. when plotting xHI as a function of redshift (or the global Tb(z)), you get spurious patterns related to the periodic boxes. Technically, this just means that you shouldn't use scales beyond the transverse box size, i.e. don't explicitly trust the comparison of Tb(z) and Tb(z+boxsize). However, in this case it is very tempting to trust these scales. You could also smooth the function, which can help, but it's unclear exactly what this does to the statistics.
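A toy illustration of why Tb(z) and Tb(z+boxsize) should not be compared: the sketch below tiles a random 1D signal, standing in for a non-evolving box re-used along the LoS, and measures its correlation at a lag of one box length against an independent signal of the full length. Everything here (white-noise signal, sizes, the helper `lag_correlation`) is an assumption for illustration; a real lightcone also evolves with redshift, which weakens but does not remove the effect.

```python
import numpy as np

rng = np.random.default_rng(1)
n_box = 64       # cells along the LoS in one box (illustrative)
n_repeat = 4     # how many times the box is re-used

# Toy, non-evolving signal along the LoS.
box = rng.standard_normal(n_box)
tiled = np.tile(box, n_repeat)                      # periodic re-use of one box
elongated = rng.standard_normal(n_box * n_repeat)   # independent draw of the full length

def lag_correlation(signal, lag):
    """Pearson correlation between signal(x) and signal(x + lag)."""
    return np.corrcoef(signal[:-lag], signal[lag:])[0, 1]

corr_tiled = lag_correlation(tiled, n_box)
corr_elongated = lag_correlation(elongated, n_box)
print(f"tiled: {corr_tiled:.2f}, elongated: {corr_elongated:.2f}")
```

The tiled signal is perfectly correlated with itself one box length away, whereas the independent draw shows no such built-in structure.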

In the same way, when looking at e.g. wavelets, you can say "just trust scales smaller than the box size", but it's not always clear how to separate these scales properly from the other scales in this case.

I guess the point is, it makes sense to always use cubic boxes for power spectra, but maybe not for other statistics that mix up different scales inherently. Even then, you'd have to be "careful" because of evolution over the line-of-sight, but that should go without saying for these kinds of statistics anyway.

BradGreig commented 2 years ago

Hey @steven-murray, I don't think it ever really makes sense to consider non-cubic boxes (except perhaps for aesthetic reasons). As soon as you consider a non-cubic box, technically speaking its region of validity should only be a cubic box of size equal to the smallest dimension (where all modes have been included). Otherwise its statistics will not behave correctly, because it will have been generated without the correct large-scale modes along at least one of the directions. Thus, you will potentially be biasing your conclusions.

Thus, I would argue that strictly speaking it should always remain cubic, and if mode mixing is a concern on certain scales then unfortunately you'd have to just increase the box size/dimensions.

That being said, perhaps it might be a reasonable approximation to make (provided the longest dimension isn't too large). Say for example, if the line-of-sight direction was only twice that of the transverse direction, then perhaps the statistics aren't too severely affected (that is, may be within the noise) allowing for such an approach to be taken.

I feel like this is something that would need to be explored/tested prior to advocating its usage though. Which may not be the most interesting thing to consider.

In the next week or so I'll create a branch with this functionality (maybe tomorrow as I'll likely be bored of writing), but I'll leave it to you (or someone else) to test its validity. It shouldn't take too much time/effort to add.

steven-murray commented 2 years ago

Thanks @BradGreig. Yes perhaps you're right. I guess we can do some tests where we compare different statistics from different box sizes. It might make a semi-interesting very short paper.

BradGreig commented 2 years ago

Hey @steven-murray, I created the branch non-cubic to look into this. It's a mostly working version, with only a couple of things that are incomplete (irrelevant things for now). Feel free to have a look or play around with it if you choose to. I haven't added any tests or modified existing tests, so it will fail CI. Also, I haven't gone through it in great detail, so it is possible there are some minor bugs etc.

In LightCone_new.pdf I show 3 example light-cones using this branch. The first is a DIM=150, HII_DIM=50, BOX_LEN=100 light-cone with spin temperature and recombinations etc. The second and third use NON_CUBIC_FACTOR = 2 and 4, which scales up the line-of-sight direction (i.e. NON_CUBIC_FACTOR = 2 means DIM=300, HII_DIM=100, BOX_LEN=200 along the line-of-sight direction, while the transverse directions remain at DIM=150, HII_DIM=50, BOX_LEN=100).
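The scaling described above is simple arithmetic, sketched here for the three examples. The helper `los_dimensions` is hypothetical and only mirrors the numbers quoted in this comment; it is not the branch's API, and how the branch handles non-integer factors is not covered.

```python
def los_dimensions(dim, hii_dim, box_len, non_cubic_factor):
    """Scale the line-of-sight values by NON_CUBIC_FACTOR (transverse values are unchanged)."""
    return {
        "DIM": dim * non_cubic_factor,
        "HII_DIM": hii_dim * non_cubic_factor,
        "BOX_LEN": box_len * non_cubic_factor,
    }

# The three example light-cones: factor 1 (cubic), 2 and 4.
for factor in (1, 2, 4):
    los = los_dimensions(150, 50, 100, factor)
    cell_size = los["BOX_LEN"] / los["HII_DIM"]  # stays constant as the box is stretched
    print(factor, los, "cell size:", cell_size)
```

Since the cell count and the box length scale together, the cell size along the line-of-sight matches the transverse one in every case.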

With this, you can clearly see the repeating structures disappearing. No idea how the statistics, or other quantities behave. But, it looks to be reasonable and demonstrates the main point.

steven-murray commented 2 years ago

This is awesome, thanks @BradGreig!