boutproject / boutdata

GNU Lesser General Public License v3.0
0 stars 2 forks source link

Redistribute can make incompatible files if nxpe not provided #102

Open mikekryjak opened 1 year ago

mikekryjak commented 1 year ago

Performing a redistribute without providing an nxpe fails in Hermes-3 with the following message, which took me quite a bit of time to work back from:

Error encountered: Value for option Nd+ cannot be converted to a Field3D

Since not providing a nxpe results in an automatic choice as the below comment, I think the failure is because the algorithm generated nxpe was not the same as what I have in the input file. https://github.com/boutproject/boutdata/blob/908a4c2a80a07d887f78667bb574f9c072c3c2fd/boutdata/restart.py#L559-L563

My suggestion for a fix is to print a message whenever nxpe is not provided:

Warning: nxpe not provided. This may create incompatible restart files if you set nxpe in your input file.

Alternatively we could parse the input file and check whether nxpe is set there. If it is and it is not provided to redistribute, we could raise an Exception. Or we could simply read the nxpe from the input file and pass it to redistribute.

Let me know what you think.

davedavemckay commented 1 year ago

Hi,

I'm also seeing the error Error encountered: Value for option n cannot be converted to a Field3D (but for a different variable) following a restart. However, in my case, nxpe is correctly read from the metadata as 32, and the split is 32 since nype is 1. In my case n has shape (1001,260,1,256) originally and (1,260,1,256) in the restart files.

@mikekryjak did you see any other potential causes of variables being incompatible with Field3D?

Thanks

mikekryjak commented 1 year ago

Hi @davedavemckay, sadly I think my issue was due to nxpe (at least I hope - it works for now!) and I don't have any other ideas. From your message, it seems like you have already checked that the individual restart files are correctly split?

davedavemckay commented 1 year ago

Hi @mikekryjak - no worries. They do seem to correctly split, in that I see 32 .nc files and loading them in with xbout.open_boutdataset produces an xarray object with 32 chunks. Although chunksize=10 and nx=260 seems odd, but I assume chunks vary in size. Thanks for the pointer - I'll keep digging.

mikekryjak commented 1 year ago

@davedavemckay what I was going to do to diagnose my issue further was to open each dataset individually and check its shape, and then cross-check this against what would be expected from the BOUT++ algorithm for all of them.

davedavemckay commented 1 year ago

Good idea - I'm just opening *.nc, so I'm looking at the aggregated shape.

dschwoerer commented 2 months ago

It looks like @davedavemckay's issue is due to the time dimension. If the restart files are really 4D, that sounds like a real bug.

For @mikekryjak issue, I think it would be best if BOUT++ would throw better error messages. Reading an input file to guess nxpe might be tricky, as it may be the old input file, and you want to change nxpe ... In which case, boutdata trying to be smart is not going to help ...

dschwoerer commented 2 months ago

@mikekryjak do you mind checking whether the current next version of BOUT++ still has this unhelpful error message?

Also, why do you provide nxpe? Generally, the choice of BOUT++ should be good for field-aligned grids ...

mikekryjak commented 1 month ago

@dschwoerer sorry I haven't got to this yet, I'm working against a deadline on a non-BOUT++ related paper at the moment. I will get to it.