Shape mismatch when pyrokinetics reads CGYRO output files when some fields haven't been saved

venkiteshayyar commented 1 month ago

We're facing an issue with pyrokinetics' reading of output files for non-linear CGYRO runs due to shape mismatch We have many runs where only the file bin.cgyro.kxky_phi is written, while bin.cgyro.kxky_apar and bin.cgyro.kxky_bpar are not. But the flux file bin.cgyro.ky_flux seems to have the information of all the fields. The mismatch is seen at this line : https://github.com/pyro-kinetics/pyrokinetics/blob/unstable/src/pyrokinetics/gk_code/gk_output.py#L600

data_vars['particle'][0][0],data_vars['particle'][1][0].shape
('field', (2, 8, 330))
dataset_coords['field'][0],dataset_coords['field'][1].shape
('field', (1,))

Hence the failure at the line pyro.load_gk_output() I've looked at the git issue here : https://github.com/pyro-kinetics/pyrokinetics/issues/333 and tried the suggestion : pyro.load_gk_output(load_fields=False) However, that seems to be causing another issue: "No variable named 'phi'. Variables on the dataset include ['particle', 'heat', 'momentum', 'species', 'flux', ..., 'kx', 'ky', 'energy', 'pitch', 'theta']" due to the line https://github.com/pyro-kinetics/pyrokinetics/blob/unstable/src/pyrokinetics/databases/imas.py#L710 This happens at

gkdict = pyro_to_imas_mapping(
            pyro,
            comment=f"Testing IMAS %s"%(gkcode),
            ids=gkdict
        )

For the future, we can insist on saving all the files, but to deal with existing runs one would have to re-run to generate the missing apar and bpar files.

Is there a way so that we can use existing runs with these missing field fields, without having the users to re-run? Perhaps bypassing the comparison ?

bpatel2107 commented 1 month ago

Ah that is a nifty little bug! The issue arises from where pyro determines the coordinate field. Currently it looks at the dimension of the Fields object which in your case only has phi. If the Fields object does not exist then it looks at the Coords object which does have all 3 fields and is determined from the input file.

This is the wrong way around so we should flip the check and use Coords if possible and back up to the Fields object which is probably unlikely to ever happen anyway. Will make a PR to fix this

venkiteshayyar commented 1 month ago

Tested it with the new branch with load_fields=True Again fails due to the line https://github.com/pyro-kinetics/pyrokinetics/blob/unstable/src/pyrokinetics/databases/imas.py#L710 but this time due to apar

"No variable named 'apar'. Variables on the dataset include ['phi', 'particle', 'heat', 'momentum', 'species', ..., 'kx', 'ky', 'energy', 'pitch', 'theta']"
  File "/<path>/anaconda3/envs/mgkdb2/lib/python3.9/site-packages/xarray/core/dataset.py", line 1474, in _construct_dataarray
    variable = self._variables[name]
KeyError: 'apar'

bpatel2107 commented 1 month ago

Okay so it loads in the output data but falls over when actually writing the IMAS file. I guess we can add a try except statement here and add whatever data is there.

I don't think we should just set it to 0 so leaving it empty seems best. Will have a crack tomorrow.

bpatel2107 commented 1 month ago

Added a check for each field and only adds it in if it exists. Can you check now? I was able to write out and read an IDS from a CGYRO run without the apar and bpar binary files so we should be okay now

venkiteshayyar commented 1 month ago

Yes, I can confirm that this works. Thank you !

venkiteshayyar commented 1 month ago

@bpatel2107 : A follow-up on this one. Exploring the IMAS structure for these non-linear runs, I find that the entries in the fields ['non_linear']['fields_4d']['phi_potential_perturbed_norm'] ['non_linear']['fields_4d']['a_field_parallel_perturbed_norm']

are completely Nans!

Could this be because of an issue with pyrokinetics not reading the data correctly ?

bpatel2107 commented 1 month ago

I was able to read in CGYRO data correctly and get the actual data out so I think the NaNs may appear upstream from the IMAS data dictionary generation.

Generally in pyro when something is all NaNs then there has been an issue with the conversion of the data from the codes normalisation to the IMAS normalisation. Can you share the input file for this case?

venkiteshayyar commented 1 month ago

Sure, attaching the files below :

test_IMAS_nonlinear_cgyro.zip

bpatel2107 commented 1 month ago

I can't see any major issues in the input file and using it I can convert to a different normalisation.

Are there any NaNs in the CGYRO output? This can happen if simulation "blows up". You could check this by looking at before you generate the IMAS file.

pyro.gk_output["phi"]

If you are able share the outputs then maybe I can debug further though it is probably quite large?

venkiteshayyar commented 1 month ago

I tried printing out pyro.gk_output['phi'] before the writing, and I find that it is an xarray with nans Looked through the output files, but didn't see any visible nans.

I'm attaching the output files, which are small. Also added most of the binary files that are reasonable (<10MB). The main binary files bin.cgyro.restart* are large (~ 300MB) , so can't attached them.

test_output_files_IMAS_nonlinear_cgyro.zip

bpatel2107 commented 1 month ago

Okay so the issue is due to a format change of the CGYRO output. The out.cgyro.equilibrium file slightly changed. Pyro needs to divide the fields by rho_star which is stored in this file. In the new format it was reading a 0.0 instead of the actual rho_star which led to the NaN.

A quick patch should fix this

ZedThree commented 1 month ago

Do the CGYRO files have version numbers in them?

bpatel2107 commented 4 weeks ago

The git commit is usually written in the beginning of the out.cgyro.run file so we could feasibly read in that information from there.

Maybe we can be smarter about which version is being used.

pyro-kinetics / pyrokinetics

Shape mismatch when pyrokinetics reads CGYRO output files when some fields haven't been saved #378