Not all outputs seem to be making it into the index

MetOffice / CSET

Toolkit for evaluation and investigation of numerical models for weather and climate applications.

Apache License 2.0


Describe the bug

Not all output diagnostics seem to be making it into the index. I suspect this may be an index file locking issue, as the diagnostics all write to the index simultaneously, and maybe switching to an SQLite database will help.

How to reproduce

Steps to reproduce the behaviour:

Have a look in $DATADIR/2024/CSET-mega-run/plots. Observe that ls | grep domain_mean gives a much longer list than cat index.json | jq | grep domain_mean. We are missing about 30 plots!

Expected behaviour

All of the output diagnostics show up in the index.

I think the issue is that the index file is updated concurrently by all of the diagnostics: one task reads and then overwrites it while another is doing the same, clobbering the other's changes.
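To make the suspected failure mode concrete, here is a minimal sketch of a lost update on a shared JSON file. It does not use CSET's actual code: the helper names and the flat `{name: title}` index layout are assumptions for illustration, and the interleaving is forced sequentially rather than left to real concurrency.

```python
import json
import tempfile
from pathlib import Path


def read_index(path):
    """Load the index, or return an empty one if the file is missing."""
    return json.loads(path.read_text()) if path.exists() else {}


def write_index(path, index):
    """Overwrite the whole index file with the given dictionary."""
    path.write_text(json.dumps(index))


def demo_lost_update():
    """Replay the read-modify-write interleaving of two tasks, with no locking."""
    with tempfile.TemporaryDirectory() as tmp:
        index_path = Path(tmp) / "index.json"
        write_index(index_path, {})

        # Both tasks read the index before either has written to it.
        seen_by_a = read_index(index_path)
        seen_by_b = read_index(index_path)

        # Task A adds its entry and writes the file back.
        seen_by_a["plot_a"] = "Plot A"
        write_index(index_path, seen_by_a)

        # Task B adds its entry to its stale copy and overwrites A's write.
        seen_by_b["plot_b"] = "Plot B"
        write_index(index_path, seen_by_b)

        return read_index(index_path)


result = demo_lost_update()
print(result)  # {'plot_b': 'Plot B'}: the entry for plot_a has been lost
```

If the file lock is not honoured, every pair of overlapping writers can drop an entry in this way, which would be consistent with a few dozen plots missing from a large run.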

In theory we avoid this by locking the file when writing to it, but as we are on a network filesystem the locking may be unreliable. SQLite relies on the same kind of file locking, so I'm not sure it would help here.

Probably the longer-term fix for this is to have a single task write the index.json file at the end, though this will mean you can't see plots as they are being made. Maybe we output a mini index file with a unique name for each plot and then combine them, or maybe we just reconstruct the index from the directory names and the meta.json in each plot.
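As a rough sketch of the reconstruction idea, a single task could rebuild the index from the per-plot metadata. The layout assumed here (one subdirectory per plot, each holding a `meta.json` with a JSON object, and an index keyed by directory name) is a guess for illustration and may not match CSET's real index format; the function names are hypothetical too.

```python
import json
from pathlib import Path


def rebuild_index(plots_dir):
    """Build an index dictionary from the meta.json in each plot directory."""
    index = {}
    for meta_path in sorted(Path(plots_dir).glob("*/meta.json")):
        try:
            meta = json.loads(meta_path.read_text())
        except json.JSONDecodeError:
            # Skip metadata that is unreadable, e.g. still being written.
            continue
        # Assumption: the index is keyed by the plot's directory name.
        index[meta_path.parent.name] = meta
    return index


def write_index_atomically(plots_dir, index):
    """Write index.json via a temporary file and a rename.

    Readers then see either the old index or the new one, not a
    partially written file (on filesystems where rename is atomic).
    """
    plots_dir = Path(plots_dir)
    tmp_path = plots_dir / "index.json.tmp"
    tmp_path.write_text(json.dumps(index, indent=2))
    tmp_path.replace(plots_dir / "index.json")
```

Because only one task ever writes index.json, no locking is needed. Running it periodically rather than only at the end would keep partial results visible while plots are still being produced.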
