Graphing and reporting work flows could be improved

HamishBrownPFR commented 1 year ago

Describe the new feature

I have been doing some development on the wheat model, and wheat.apsimx is now a mess of reports spitting out mostly redundant variables and graphs that were time-consuming to set up and are often broken. Here are some ideas on how we could improve things.

There are two types of graph that are most useful, and we may be justified in creating some reporting and graphing objects that serve these two types well but don't provide the full flexibility of the current graphing and reporting objects.

Observed vs Predicted scatter plots for quantifying overall model performance. These are needed for the model performance testing system and are very informative about the overall accuracy of a model. Currently we have to make sure we are reporting all the variables we want to compare to observations and then make each graph. We tend to report much more than what is being compared to observations. I suggest we make a Performance Stats report object. This would have reporting functionality built into the background that pools the observed report files and creates a simulation report containing all the variables that have a matching observation, but only for simulations and dates that have observed values. This would reduce the amount of redundant reporting on the build. In the .apsimx file this object could have a view object where it creates the typical obs vs pred graph for all the simulations in scope, with a drop-down to choose which variable to display. In the build we would render a graph for each variable as we currently do.
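To make the pooling idea concrete, here is a minimal Python sketch of the matching step. It is not ApsimX code: the row layout, the function names and the variable names (`Biomass`, `LAI`, `Height`) are all assumptions made for illustration. It keeps a predicted value only where an observation exists for the same simulation, date and variable.

```python
from datetime import date

# Columns that identify a row rather than hold a measured variable (assumed layout).
KEY_COLUMNS = ("simulation", "date")


def observed_variables(observed_rows):
    """Return the sorted names of variables that have at least one observation."""
    names = set()
    for row in observed_rows:
        names.update(k for k, v in row.items() if k not in KEY_COLUMNS and v is not None)
    return sorted(names)


def pair_observed_predicted(observed_rows, predicted_rows):
    """Keep predictions only where an observation exists for the same
    simulation, date and variable, and return them as obs/pred pairs."""
    predicted_by_key = {(r["simulation"], r["date"]): r for r in predicted_rows}
    pairs = []
    for obs in observed_rows:
        pred = predicted_by_key.get((obs["simulation"], obs["date"]))
        if pred is None:
            continue  # nothing was simulated for this observation
        for name, value in obs.items():
            if name in KEY_COLUMNS or value is None or name not in pred:
                continue
            pairs.append({
                "simulation": obs["simulation"],
                "date": obs["date"],
                "variable": name,
                "observed": value,
                "predicted": pred[name],
            })
    return pairs


# Hypothetical data: two observation dates for SimA, daily output for SimA and SimB.
observed = [
    {"simulation": "SimA", "date": date(2020, 11, 1), "Biomass": 850.0, "LAI": 3.1},
    {"simulation": "SimA", "date": date(2020, 12, 15), "Biomass": 1400.0, "LAI": None},
]
predicted = [
    {"simulation": "SimA", "date": date(2020, 11, 1), "Biomass": 900.0, "LAI": 2.8, "Height": 400.0},
    {"simulation": "SimA", "date": date(2020, 11, 2), "Biomass": 915.0, "LAI": 2.9, "Height": 410.0},
    {"simulation": "SimA", "date": date(2020, 12, 15), "Biomass": 1350.0, "LAI": 4.0, "Height": 900.0},
    {"simulation": "SimB", "date": date(2020, 11, 1), "Biomass": 700.0, "LAI": 2.0, "Height": 350.0},
]

pairs = pair_observed_predicted(observed, predicted)
for pair in pairs:
    print(pair["simulation"], pair["date"], pair["variable"], pair["observed"], pair["predicted"])
```

In this sketch `Height` is never reported because nothing was observed for it, and SimB drops out entirely because it has no observations. The drop-down in the proposed view would then just filter `pairs` by `variable`.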

The other graph type that I use all the time for testing and debugging is a scatter plot with clock.today as the x axis and one or more other variables as the y axis. These graphs may or may not have observations plotted as well. I think we could make a time course object that serves this type of graph more efficiently by addressing the functionality below.

Adding observations onto graphs is cumbersome but is normally wanted if the observations exist. Can we just build the graphing of observations into the background of the proposed time course graph, so it finds and displays observations if they are present?
The graphing approach gives lots of flexibility, which is good, but in most cases it is not necessary and it means lots of clicking to get a graph working. For the proposed time course graph, we could introduce some defaults to speed things up. In most cases, if we draw a scatter plot we want lines for simulated values and symbols for observations. This could be coded as the default for this graph type. Most of the time we don't care which factor is used to distinguish color and which is used to distinguish line or symbol type. Could we have it so that, by default, the top-level factor sets color and the second-level factor sets marker or line type?
The filter expression is powerful but often trips you up when you copy a graph from one experiment to another. Could we put the filter expression on a separate object and have the time course graph apply a filter expression if it finds one in scope? That way we could copy a folder of graphs and change the filter expression just once to get all the graphs working in the new location.
When debugging we tend to add a great stack of outputs to reports that we may or may not need, and end up reporting lots of stuff that we never look at. Could we turn it around by having a report object that pools all the time course graphs in the .apsimx file and reports only the variables they need, and only for the simulations that need them? I imagine the time course graph object would require clock.today to be reported by default and would integrate the current functionality in the report view, where we can use intellisense to pick y variables for the graph. We would need to set up the graph and then run the simulations to get lines on it. This would disperse the reporting functionality over many graphs, but it would make it easier to ensure each graph has its intended variables reported and that only the variables needed for making graphs are reported.
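As a rough illustration of two of the points above (pooling the variables that graphs need, and default styling by factor order), here is a small Python sketch. The `TimeCourseGraph` class, both function names and the variable names are hypothetical and are not part of ApsimX. The only thing taken from the proposal is that clock.today is always reported as the x variable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeCourseGraph:
    """Hypothetical description of one time course graph."""
    name: str
    simulations: tuple          # simulations the graph draws
    y_variables: tuple          # variables picked for the y axis
    x_variable: str = "clock.today"  # reported by default, as proposed


def pool_report_variables(graphs):
    """Map each simulation to the sorted list of variables its graphs need."""
    needed = {}
    for graph in graphs:
        for simulation in graph.simulations:
            variables = needed.setdefault(simulation, set())
            variables.add(graph.x_variable)
            variables.update(graph.y_variables)
    return {simulation: sorted(variables) for simulation, variables in needed.items()}


def default_style_roles(factors):
    """Assign the first factor to colour and the second to line or marker type.

    Any further factors are left unassigned in this sketch."""
    roles = ("colour", "line_or_marker_type")
    return dict(zip(roles, factors))


# Hypothetical graphs and variable names.
graphs = [
    TimeCourseGraph("Biomass", ("SimA", "SimB"), ("Wheat.AboveGround.Wt",)),
    TimeCourseGraph("LAI", ("SimA",), ("Wheat.Leaf.LAI",)),
]

pooled = pool_report_variables(graphs)
for simulation, variables in pooled.items():
    print(simulation, variables)

print(default_style_roles(["Cultivar", "Nitrogen", "Water"]))
```

Here SimB would only report clock.today and the biomass variable, because no graph asks for its LAI. A shared filter object found in scope could be applied in the same place where `simulations` is resolved, but that part is not sketched.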

I propose we try something like this in addition to the current reporting/graphing capability, as it would expedite model development and facilitate a tidy-up of the test .apsimx files. We could then look to harmonize it with the existing approach.

jbrider commented 1 year ago

Anything to make reports easier to use would be great - particularly with respect to filters.

I'm not clear on what the scatter plot graph might look like. I suspect it is something similar to the 9 panel graphs we use. Yes, configuring it is awkward, but in most cases it's a one-off process.

jbrider commented 1 year ago

I like the idea of the reporting being more intelligent - but it shouldn't cause the core runtime to be slower, so it should only be added as an option into the process, not the default.

sme016 commented 1 month ago

When there is a need to redraw graphs in other software, it would be very useful if the x-y values of all the observations and predictions in a single graph could also be copied in an Excel-ready format (in addition to the current 'Copy graph to clipboard'), ready to produce a graph in Excel. Each component of the legend could have separate x and y columns. The current alternative is to redo all the filtering and data collation in Excel, for example, before a better graph can be produced, which I find very tedious and time-consuming.
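The layout being asked for could look something like the Python sketch below. It is only an illustration under assumptions: the function name and the sample series are made up, and CSV text stands in for "an Excel format" because Excel can open or paste it directly. Each legend entry gets its own pair of x and y columns, and shorter series are padded with blank cells.

```python
import csv
import io
from itertools import zip_longest


def series_to_csv(series):
    """Write each legend entry as its own pair of x and y columns.

    `series` maps a legend label to a list of (x, y) points."""
    header = []
    columns = []
    for label, points in series.items():
        header += [f"{label} x", f"{label} y"]
        columns.append([x for x, _ in points])
        columns.append([y for _, y in points])

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    # Series can have different lengths, so pad the short ones with blanks.
    for row in zip_longest(*columns, fillvalue=""):
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical graph content: one predicted line and one observed point.
series = {
    "SimA predicted": [(1, 10.0), (2, 12.5), (3, 15.0)],
    "SimA observed": [(2, 12.0)],
}

text = series_to_csv(series)
print(text)
```

Keeping x and y together per legend entry means each series can be selected and plotted in Excel without re-filtering the underlying report data.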

APSIMInitiative / ApsimX

Graphing and reporting work flows could be improved #8413
