Out-of-the box output formatting

hmgaudecker commented 4 years ago

My coauthors just complained that I sent them output tables with 1000+ rows, which made this occur to me. I have not checked whether such functionality already exists. It should be pretty trivial in the end, but it might save users a lot of boilerplate code that does little more than reindexing and merging.

What comes immediately to mind...

Tables with standard errors:

Measurements: Separate by factor, sorted by time period and measurement by default. Columns would be constants, loadings, and standard deviations.
Initial factors: Means in the first column, and sd/correlations in further columns (if using a mixture, we probably also need a more disaggregated version of it).
The rest of the params / category values: probably one table each?
...

Some graphs:

Transition equations at mean factors / a few quantiles
...

Aside:

Would it make sense to split up delta into meas_constant (symmetric to meas_sd) and whatever might be relevant for the anchoring (I do not have that)? It seems to be the last group with a name that you can only understand if you already know the notation.
And did we talk somewhere else about shock_variance -> shock_sd ?

janosg commented 4 years ago

No, we do not have that yet. Generating the individual tables should be really easy. Mostly just Series.unstack(). Then we also need corresponding functions that convert the DataFrames to nice tex tables with significance stars.
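
As a rough illustration of the unstack idea, here is a minimal sketch. The index level names (`factor`, `period`, `measurement`, `category`), the measurement names, the helper `measurement_table`, and all numbers are made up for the example and are not the actual skillmodels params layout.

```python
import pandas as pd

# Hypothetical category labels and index layout (assumptions for this sketch).
CATEGORIES = ["meas_constant", "loading", "meas_sd"]

index = pd.MultiIndex.from_product(
    [["cognitive"], [0, 1], ["math_test", "reading_test"], CATEGORIES],
    names=["factor", "period", "measurement", "category"],
)
# Made-up values, one per index entry (category varies fastest).
params = pd.Series(
    [0.0, 1.0, 0.5, 0.2, 0.8, 0.6, 0.1, 1.1, 0.4, 0.3, 0.9, 0.7],
    index=index,
    name="value",
)


def measurement_table(params, factor, columns=CATEGORIES):
    """Return one row per (period, measurement) and one column per category."""
    # Select one factor, then move the category level into the columns.
    table = params.xs(factor, level="factor").unstack("category")
    # unstack sorts the columns alphabetically, so restore the desired order.
    return table[list(columns)].sort_index()


table = measurement_table(params, "cognitive")
print(table)
```

The result has constants, loadings, and standard deviations as columns, sorted by period and measurement, which is the layout suggested above for the measurement tables.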

For the graphs, you could take a look at SkillModel.visualize_model(). This produces a tex file with many plots that I found helpful for diagnosing model specifications, mainly heatmaps of measurements and coefficients and residual plots of an OLS version of the model. You could also run it on simulated data at estimated parameters.

Yes, shock_variance -> shock_sd is on Mariam's to-do list.

We could definitely rename delta to controls. Control variables are not necessarily related to anchoring and it often makes a lot of sense to include them. Having intercepts and controls together makes the parameter parsing code simpler, but it would be manageable to split them.

hmgaudecker commented 4 years ago

> Generating the individual tables should be really easy. Mostly just Series.unstack().

Yes, it is just annoying to look that up all the time :-)

> Then we also need corresponding functions that convert the DataFrames to nice tex tables with significance stars.

There must be something out there for this; but yes.
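
For reference, a minimal sketch of what such a conversion could look like. The column names `value` and `se`, the helpers `star_format` and `format_table`, the star thresholds, and the numbers are all assumptions for illustration; the p-values use a two-sided normal approximation, which may not be what one wants in every application.

```python
import math

import pandas as pd

# Conventional thresholds (an assumption; adjust as needed).
STAR_LEVELS = ((0.01, "***"), (0.05, "**"), (0.10, "*"))


def star_format(estimate, se, levels=STAR_LEVELS):
    """Format an estimate with stars based on a two-sided normal p-value."""
    z = abs(estimate / se)
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(z / math.sqrt(2))
    stars = next((s for cutoff, s in levels if p_value < cutoff), "")
    return f"{estimate:.3f}{stars}"


def format_table(table):
    """Turn numeric 'value' and 'se' columns into display strings."""
    formatted = pd.DataFrame(index=table.index)
    formatted["value"] = [
        star_format(v, s) for v, s in zip(table["value"], table["se"])
    ]
    formatted["se"] = [f"({s:.3f})" for s in table["se"]]
    return formatted


# Made-up estimates and standard errors.
results = pd.DataFrame(
    {"value": [0.80, 0.15, 0.02], "se": [0.10, 0.07, 0.05]},
    index=["loading", "meas_constant", "control"],
)
formatted = format_table(results)
print(formatted)
```

The formatted DataFrame contains only strings, so it can be passed on to `DataFrame.to_latex()` to get the tabular environment (recent pandas versions may need the optional jinja2 dependency for that call).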

> For the graphs, you could take a look at SkillModel.visualize_model(). This produces a tex file with many plots I found helpful to diagnose model specifications. Mainly heatmaps of measurements and coefficients and residual plots of an OLS version of the model.

Great, it is just running :-)

> You could also run it on simulated data at estimated parameters.

Might be a helpful short-term thing, but in the end I want to visualise the actual results :-)

> Yes, shock_variance -> shock_sd is on Mariam's to-do list.

Cool, thanks.

> We could definitely rename delta to controls. Control variables are not necessarily related to anchoring and it often makes a lot of sense to include them. Having intercepts and controls together makes the parameter parsing code simpler, but it would be manageable to split them.

From a user perspective I would definitely think it is useful to have all things for measurements in a similar structure. E.g., I look at the descriptives for some measure (mean, sd) and I'd immediately see how it is reflected in the model (mean, sd, loading). For dedicated measurements this is obvious; for others it might be useful to allow slicing by measure rather than factor, too, for this very reason.
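
To make the "slice by measure" idea concrete, here is a small sketch. Again, the index level names, the `"-"` placeholder for entries without a factor, the helper `params_for_measure`, and the numbers are hypothetical and only serve the example.

```python
import pandas as pd

# Hypothetical params layout: teacher_rating loads on two factors,
# math_test is a dedicated measurement of one factor.
index = pd.MultiIndex.from_tuples(
    [
        ("loading", 0, "math_test", "cognitive"),
        ("loading", 0, "teacher_rating", "cognitive"),
        ("loading", 0, "teacher_rating", "noncognitive"),
        ("meas_constant", 0, "math_test", "-"),
        ("meas_constant", 0, "teacher_rating", "-"),
        ("meas_sd", 0, "math_test", "-"),
        ("meas_sd", 0, "teacher_rating", "-"),
    ],
    names=["category", "period", "measurement", "factor"],
)
params = pd.Series([1.0, 0.6, 0.4, 0.0, 0.2, 0.5, 0.7], index=index, name="value")


def params_for_measure(params, measurement):
    """Collect every parameter that belongs to one measurement."""
    # xs drops the measurement level and keeps category, period, and factor.
    return params.xs(measurement, level="measurement")


by_measure = params_for_measure(params, "teacher_rating")
print(by_measure)
```

With everything for one measure in a single object, its constant, sd, and all loadings can be put next to the descriptive mean and sd of that measure.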

Why would the parsing be simpler? For the constants, you would even free up the name2 parameter again. For others, nothing changes?

hmgaudecker commented 2 years ago

> For the graphs, you could take a look at SkillModel.visualize_model()

Horribly outdated, I know, but did that stuff disappear completely? Searching the skillmodels codebase for "heat", "visual" etc. does not turn up anything anymore...

hmgaudecker commented 1 week ago

Mostly done, pointless to keep the rest and search in it for leftovers.

OpenSourceEconomics / skillmodels

Out-of-the box output formatting #48