UCL / TLOmodel

Epidemiology modelling framework for the Thanzi la Onse project
https://www.tlomodel.org/
MIT License

Parameter Table #1419

Closed tbhallett closed 2 weeks ago

tbhallett commented 2 months ago

For our overview paper (and also generally, I think), we need to have in one place an assembly of all the parameters that are being used in the model*. This would be a "table" (but see below) of the form:

| Module | Parameter Name | Parameter Description | Value |
| --- | --- | --- | --- |

... where 'Parameter Description' is the description that is given when the Parameter is declared by each module.

Of course, the 'Value' for each parameter can be something simple to put into a table (Types.BOOL, Types.REAL, and even Types.LIST and Types.DICT), but several of these parameters are of type pandas.DataFrame or pandas.Series, and I think these would need to be put into another little table.

For immediate use, ideally this table would be rendered as one single PDF file, wherein sub-tables are linked to by within-document hyperlinks. If that's hard, then any other single-file format (e.g. HTML, Excel, etc.) would work.
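A rough sketch of one way the layout described above could be produced, not a description of any existing tooling: scalar values go straight into the main table, while each non-scalar value gets its own section reached through a within-document link. The function name `render_parameter_table`, the anchor naming scheme, and the input shape are all hypothetical, and plain Python values stand in for the model's parameter types.

```python
from __future__ import annotations

from typing import Any

# Values of these types are short enough to show inline in the main table.
SCALAR_TYPES = (bool, int, float, str)


def render_parameter_table(params: dict[tuple[str, str], tuple[str, Any]]) -> str:
    """Render a Markdown main table plus linked sections for non-scalar values.

    `params` maps (module, parameter name) to (description, value).
    """
    main = [
        "| Module | Parameter Name | Parameter Description | Value |",
        "| --- | --- | --- | --- |",
    ]
    sections = []
    for (module, name), (description, value) in sorted(params.items()):
        if isinstance(value, SCALAR_TYPES):
            cell = str(value)
        else:
            # Non-scalar: link from the main table to a separate section.
            anchor = f"{module}-{name}".lower()
            cell = f"[see sub-table](#{anchor})"
            sections.append(
                f'<a id="{anchor}"></a>\n\n**{module}.{name}**\n\n{value!r}'
            )
        main.append(f"| {module} | {name} | {description} | {cell} |")
    return "\n".join(main) + "\n\n" + "\n\n".join(sections)
```

A Markdown document built this way could then be converted to PDF or HTML by whichever converter is already in use, provided it preserves the anchors.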

I was thinking this might be quite a similar job to what @matt-graham has done already to render the Resource Files on the tlomodel.org website.

(As an aside, and just to link the issues for information: this seems to me slightly related to https://github.com/UCL/TLOmodel/issues/1337, as this would avoid us having data frame parameters with some associated detritus, such as unused columns and arbitrary annotations.)

*The parameter values actually being used in a specific run of the model -- specifically, those that exist at the start of the simulation in this specification of the model.

matt-graham commented 2 months ago

Hi @tbhallett, just wanted to check a few things:

On the last point, in the longer term I guess we might also want to add some automated checks around parameters and do some clean-up of parameter definitions, as there are also quite a few parameters with a specified type_ which does not match their actual type.

tbhallett commented 2 months ago

Is it only the modules returned by tlo.methods.fullmodel.fullmodel we should be listing parameters of?

Yes. (And with the values of those parameters as they stand after being updated in that Scenario file I linked to; basically the edits are done by get_parameters_for_status_quo.)

As dictionary / list / series parameters require quite a wide column to display their value, I'm thinking to have the value of all parameters of non-scalar types defined in separate tables linked to from main table?

That would be fine with me.

As there are around 1600 parameters in total, shall I create per-module tables rather than including module name as a column?

Yes, that sounds good. But I'd like it to be in one (ridiculously long) document, if possible.

Some parameters defined in the Module.PARAMETERS (class) attribute do not appear to have values defined in the Module.parameters (instance) attribute after the read_parameters method is called (for example init_or_higher_bmi_per_higherwealth in Demography). Shall I just omit any such parameters? On the last point, in the longer term I guess we might also want to add some automated checks around parameters and do some clean-up of parameter definitions, as there are also quite a few parameters with a specified type_ which does not match their actual type.

Yes, absolutely. I raised this issue a while back thinking the same (https://github.com/UCL/TLOmodel/issues/109). Wow - really!?! Yes please to omitting them. (And let's raise an issue to get that cleaned up.)
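The two checks discussed above (declared parameters with no loaded value, and values whose type differs from the declared one) could be automated along these lines. This is only a sketch under stated assumptions: `audit_parameters` is a hypothetical helper, the parameter names in the example are invented, and plain dictionaries and Python types stand in for the model's own parameter declarations and type enumeration.

```python
from __future__ import annotations

from typing import Any


def audit_parameters(
    declared: dict[str, type | tuple[type, ...]],
    values: dict[str, Any],
) -> tuple[list[str], list[str]]:
    """Return (missing, mismatched) parameter names.

    `declared` maps each declared name to the Python type(s) expected for it;
    `values` holds what was actually loaded.
    """
    # Declared but never given a value.
    missing = sorted(name for name in declared if name not in values)
    # Given a value, but of a different type from the declaration.
    mismatched = sorted(
        name
        for name, expected in declared.items()
        if name in values and not isinstance(values[name], expected)
    )
    return missing, mismatched


# Hypothetical example data, not taken from the model.
declared = {"prob_a": float, "use_b": bool, "names_c": list}
values = {"prob_a": 0.5, "use_b": "yes"}
missing, mismatched = audit_parameters(declared, values)
```

Here `missing` lists the names to omit from the table, and `mismatched` gives a starting list for the clean-up.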

matt-graham commented 2 months ago

@tbhallett So I have something working for the parameter table but it's going to create a very big file - the Markdown source is 240MB and almost 2 million lines long, so I imagine it will result in a very large PDF document. This is due to quite a few very large dataframes being stored as parameters. I would assume a file of this size will just create problems when trying to distribute it, so I guess we will probably want to shorten it somehow. The most obvious way would be to just remove all dataframe parameters above a certain size (maybe 10 rows?) or to include only a summary representation of them?

tbhallett commented 2 months ago

Yes that sounds reasonable.

It might be that some of these data frames are not really "parameters" in the sense we want to mean them. So, perhaps when we see the report, we could add in some skipping rules for specific parameters.
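As an illustration of the size cut-off and skipping rules discussed above, here is a minimal sketch. It is not the actual implementation: `render_rows` and its arguments are hypothetical, and a list of row dictionaries stands in for a data frame so the example has no dependencies. Tables above the threshold are reduced to a one-line summary, and named parameters can be skipped outright.

```python
from __future__ import annotations

MAX_ROWS = 10  # threshold floated in the thread; an assumption, not a decision


def render_rows(
    name: str,
    rows: list[dict[str, object]],
    max_rows: int = MAX_ROWS,
    skip: frozenset[str] = frozenset(),
) -> str | None:
    """Render a small table in full and a large one as a one-line summary.

    Returns None for parameters named in `skip`.
    """
    if name in skip:
        return None
    if not rows:
        return f"{name}: empty table"
    header = list(rows[0])
    if len(rows) > max_rows:
        return (
            f"{name}: table with {len(rows)} rows "
            f"and {len(header)} columns (omitted)"
        )
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += [
        "| " + " | ".join(str(row[col]) for col in header) + " |"
        for row in rows
    ]
    return "\n".join(lines)
```

Keeping the skip list as an explicit argument means rules for specific parameters can be added after reviewing a first version of the report, without changing the size threshold.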