Pyomo / pyomo

An object-oriented algebraic modeling language in Python for structured optimization problems.
https://www.pyomo.org

Automatic generation of model documentation #752

Closed GiorgioBalestrieri closed 8 months ago

GiorgioBalestrieri commented 5 years ago

(not sure how to flag this as a feature request)

It would be a great, great feature to be able to automatically generate model documentation. I'm thinking of something like GAMS Model2Tex command, possibly through a Sphinx-like tool.

Clearly, the fact that Pyomo models can include arbitrary Python code will make it quite difficult to build a tool that always works, but I think a combination of docstrings, doc fields and user input should work.

I tend to think it would be somewhat easier to generate documentation from a ConcreteModel than an AbstractModel, but I would be happy to be corrected.

If anyone with a better knowledge of the inner workings of Pyomo would mind elaborating on some possible strategies and challenges in doing this, it would be helpful.

A not fully automated, but very effective, way is to include LaTeX formulations in docstrings and then rely on Sphinx to generate the documentation, as done for the (awesome) Calliope project. I'm not sure how this would work for Constraints defined through lambda functions or expressions, though. The mathematical formulation is here, and an example of a docstring is here.

blnicho commented 5 years ago

We do have a prototype for something like this in another project built on top of Pyomo. We may be able to generalize that functionality for pure Pyomo models to support what you're asking for. I don't have an estimate for when that might happen though.

GiorgioBalestrieri commented 5 years ago

Ok, I gave it a try. It's still quite hacky: it basically reads the doc field of every component in a model and builds reStructuredText from that.
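A minimal sketch of that idea, not the linked implementation: it only assumes each component exposes `name` and `doc` attributes (Pyomo components do), and it uses `SimpleNamespace` stand-ins so it runs without Pyomo installed. The function name `components_to_rst` and the demo data are hypothetical.

```python
from types import SimpleNamespace


def components_to_rst(components, title="Model documentation"):
    """Build a reStructuredText page from objects exposing ``name`` and ``doc``."""
    lines = [title, "=" * len(title), ""]
    for comp in components:
        doc = getattr(comp, "doc", None)
        if not doc:
            # Skip undocumented components rather than emitting empty sections.
            continue
        # One RST section per component: name, underline, then the doc text.
        lines += [comp.name, "-" * len(comp.name), "", doc.strip(), ""]
    return "\n".join(lines)


# Stand-ins for Pyomo components (hypothetical data, not a real model).
demo = [
    SimpleNamespace(name="demand", doc="Hourly demand :math:`d_t`, in MW."),
    SimpleNamespace(name="helper", doc=None),
    SimpleNamespace(
        name="balance",
        doc=r"Supply meets demand: :math:`\sum_g p_{g,t} = d_t`.",
    ),
]

rst = components_to_rst(demo)
print(rst)
```

With a real model, the same function should accept the iterator returned by `model.component_objects()`, since it relies on nothing beyond those two attributes.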

See an example here

fleimgruber commented 5 years ago

When I gave the thumbs up, I obviously misread the OP. I thought that since Pyomo Param and Var objects carry their indexing information, it could be extracted automatically and the LaTeX built from it in a generic way. The same goes for the constraints, using the results of Pyomo's internal expression parser. I had a chat with @jsiirola during a conference where he mentioned that this could be pulled off.

@GiorgioBalestrieri I do not mean to hijack your feature request, but I thought the discussion might fit here. Would you be interested in that as well? Your approach is very reasonable if you want maximum flexibility, but it requires separate maintenance of the LaTeX "docstrings".

GiorgioBalestrieri commented 5 years ago

@fleimgruber that sounds absolutely reasonable. I see using the components' docstrings as a step forward compared to maintaining separate documentation, where one still has to write all the LaTeX code in a separate location. I think it makes it much easier to keep the mathematical formulation and the model implementation in sync, and it potentially adds clarity to the model itself by coupling a mathematical formulation to each component.

As a side note, I tried to include information from the object itself such as domain, index, default values (basically most things that are displayed by pprint).

I think it's far from a perfect approach, as it risks cluttering the code a bit and it requires some extra caution in the way the mathematical formulation is written (escaping backslashes etc.). Plus, of course, it still relies on someone manually writing that mathematical formulation.
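To illustrate the backslash caveat in plain Python (the strings below are made-up examples, not taken from any model): a regular string literal treats `\t` as a tab, so a LaTeX command such as `\theta` is silently corrupted unless the string is raw or the backslash is doubled.

```python
# A regular string literal interprets "\t" as a TAB, silently corrupting "\theta".
broken = "Angle :math:`\theta_t`"

# A raw string keeps every backslash, so the LaTeX survives intact.
raw = r"Angle :math:`\theta_t`"

# Doubling the backslash is equivalent, but harder to read.
escaped = "Angle :math:`\\theta_t`"

print(repr(broken))
print(repr(raw))
print(raw == escaped)
```

In practice this means writing the doc field as a raw string, e.g. `doc=r"..."`, whenever it contains LaTeX.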

If this could be generated automatically (without leveraging the docstrings, I mean) it would obviously save a lot of time and reduce the risk of errors, so I would be very interested in that. I'm honestly not an expert in the inner workings of Pyomo, and in particular in how and when expressions are parsed.

I see two main challenges with trying to get this done through the expression parser:

Again, I have a fairly limited understanding of the inner workings of Pyomo, and I would be thrilled to see something more automated becoming available. I'll definitely leave the Issue open for now, and wait for @jsiirola to chime in.

In the meantime, feel free to have a look at how the docstring-based approach works and recommend any changes. It's extremely rudimentary for now and heavily biased towards the way I tend to use Pyomo, so there is certainly much room for improvement.

jsiirola commented 5 years ago

This is an often-requested feature (I was surprised that there wasn't already an issue for it), and something that I think all of us would want (but no one has had time to dig into). I gave a quick look over the docstring approach, and there are some really neat ideas in there. Thank you!

To answer some of @GiorgioBalestrieri's comments:

Now, if you can live with the form of the expression that exists after the rule has fired, then you can convert the expression into a form more amenable to documentation in a relatively straightforward way by walking over the expression tree. The project that @blnicho references has had an initial public release, and you can see their approach here. Basically, they handle it by converting the expression to a sympy expression and then using sympy's LaTeX generator. This works particularly well for that project as it generally does not have expressions with large sums in it.
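As a rough illustration of "walking over the expression tree" to emit LaTeX, here is a self-contained sketch. Note the swap: it walks Python's own `ast` tree for a small arithmetic subset, not Pyomo's expression objects, and it does not use sympy. The class `LatexWalker` and helper `to_latex` are hypothetical names.

```python
import ast


def _paren_if(condition, text):
    """Wrap ``text`` in LaTeX parentheses when ``condition`` holds."""
    return f"\\left({text}\\right)" if condition else text


def _is_sum(node):
    """True for an addition or subtraction node."""
    return isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Sub))


class LatexWalker(ast.NodeVisitor):
    """Translate a small arithmetic subset of Python syntax into LaTeX."""

    def visit_Expression(self, node):
        return self.visit(node.body)

    def visit_Name(self, node):
        return node.id

    def visit_Constant(self, node):
        return str(node.value)

    def visit_BinOp(self, node):
        left, right = self.visit(node.left), self.visit(node.right)
        if isinstance(node.op, ast.Add):
            return f"{left} + {right}"
        if isinstance(node.op, ast.Sub):
            # a - (b - c) must keep its parentheses.
            return f"{left} - {_paren_if(_is_sum(node.right), right)}"
        if isinstance(node.op, ast.Mult):
            left = _paren_if(_is_sum(node.left), left)
            right = _paren_if(_is_sum(node.right), right)
            return f"{left} \\cdot {right}"
        if isinstance(node.op, ast.Div):
            # The fraction bar already groups numerator and denominator.
            return f"\\frac{{{left}}}{{{right}}}"
        if isinstance(node.op, ast.Pow):
            base = _paren_if(isinstance(node.left, ast.BinOp), left)
            return f"{base}^{{{right}}}"
        raise NotImplementedError(type(node.op).__name__)

    def generic_visit(self, node):
        # Anything outside the supported subset fails loudly.
        raise NotImplementedError(type(node).__name__)


def to_latex(source):
    """Parse an arithmetic expression string and return its LaTeX form."""
    return LatexWalker().visit(ast.parse(source, mode="eval"))


print(to_latex("cost * (x + y) - z ** 2 / 4"))
```

A walker over Pyomo expressions would follow the same shape (one handler per node type, children rendered first), but the node classes and traversal API differ, so treat this only as a picture of the technique.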

For more OR-like models (LP/MIP), where large sums are the norm, I think there is an 80+% solution that we could put together. This would rely on undocumented Pyomo features and some pretty fundamental (i.e., low-level) changes to Pyomo to pull off, though. The short "design summary" is:

Now for the problems...

This is where something like the docstring approach will fill in nicely: for anything where the automated approach fails, or generates the "wrong" documentation, the modeler can override it by explicitly providing the documentation through the docstring. I also like the convention of using :math:`x` for providing the "LaTeX name" for the Pyomo object through its doc field (as my models always avoid the use of single-letter variables that papers encourage).
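One way the :math:`x` convention could be consumed, as a sketch only: pull the first math role out of the doc string and fall back to the code identifier when none is given. The "first match wins" rule, the `\mathrm` fallback, and the name `latex_name` are assumptions for illustration, not anything Pyomo or the linked project defines.

```python
import re

# Matches a Sphinx math role such as :math:`d_t` and captures its contents.
MATH_ROLE = re.compile(r":math:`([^`]+)`")


def latex_name(component_name, doc):
    """Return the LaTeX symbol declared in ``doc``, else a typeset fallback."""
    match = MATH_ROLE.search(doc or "")
    if match:
        # Assumption: the first math role in the doc string is the symbol.
        return match.group(1)
    # Fallback: render the code identifier upright, escaping underscores.
    return "\\mathrm{" + component_name.replace("_", "\\_") + "}"


print(latex_name("hourly_demand", "Hourly demand :math:`d_t`, in MW."))
print(latex_name("hourly_demand", None))
```

This keeps descriptive names in the code while letting the generated documentation use the short symbols a paper would.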

GiorgioBalestrieri commented 5 years ago

@jsiirola thanks for all the information. As mentioned above, I think the docstring-based approach is far from perfect, but it offers an incremental improvement over maintaining a model and the related docs in two separate files.

Anything more automated would be awesome, but I don't think I am familiar enough with Pyomo's core or have enough experience with different ways people formulate Pyomo models to come up with something general enough.

If any effort goes in this direction, I'd be glad to test it and provide some feedback.

As a side note, I think that quite often the best way to express the mathematical formulation for a constraint, expression or objective is different from the way one would formulate it in terms of code, so including the formulation in the docstring might improve the clarity of the model documentation.

l-kotzur commented 1 year ago

Would it help to combine the expression template system with latexify?

mrmundt commented 1 year ago

@codykarcher - this issue!

mrmundt commented 8 months ago

There is an initial implementation of a LaTeX printer now in pyomo.contrib. The author has another issue (#3048) tracking requested changes, bugs, etc.