Need to begin integration of lcopt to new implementation of bw2parameters

pjamesjoyce commented 6 years ago

TLDR version - skip to suggested new version section

Lcopt was originally written when bw2parameters was at version 0.5.3 and bw2data was at version 2.4.4.

There have been major changes to bw2parameters and bw2data since then, explained here.

In order to keep lcopt compatible with brightway, the parameter implementation in lcopt needs to be migrated to the new parameter paradigm. This has been raised by the folks over at the bw2 activity browser (LCA-ActivityBrowser/activity-browser/issues/153, @bsteubing, @haasad, @cmutel, @cardosan) and in the freshly minted brightway-dev mailing list.

How it's done now

Lcopt's implicit modelling conventions are that:

all exchanges in an lcopt model are within the model database¹
no exchanges are 'hard coded', every exchange has a parameter, with an id in the format p[from][to].
in the development version the output amount from unit processes is also no longer hard coded to 1 - it is now a parameter in the format p_[from]_production
in addition to the exchange parameters, lcopt has global parameters which can be used in formulas.
exchange parameters can given a value or a formula
in order to run a calculation you pass a parameter set to lcopt which contains values for all of the parameters which aren't calculated by formulas
you can pass lcopt as many parameter sets as you like and it will calculate the LCAs for each of your process options (hence the now rather clunky name...)

The current version borrows some utility functions from (the old version of) bw2parameters in order to precalculate all of the necessary parameters before passing the model into brightway (this used to be the recommendation).

When lcopt sends a model to brightway for calculation it does this:

write the lcopt model into Brightway format with all exchange amounts set to 1, and an extra field parameter_hook in each exchange set to the parameter id associated with that exchange
for each parameter set:
- loop through all of the exchanges in the brightway model and set the amount equal to the value stored at parameter_hook in the parameter set
- run the lca, lci, lcia
- store the results
- next parameter set

How it could be done

Brightway's new parameter system looks like this:

My initial thought is that lcopt's current parameterisation setup is relatively simple and constrained compared to what brightway can do, and as such, only database level parameters within the model database are required:

The exchange and production parameters are uniquely named by lcopt, so may as well be stored at the database level
Non-exchange parameters are all defined globally within the UI == database level

Suggested new version

So the new version would need to look something like this:

When lcopt writes the exchanges to Brightway, it would write the exchange parameters directly into formula rather than adding its own parameter_hook; something equivalent to this:
```
a.new_exchange(amount=0, input=b, type="technosphere", formula="p_b_a").save()
```

parameter sets could then be passed into brightway in the native format, e.g.:

database_data = [{
'name': 'p_b_a',
'formula': '(foo + bar** 2) / 5',
}, 
{
'name': 'foo',
'amount': 12
}{
'name': 'bar',
'amount': 5
}]
parameters.new_database_parameters(database_data, "LCOPT_Project_database", overwrite=True)

we can then calculate the parameter set, run the lca/lci/lcia and store the results as before

@cmutel: do you think this will work? any further suggestions?

also, is there a more efficient way to switch parameter sets within Brightway than the rewrite, recalcute, redo lca/lci/lcia approach?

Footnote

¹ Exchanges with technosphere databases (ecoinvent, FORWAST) and biosphere databases (biosphere3) are designated 'external links' which are only parsed when the database is handed over to brightway. In brightway they look like this: [Actual activity in external database] -- 1 --> [lcopt representation of activity (x)] -- p_x_y --> [lcopt activity y] e.g. ['market for electricity, medium voltage {CH}'] -- 1--> ['Electricity'] -- p_x_y --> ['Thing that needs electricity']

cmutel commented 6 years ago

Thanks for building and continuing to work on this @pjamesjoyce! You are a big reason that I got off my butt and really got parameters integrated and functioning after all this time. Sorry that the current parameter-handling documentation is not very good, I know this is a weakness.

A few short comments:

bw2parameters is still the library handling the evaluation of parameters; it was updated to allow Monte Carlo iteration directly (i.e. instead of evaluating 1000 times, you evaluate once with a (1000,) array. The code in bw2data just handles precedence logic, namespacing, storage, and automatic re-evaluation.
I think that the long-term solution is to have models and model variants expressed in actual code - parsing text strings is always going to be a major source of errors and frustration.
After some reflection, I think that project/database/activity hierarchy of parameters might be a bit too clever. It introduces significant complexity without a corresponding increase in functionality. Working only on the database level seems reasonable.

You are completely correct that using the bw2data machinery to evaluate many model options is not the most efficient. There are two basic options that I can see:

Keep the model evaluation internal to LCOpt, and write the processed arrays directly instead of going through a bw2data.Database (maybe you do this already). This would be very fast, but would make it harder to integrate LCOpt into e.g. the Activity Browser.
Use presamples. This use case is exactly what the presamples package is designed for. The basic idea is that you have a base case database, but then keep each variant (either a different design choice or Monte Carlo iteration) in a way that can be directly used in LCA calculations. The variations are defined and written to disk ahead of time. @PascalLesage and I have worked hard on making this library industrial-strength. The docs are still being worked on, but we can help you apply this if you want. There is some text and notebooks in the docs folder.

PascalLesage commented 6 years ago

I just wanted to say that restricting parameters to the database level is indeed a great solution, and I will use it in my current project. To simulate project-level parameters (which would only be the case when the same parameter values are to be used for parameters found in more than one database), presamples can be used to make sure the same values are used across each instance of the parameter, for every iteration.

pjamesjoyce commented 6 years ago

Thanks both @cmutel and @PascalLesage - your input is much appreciated!

I checked out presamples yesterday and it looks great! I experimented with using it to inject the pre-evaluated exchange parameters directly into the matrices instead of looping through the bw2data.Database and it's lightning fast - 108 LCA/LCI/LCIA calculations in about a second.

(Unfortunately having figured that out it turns out the rate limiting step in lcopts calculations is the recurse_tagged_database step for the tree/bullseye visualisations - but that's another story (#25).)

Current plan is to move the evaluation of the parameters over to bw2parameters, so that parameter related issues can be resolved in one place for lcopt and brightway (happy to contribute if and when they do btw).

I'll tell lcopt to store the parameters as brightway database parameters when it sends to model over to brightway - that way lcopt generated foreground models can be analysed/edited in more detail in brightway with all of the parameters intact in a sensible form - but when lcopt runs its own analyses it'll pre-evaluate, then inject with presamples for speed.

I agree that, in the long-term, specifying calculated parameters in actual code is far superior to parsing text strings. The client side validation in the formula generation bit of lcopt might be able to be adapted to write code instead of well formed strings while still keeping the coding bit away from the user. I'll have a proper think about that at some point...

cmutel commented 6 years ago

I am very impressed (and pleased!) that you were able to get it running so quickly!

I'll tell lcopt to store the parameters as brightway database parameters...

Sounds good to me.

I agree that, in the long-term, specifying calculated parameters...

I hope I didn't sound too negative, I think LCOpt is great, and a good solution for now. I just meant that slowly we should move people towards solutions that are testable, documented, etc.

pjamesjoyce commented 6 years ago

Thanks!

I hope I didn't sound too negative...

Not at all!

pjamesjoyce commented 6 years ago

First steps - subclass bw2parameters.parameter_set.ParameterSet in order to delegate evaluation tasks while leaving the lcopt side the same

pjamesjoyce commented 6 years ago

@cmutel I'm running into a problem getting brightway to calculate exchanges with only Database level parameters

As far as I can tell, in order 'activate' the exchanges and get brightway to recalculate the amount field from the formula you need to use parameters.add_exchanges_to_group, then ActivityParameter.recalculate_exchanges to update the exchanges. Is this correct?

The problem I'm coming across arises when I'm adding multiple activities with formulas containing only database parameters to a group ("all"):

for a in new_db:
    for e in a.exchanges():
        if e.get('formula'):
            bw2.parameters.add_exchanges_to_group("all", a)
            break

Because there are no activity level parameters specified, parameters.add_exchanges_to_group adds one called "__dummy__", then adds the exchanges to the ParameterizedExchange table. This works first time around, but then fails for the second activity because "__dummy__" already exists in the group "all".

I feel like I'm missing something...

@PascalLesage did you come across anything similar in your project?

As a temporary workaround I've edited my local version of bw2data/parameters.py to add a random number to the dummy parameter:

@staticmethod
def insert_dummy(group, activity):
    code, database = activity[1], activity[0]
    if not ActivityParameter.select().where(
        ActivityParameter.group == group,
        ActivityParameter.code == code,
        ActivityParameter.database == database,
    ).count():
        ActivityParameter.create(
            group=group,
            name="__dummy_{}__".format(randint(0,999999999)), #jj
            code=code,
            database=database,
            amount=0
        )

pjamesjoyce commented 6 years ago

First working version of lcopt with bw2parameters taking over the parameter evaluation is done - in the bw2parameters_integration branch for now.

This uses the above hack for dummy parameters for now.

It also turns out it's not possible to use presamples with bw2analyzer.recurse_tagged_database, so the amount field of each exchange is still updated from the pre-evaluated parameter sets, but using the formula key instead of parameter_hook.

@cmutel / @PascalLesage - I don't know if this is a bw2analyzer issue or a presamples issue. My feeling is it's to do with the recalculation of the lcia with a new demand in the recursive function in bw2analyzer, but I can't figure out why.

See this notebook for a demonstration of the problem - presamples tagged graph issue

pjamesjoyce commented 6 years ago

Working version merged into the development branch (#26).

You can now install/update the development version using

conda install -c conda-forge -c cmutel -c haasad -c pjamesjoyce lcopt-dev

#or 

conda update -c conda-forge -c cmutel -c haasad -c pjamesjoyce lcopt-dev

cmutel commented 6 years ago

@pjamesjoyce

Thanks, these are two nice bugs :)

Please report in bw2data and b2analyzer, respectively. I want you to get credit for reporting them!

Instead of adding a random number, I think we can just add the activity code. Groups aren't allowed to cross databases, so the code is guaranteed unique.

pjamesjoyce commented 6 years ago

Thanks @cmutel, sounds like a good solution to me.

I'll report those as issues in bw2data and bw2analyzer

PascalLesage commented 6 years ago

@pjamesjoyce I indeed have the same problem you do with __dummy__ - you are two days ahead of me! I'll use your workaround for now. Thanks!

pjamesjoyce commented 6 years ago

@PascalLesage Chris has fixed the __dummy__ problem in the latest version of bw2data (3.4.2).

Updating bw2data should fix the problem now.

I'll revert my workaround to using bw2data>=3.4.2 in lcopt-dev

pjamesjoyce / lcopt