brightway-lca / brightway2-analyzer

BSD 3-Clause "New" or "Revised" License

Helper for labeled lca.inventory #12

Closed BenPortner closed 2 years ago

BenPortner commented 2 years ago

Calculating life cycle inventories in Brightway is nice and fast, but interpreting the data is hard because lca.inventory contains no metadata about the biosphere and technosphere activities. I added a helper function that reads the metadata of all rows and columns and combines it with the inventory data in a single pandas.DataFrame.
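For illustration, here is a minimal sketch of the idea, not the code from this pull request. It assumes the inventory is a SciPy sparse matrix with biosphere flows as rows and activities as columns. The function name `labeled_inventory` and the label lists are hypothetical; in real use the labels would have to be looked up from the database metadata.

```python
import numpy as np
import pandas as pd
from scipy import sparse


def labeled_inventory(inventory, row_labels, col_labels):
    """Return a dense, labeled DataFrame for a sparse inventory matrix.

    row_labels and col_labels are hypothetical lists of names,
    ordered by matrix row and column index.
    """
    # todense() may return np.matrix; convert to a plain ndarray.
    dense = np.asarray(inventory.todense())
    return pd.DataFrame(dense, index=row_labels, columns=col_labels)


# Toy 2 x 3 inventory: rows are biosphere flows, columns are activities.
inventory = sparse.csr_matrix(np.array([[0.5, 0.0, 2.0],
                                        [0.0, 1.5, 0.0]]))
flows = ["CO2, fossil", "Methane"]
activities = ["steel production", "electricity mix", "transport"]

df = labeled_inventory(inventory, flows, activities)
print(df)
```

With labels attached, a single value can be read by name, for example `df.loc["CO2, fossil", "transport"]`.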

cmutel commented 2 years ago

Thanks @BenPortner.

Could you please give a quick overview of how you see this function versus the existing (broken) to_dataframe method in LCA?

Is there a reason to use lca.inventory.todense() instead of only looking at the non-zero elements? Just simpler to implement?

BenPortner commented 2 years ago

> Could you please give a quick overview of how you see this function versus the existing (broken) to_dataframe method in LCA?

@cmutel Sure! First of all, I wasn't aware of LCA.to_dataframe. Because it is commented out, it did not appear in the function listing of my IDE. I checked it now and I see the following major difference: whereas to_dataframe returns the characterized inventory, my function returns the uncharacterized inventory. I also don't fully understand how to_dataframe works. There is some funky magic going on with x-1 algebra and np.vstack, which does not seem intuitive to me. I feel like my implementation is more intuitive, but obviously I am biased ;) However, I would like to have my function in the bw2calc module itself. It would be nice to have a syntax like LCA.inventory.to_dataframe. The only reason I didn't do this is that I know you are trying to keep the bw modules separate.
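To make the characterized/uncharacterized distinction concrete, here is a small sketch with made-up numbers. It assumes the characterized inventory is the inventory multiplied from the left by a diagonal matrix of characterization factors, one factor per biosphere flow. The factors below are invented for illustration.

```python
import numpy as np
from scipy import sparse

# Uncharacterized inventory: physical amounts per flow (rows) and activity (columns).
inventory = sparse.csr_matrix(np.array([[0.5, 0.0, 2.0],
                                        [0.0, 1.5, 0.0]]))

# Made-up characterization factors, one per biosphere flow (row).
factors = np.array([1.0, 30.0])
characterization = sparse.diags(factors)

# Characterized inventory: each row is scaled by its factor.
characterized = (characterization @ inventory).toarray()

# Summing all characterized entries gives the total impact score.
score = characterized.sum()
print(characterized)
print(score)
```

The uncharacterized matrix keeps the physical amounts, while the characterized one expresses every entry in the unit of the impact category.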

> Is there a reason to use lca.inventory.todense() instead of only looking at the non-zero elements? Just simpler to implement?

Well, the zero elements are necessary because the inventory matrix is in wide format: every flow/activity combination gets a cell, and zero is the default fill value. Alternatively, one could fill the matrix with NaNs. This would save some memory, but it would mean that the matrix contains non-numeric elements. So in the end it's a memory vs. consistency trade-off. I chose consistency here because it makes the resulting DataFrame easier to handle. It also makes the function shorter and easier to read (using NaN instead of zero would require a custom implementation).
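The two options discussed here can be compared on a toy matrix. This is only a sketch under the assumption that the inventory is a SciPy sparse matrix. The labels are invented, and the long-format layout is one possible way of "only looking at the non-zero elements", not something proposed in the thread.

```python
import numpy as np
import pandas as pd
from scipy import sparse

inventory = sparse.csr_matrix(np.array([[0.5, 0.0, 2.0],
                                        [0.0, 1.5, 0.0]]))
flows = np.array(["CO2, fossil", "Methane"])
activities = np.array(["steel production", "electricity mix", "transport"])

# Wide format: one cell per flow/activity pair, zeros included.
wide = pd.DataFrame(inventory.toarray(), index=flows, columns=activities)

# Long format: one row per non-zero element only.
coo = inventory.tocoo()
long = pd.DataFrame({
    "flow": flows[coo.row],
    "activity": activities[coo.col],
    "amount": coo.data,
})

print(wide.size)   # number of cells stored in the wide table
print(len(long))   # number of non-zero entries
```

The wide table grows with rows times columns regardless of sparsity, whereas the long table grows only with the number of non-zero entries.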

cmutel commented 2 years ago

@BenPortner Merged. However, note #13, #14, and #15; we should also support this with an import guard in bw2calc.

I also don't understand the existing code there yet; that's why it is commented out and has not been ported to 2.5 syntax.
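An import guard for an optional dependency is commonly written along these lines. This is a generic sketch, not the code in bw2calc; `optional_import` and `inventory_to_dataframe` are hypothetical names.

```python
import importlib


def optional_import(name):
    """Return the named module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# pandas stays optional: importing this file never fails without it.
pd = optional_import("pandas")


def inventory_to_dataframe(records):
    """Build a DataFrame from records, failing clearly if pandas is missing."""
    if pd is None:
        raise ImportError("pandas is required for DataFrame export")
    return pd.DataFrame(records)
```

The guard defers the failure from import time to the moment the optional feature is actually used, so the rest of the package keeps working without the extra dependency.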