LCA-ActivityBrowser / activity-browser

GUI for brightway2
GNU Lesser General Public License v3.0
134 stars 50 forks source link

Excel export cannot properly handle custom categories for activities #1246

Open l-reitz opened 4 months ago

l-reitz commented 4 months ago

Updating AB

What happened?

Not sure if this is 100% a bug for AB use only.. I use both BW2 and AB. In code I set categories for some activities to access them easily. When I open these activities in AB and try to excel export/import then

This breaks the linking (in the example for the production exchange) when I try to import again - such that I am asked to provide the database that the exchange links to.

This works again if in the excel file I include the categories in the activity description part. Can the 'categories' key be exported as well, if it is set? This could fix it, I think. test_added_category.xlsx

I tried to just use the BW2 excel export, but then the Databases corresponding to the exchanges are not exported, so the export functions seem to be different for BW2 and AB(?)

Thanks for the help!

Relevant errors

No response

Operating system

Windows 10

Conda environment

dependencies:
  - ab-plugin-scenariolink=0.0.6=py_0
  - activity-browser=2.9.7=py_0
  - anyio=4.3.0=pyhd8ed1ab_0
  - appdirs=1.4.4=pyh9f0ad1d_0
  - argon2-cffi=23.1.0=pyhd8ed1ab_0
  - argon2-cffi-bindings=21.2.0=py39ha55989b_4
  - arrow=1.3.0=pyhd8ed1ab_0
  - asteval=0.9.31=pyhd8ed1ab_0
  - asttokens=2.4.1=pyhd8ed1ab_0
  - astunparse=1.6.3=pyhd8ed1ab_0
  - async-lru=2.0.4=pyhd8ed1ab_0
  - attrs=23.2.0=pyh71513ae_0
  - aws-c-auth=0.7.16=hec1de76_6
  - aws-c-cal=0.6.10=hd481e46_1
  - aws-c-common=0.9.13=hcfcfb64_0
  - aws-c-compression=0.2.18=hd481e46_1
  - aws-c-event-stream=0.4.2=h0f06f08_4
  - aws-c-http=0.8.1=hdb5aac5_5
  - aws-c-io=0.14.5=h08270f9_1
  - aws-c-mqtt=0.10.2=hfea8755_4
  - aws-c-s3=0.5.1=h4b2095a_7
  - aws-c-sdkutils=0.1.15=hd481e46_1
  - aws-checksums=0.1.18=hd481e46_1
  - aws-crt-cpp=0.26.2=hd3b692e_6
  - aws-sdk-cpp=1.11.267=h93f5800_1
  - babel=2.14.0=pyhd8ed1ab_0
  - beautifulsoup4=4.12.3=pyha770c72_0
  - bleach=6.1.0=pyhd8ed1ab_0
  - blinker=1.7.0=pyhd8ed1ab_0
  - boto3=1.34.51=pyhd8ed1ab_0
  - botocore=1.34.51=pyge38_1234567_0
  - bottleneck=1.3.8=py39hd88c2e4_0
  - brightway2=2.4.4=pyhd8ed1ab_0
  - brotli=1.1.0=hcfcfb64_1
  - brotli-bin=1.1.0=hcfcfb64_1
  - brotli-python=1.1.0=py39h99910a6_1
  - brotlicffi=1.1.0.0=py39h99910a6_1
  - bw2analyzer=0.10=pyhd8ed1ab_0
  - bw2calc=1.8.2=py39hcbf5309_0
  - bw2data=3.6.6=pyhd8ed1ab_0
  - bw2io=0.8.10=pyhd8ed1ab_0
  - bw2parameters=1.1.0=pyhd8ed1ab_0
  - bw_migrations=0.2=pyhd8ed1ab_0
  - bzip2=1.0.8=hcfcfb64_5
  - c-ares=1.27.0=hcfcfb64_0
  - ca-certificates=2024.2.2=h56e8100_0
  - cached-property=1.5.2=hd8ed1ab_1
  - cached_property=1.5.2=pyha770c72_1
  - cchardet=2.1.7=py39h99910a6_5
  - certifi=2024.2.2=pyhd8ed1ab_0
  - cffi=1.16.0=py39ha55989b_0
  - chardet=5.2.0=py39hcbf5309_1
  - charset-normalizer=3.3.2=pyhd8ed1ab_0
  - click=8.1.7=win_pyh7428d3b_0
  - colorama=0.4.6=pyhd8ed1ab_0
  - comm=0.2.1=pyhd8ed1ab_0
  - constructive_geometries=0.8.2=pyhd8ed1ab_0
  - contextlib2=21.6.0=pyhd8ed1ab_0
  - contourpy=1.2.0=py39h1f6ef14_0
  - country_converter=1.2=pyhd8ed1ab_0
  - cryptography=42.0.5=py39hb6bd5e6_0
  - cycler=0.12.1=pyhd8ed1ab_0
  - datapackage=1.15.2=pyh44b312d_0
  - debugpy=1.8.1=py39h99910a6_0
  - decorator=5.1.1=pyhd8ed1ab_0
  - defusedxml=0.7.1=pyhd8ed1ab_0
  - dill=0.3.8=pyhd8ed1ab_0
  - docopt=0.6.2=py_1
  - eidl=2.0.1=py_0
  - eight=1.0.1=pyhd8ed1ab_4
  - entrypoints=0.4=pyhd8ed1ab_0
  - et_xmlfile=1.1.0=pyhd8ed1ab_0
  - exceptiongroup=1.2.0=pyhd8ed1ab_2
  - executing=2.0.1=pyhd8ed1ab_0
  - fasteners=0.17.3=pyhd8ed1ab_0
  - flask=3.0.2=pyhd8ed1ab_0
  - fonttools=4.49.0=py39ha55989b_0
  - fqdn=1.5.1=pyhd8ed1ab_0
  - freetype=2.12.1=hdaf720e_2
  - future=1.0.0=pyhd8ed1ab_0
  - gettext=0.21.1=h5728263_0
  - glib=2.78.4=h12be248_0
  - glib-tools=2.78.4=h12be248_0
  - greenlet=3.0.3=py39h99910a6_0
  - gst-plugins-base=1.22.9=h001b923_0
  - gstreamer=1.22.9=hb4038d2_0
  - h11=0.14.0=pyhd8ed1ab_0
  - h2=4.1.0=pyhd8ed1ab_0
  - hpack=4.0.0=pyh9f0ad1d_0
  - httpcore=1.0.4=pyhd8ed1ab_0
  - httpx=0.27.0=pyhd8ed1ab_0
  - hyperframe=6.0.1=pyhd8ed1ab_0
  - icu=73.2=h63175ca_0
  - idna=3.6=pyhd8ed1ab_0
  - ijson=3.2.3=pyhd8ed1ab_0
  - importlib-metadata=7.0.1=pyha770c72_0
  - importlib-resources=6.1.2=pyhd8ed1ab_0
  - importlib_metadata=7.0.1=hd8ed1ab_0
  - importlib_resources=6.1.2=pyhd8ed1ab_0
  - inflate64=1.0.0=py39ha55989b_1
  - intel-openmp=2024.0.0=h57928b3_49841
  - ipykernel=6.29.3=pyha63f2e9_0
  - ipython=8.18.1=pyh7428d3b_3
  - isodate=0.6.1=pyhd8ed1ab_0
  - isoduration=20.11.0=pyhd8ed1ab_0
  - itsdangerous=2.1.2=pyhd8ed1ab_0
  - jdcal=1.4.1=py_0
  - jedi=0.19.1=pyhd8ed1ab_0
  - jinja2=3.1.3=pyhd8ed1ab_0
  - jmespath=1.0.1=pyhd8ed1ab_0
  - json5=0.9.17=pyhd8ed1ab_0
  - jsonlines=4.0.0=pyhd8ed1ab_0
  - jsonpointer=2.4=py39hcbf5309_3
  - jsonschema=4.21.1=pyhd8ed1ab_0
  - jsonschema-specifications=2023.12.1=pyhd8ed1ab_0
  - jsonschema-with-format-nongpl=4.21.1=pyhd8ed1ab_0
  - jupyter-lsp=2.2.3=pyhd8ed1ab_0
  - jupyter_client=8.6.0=pyhd8ed1ab_0
  - jupyter_core=5.7.1=py39hcbf5309_0
  - jupyter_events=0.9.0=pyhd8ed1ab_0
  - jupyter_server=2.12.5=pyhd8ed1ab_0
  - jupyter_server_terminals=0.5.2=pyhd8ed1ab_0
  - jupyterlab=4.1.2=pyhd8ed1ab_0
  - jupyterlab_pygments=0.3.0=pyhd8ed1ab_1
  - jupyterlab_server=2.25.3=pyhd8ed1ab_0
  - kiwisolver=1.4.5=py39h1f6ef14_1
  - krb5=1.21.2=heb0366b_0
  - lcms2=2.16=h67d730c_0
  - lerc=4.0.0=h63175ca_0
  - libabseil=20240116.1=cxx17_h63175ca_1
  - libarrow=15.0.0=hd01637b_7_cpu
  - libarrow-acero=15.0.0=h63175ca_7_cpu
  - libarrow-dataset=15.0.0=h63175ca_7_cpu
  - libarrow-flight=15.0.0=hca4e5ea_7_cpu
  - libarrow-flight-sql=15.0.0=h1ef3bed_7_cpu
  - libarrow-gandiva=15.0.0=hc896d4e_7_cpu
  - libarrow-substrait=15.0.0=hf368baa_7_cpu
  - libblas=3.9.0=18_win64_mkl
  - libbrotlicommon=1.1.0=hcfcfb64_1
  - libbrotlidec=1.1.0=hcfcfb64_1
  - libbrotlienc=1.1.0=hcfcfb64_1
  - libcblas=3.9.0=18_win64_mkl
  - libclang=15.0.7=default_h77d9078_3
  - libclang13=15.0.7=default_h77d9078_3
  - libcrc32c=1.1.2=h0e60522_0
  - libcurl=8.5.0=hd5e4a3a_0
  - libdeflate=1.19=hcfcfb64_0
  - libevent=2.1.12=h3671451_1
  - libffi=3.4.2=h8ffe710_5
  - libglib=2.78.4=h16e383f_0
  - libgoogle-cloud=2.21.0=h2b62511_2
  - libgoogle-cloud-storage=2.21.0=hb581fae_2
  - libgrpc=1.61.1=h943f600_0
  - libhwloc=2.9.1=h51c2c0f_0
  - libiconv=1.17=hcfcfb64_2
  - libjpeg-turbo=3.0.0=hcfcfb64_1
  - liblapack=3.9.0=18_win64_mkl
  - libogg=1.3.4=h8ffe710_1
  - libparquet=15.0.0=h7ec3a38_7_cpu
  - libpng=1.6.43=h19919ed_0
  - libprotobuf=4.25.2=h503648d_1
  - libre2-11=2023.09.01=hf8d8778_2
  - libsodium=1.0.18=h8d14728_1
  - libsqlite=3.45.1=hcfcfb64_0
  - libssh2=1.11.0=h7dfc565_0
  - libthrift=0.19.0=ha2b3283_1
  - libtiff=4.6.0=h6e2ebb7_2
  - libutf8proc=2.8.0=h82a8f57_0
  - libvorbis=1.3.7=h0e60522_0
  - libwebp=1.3.2=hcfcfb64_1
  - libwebp-base=1.3.2=hcfcfb64_0
  - libxcb=1.15=hcd874cb_0
  - libxml2=2.10.4=hc3477c8_0
  - libxslt=1.1.37=h0192164_0
  - libzlib=1.2.13=hcfcfb64_5
  - linear-tsv=1.1.0=py_1
  - llvmlite=0.42.0=py39hd28a505_1
  - lxml=4.9.2=py39h0942119_0
  - lz4-c=1.9.4=hcfcfb64_0
  - m2w64-gcc-libgfortran=5.3.0=6
  - m2w64-gcc-libs=5.3.0=7
  - m2w64-gcc-libs-core=5.3.0=7
  - m2w64-gmp=6.1.0=2
  - m2w64-libwinpthread-git=5.0.0.4634.697f757=2
  - markupsafe=2.1.5=py39ha55989b_0
  - matplotlib-base=3.8.3=py39hf19769e_0
  - matplotlib-inline=0.1.6=pyhd8ed1ab_0
  - mistune=3.0.2=pyhd8ed1ab_0
  - mkl=2022.1.0=h6a75c08_874
  - mrio_common_metadata=0.2.1=pyhd8ed1ab_0
  - msys2-conda-epoch=20160418=1
  - multiprocess=0.70.16=py39ha55989b_0
  - multivolumefile=0.2.3=pyhd8ed1ab_0
  - munkres=1.1.4=pyh9f0ad1d_0
  - nbclient=0.8.0=pyhd8ed1ab_0
  - nbconvert-core=7.16.1=pyhd8ed1ab_0
  - nbformat=5.9.2=pyhd8ed1ab_0
  - nest-asyncio=1.6.0=pyhd8ed1ab_0
  - networkx=3.2.1=pyhd8ed1ab_0
  - notebook-shim=0.2.4=pyhd8ed1ab_0
  - numba=0.59.0=py39h6836801_1
  - numpy=1.23.5=py39hbccbffa_0
  - openjpeg=2.5.1=h3d672ee_0
  - openpyxl=3.1.2=py39ha55989b_1
  - openssl=3.2.1=hcfcfb64_0
  - orc=1.9.2=h2702c50_2
  - overrides=7.7.0=pyhd8ed1ab_0
  - packaging=23.2=pyhd8ed1ab_0
  - pandas=2.1.4=py39h32e6231_0
  - pandocfilters=1.5.0=pyhd8ed1ab_0
  - parso=0.8.3=pyhd8ed1ab_0
  - pathos=0.3.2=pyhd8ed1ab_1
  - patsy=0.5.6=pyhd8ed1ab_0
  - pcre2=10.42=h17e33f8_0
  - peewee=3.17.1=py39habf9222_0
  - pickleshare=0.7.5=py_1003
  - pillow=10.2.0=py39h368b509_0
  - pint=0.21=pyhd8ed1ab_0
  - pip=24.0=pyhd8ed1ab_0
  - pkgutil-resolve-name=1.3.10=pyhd8ed1ab_1
  - platformdirs=4.2.0=pyhd8ed1ab_0
  - pox=0.3.4=pyhd8ed1ab_0
  - ppft=1.7.6.8=pyhd8ed1ab_0
  - premise=1.8.1=pyhd8ed1ab_0
  - premise_gwp=0.9.6=pyhd8ed1ab_0
  - prettytable=3.10.0=pyhd8ed1ab_0
  - prometheus_client=0.20.0=pyhd8ed1ab_0
  - prompt-toolkit=3.0.42=pyha770c72_0
  - psutil=5.9.8=py39ha55989b_0
  - pthread-stubs=0.4=hcd874cb_1001
  - pthreads-win32=2.9.1=hfa6e2cd_3
  - pure_eval=0.2.2=pyhd8ed1ab_0
  - py7zr=0.20.8=pyhd8ed1ab_1
  - pyarrow=15.0.0=py39h0a09291_7_cpu
  - pybcj=1.0.2=py39ha55989b_0
  - pycountry=22.3.5=pyhd8ed1ab_0
  - pycparser=2.21=pyhd8ed1ab_0
  - pycryptodomex=3.19.0=py39ha55989b_1
  - pygments=2.17.2=pyhd8ed1ab_0
  - pypardiso=0.4.4=pyhd8ed1ab_0
  - pyparsing=3.1.1=pyhd8ed1ab_0
  - pyperclip=1.8.2=pyhd8ed1ab_2
  - pyppmd=1.1.0=py39h99910a6_0
  - pyprind=2.11.2=py39hcbf5309_1005
  - pyrsistent=0.14.11=py39hb82d6ee_2
  - pyside2=5.15.8=py39h56cbfbc_2
  - pysocks=1.7.1=pyh0701188_6
  - python=3.9.18=h4de0772_0_cpython
  - python-dateutil=2.8.2=pyhd8ed1ab_0
  - python-fastjsonschema=2.19.1=pyhd8ed1ab_0
  - python-json-logger=2.0.7=pyhd8ed1ab_0
  - python-tzdata=2024.1=pyhd8ed1ab_0
  - python_abi=3.9=4_cp39
  - pytz=2024.1=pyhd8ed1ab_0
  - pywin32=306=py39h99910a6_2
  - pywinpty=2.0.13=py39h99910a6_0
  - pyxlsb=1.0.10=pyhd8ed1ab_0
  - pyyaml=6.0.1=py39ha55989b_1
  - pyzmq=25.1.2=py39h7eaf5a6_0
  - pyzstd=0.15.9=py39h95af829_1
  - qt-main=5.15.8=h9e85ed6_19
  - qt-webengine=5.15.8=h4bf5c4e_4
  - re2=2023.09.01=hd3b24a8_2
  - referencing=0.33.0=pyhd8ed1ab_0
  - requests=2.31.0=pyhd8ed1ab_0
  - rfc3339-validator=0.1.4=pyhd8ed1ab_0
  - rfc3986=2.0.0=pyhd8ed1ab_0
  - rfc3986-validator=0.1.1=pyh9f0ad1d_0
  - rpds-py=0.18.0=py39hf21820d_0
  - s3transfer=0.10.0=pyhd8ed1ab_0
  - salib=1.4.8=pyhd8ed1ab_0
  - schema=0.7.5=pyhd8ed1ab_1
  - scipy=1.12.0=py39hddb5d58_2
  - seaborn=0.13.2=hd8ed1ab_0
  - seaborn-base=0.13.2=pyhd8ed1ab_0
  - send2trash=1.8.2=pyh08f2357_0
  - setuptools=69.1.1=pyhd8ed1ab_0
  - setuptools-scm=8.0.4=pyhd8ed1ab_0
  - six=1.16.0=pyh6c4a22f_0
  - snappy=1.1.10=hfb803bf_0
  - sniffio=1.3.1=pyhd8ed1ab_0
  - soupsieve=2.5=pyhd8ed1ab_1
  - sparse=0.15.1=pyhd8ed1ab_1
  - sqlalchemy=2.0.27=py39ha55989b_0
  - stack_data=0.6.2=pyhd8ed1ab_0
  - stats_arrays=0.6.6=pyhd8ed1ab_0
  - statsmodels=0.14.1=py39hd88c2e4_0
  - tableschema=1.20.2=pyh44b312d_0
  - tabulate=0.9.0=pyhd8ed1ab_1
  - tabulator=1.53.5=pyhd8ed1ab_0
  - tbb=2021.9.0=h91493d7_0
  - terminado=0.18.0=pyh5737063_0
  - texttable=1.7.0=pyhd8ed1ab_0
  - tinycss2=1.2.1=pyhd8ed1ab_0
  - tk=8.6.13=h5226925_1
  - tomli=2.0.1=pyhd8ed1ab_0
  - toolz=0.12.1=pyhd8ed1ab_0
  - tornado=6.4=py39ha55989b_0
  - tqdm=4.66.2=pyhd8ed1ab_0
  - traitlets=5.14.1=pyhd8ed1ab_0
  - types-python-dateutil=2.8.19.20240106=pyhd8ed1ab_0
  - typing-extensions=4.10.0=hd8ed1ab_0
  - typing_extensions=4.10.0=pyha770c72_0
  - typing_utils=0.1.0=pyhd8ed1ab_0
  - tzdata=2024a=h0c530f3_0
  - ucrt=10.0.22621.0=h57928b3_0
  - unfold=2023.10.27=py_0
  - unicodecsv=0.14.1=pyhd8ed1ab_2
  - unicodedata2=15.1.0=py39ha55989b_0
  - unidecode=1.3.8=pyhd8ed1ab_0
  - uri-template=1.3.0=pyhd8ed1ab_0
  - urllib3=1.26.18=pyhd8ed1ab_0
  - vc=14.3=hcf57466_18
  - vc14_runtime=14.38.33130=h82b7239_18
  - voluptuous=0.14.2=pyhd8ed1ab_0
  - vs2015_runtime=14.38.33130=hcb4865c_18
  - wcwidth=0.2.13=pyhd8ed1ab_0
  - webcolors=1.13=pyhd8ed1ab_0
  - webencodings=0.5.1=pyhd8ed1ab_2
  - websocket-client=1.7.0=pyhd8ed1ab_0
  - werkzeug=3.0.1=pyhd8ed1ab_0
  - wheel=0.42.0=pyhd8ed1ab_0
  - whoosh=2.7.4=py39hcbf5309_8
  - win_inet_pton=1.1.0=pyhd8ed1ab_6
  - winpty=0.4.3=4
  - wrapt=1.16.0=py39ha55989b_0
  - wurst=0.3.4=pyhd8ed1ab_0
  - xarray=2024.2.0=pyhd8ed1ab_0
  - xlrd=2.0.1=pyhd8ed1ab_3
  - xlsxwriter=3.1.9=pyhd8ed1ab_0
  - xorg-libxau=1.0.11=hcd874cb_0
  - xorg-libxdmcp=1.1.3=hcd874cb_0
  - xz=5.2.6=h8d14728_0
  - yaml=0.2.5=h8ffe710_2
  - zeromq=4.3.5=h63175ca_0
  - zipfile-deflate64=0.2.0=py39ha55989b_4
  - zipp=3.17.0=pyhd8ed1ab_0
  - zstd=1.5.5=h12be248_0
mrvisscher commented 4 months ago

Hi @l-reitz,

Thank you for your message, sorry to hear you're having problems with the AB. Let's see how we can make the export as complete as possible. I don't understand your last comment yet:

I tried to just use the BW2 excel export, but then the Databases corresponding to the exchanges are not exported, so the export functions seem to be different for BW2 and AB(?)

Could you send me the BW2 excel export you generated so I can compare?

Kind regards,

Marin

l-reitz commented 4 months ago

Hi Marin,

about the BW2 export: the column "Database" of the exchanges is just not there in the excel file. See the files 'lci-test.xlsx' for BW2 export and 'test_AB_export.xlsx' for direct comparison of the export of my test database either from BW2 or AB. lci-test.xlsx test_AB_export.xlsx The import in AB of the file 'test_AB_export' works fine. I think that is more of a BW2 issue, but I haven't asked around there because AB export did work for me - for the most part :)

As stated above: If I add categories to my activities, so I can easily modify them in code the AB import breaks. The AB export after adding categories is test_AB_export_added_category.xlsx Where the linking breaks, this file cannot be imported. If I manually add the category in the activity description as well, then the import works again. test_AB_export_added_category_manually.xlsx

Best, Lavinia


In case more info on the categories could still help, I include here also the notebook code where I modify the test database activity to the new test_with_categories database.

#!/usr/bin/env python
# coding: utf-8

# In[1]:

import bw2data as bd
import bw2analyzer as ba
import bw2io as bi
bd.projects.set_current('test')

# In[9]:

test = bd.Database("test")
test_with_categories = bd.Database("test_with_categories")

# In[10]:

if test_with_categories.name not in bd.databases:
    test_with_categories.register()

# In[11]:

bd.databases

# # Implementation of the test LCI
# Implemented in AB first. Just the blank activity with its production exchange and for testing one biosphere exchange.  
# Exported from AB to 'test_AB_export.xlsx'. This can be imported again. 

# # Export of the test LCI 

# In[12]:

bi.export.excel.write_lci_excel(test.name)

# The BW2 export lci-test.xlsx does not contain the 'Database' column for the exchanges. The import in AB does not work, would be weird if it did - how would it know which database to use for what exchange. 

# # Custom categories in code 

# In[13]:

act = [act for act in test][0]
new_act = act.copy()
new_act['database'] = test_with_categories.name
new_act['categories'] = ['MyCategory']
new_act.save()
new_act

# In[14]:

new_act_2 = act.copy()
new_act_2['database'] = test_with_categories.name
new_act_2.save()
new_act_2

# In[15]:

new_exc = new_act_2.new_exchange(**{
    'name': 'technosphere exchange with category',
    'input': new_act,
    'amount': 1.0, 
    'type': 'technosphere',
    'unit': 'unit',
}
)
new_exc.save()

# This database can be found in the AB project as well, no problems. If exported in AB, no problems as well, the custom categories are exported as well in the exchanges - but not in the top part on the activity definition. Thus they cannot be imported and prooperly linked again: the categories mentioned in the production and technosphere exchange are not matching the activity description.