fjuniorr / flowmapper-ci

Bot for running flowmapper

Add config for Agribalyse matching #5

Closed by cmutel 11 months ago

fjuniorr commented 11 months ago

@cmutel, we can simply reuse the same field mapping config and point it at data/agribalyse-3.1.1-biosphere.json:

python main.py map data/agribalyse-3.1.1-biosphere.json \
                   data/ElementaryExchanges-3.7.json \
                   config/simapro-flows-ElementaryExchanges-3.7.toml

which yields (for https://github.com/fjuniorr/flowmapper/commit/3393ac21889c7fc7131a9c3b6188d0ab778e4dae):

100%|██████████████████████████████████████████████████████████████████| 5667/5667 [01:22<00:00, 68.30it/s]
5667 unique source flows...
4329 unique target flows...
3295 mappings of 3262 unique source flows (57.56% of total).

fjuniorr commented 11 months ago

I've cherry-picked the more complete .gitconfig; otherwise I think we are good to close this. Sounds good, @cmutel?

cmutel commented 11 months ago

The difference between the configs is small - just cas = ["CAS", "@casNumber"] versus cas = ["", "@casNumber"].

I know we can use the existing config and it will work (it looks for the CAS column, which doesn't exist, so it never returns a match), but it feels a bit weird to me to purposefully choose a config that looks for a column we know isn't there. But ok, it works for now. At some point I assume we will want to add different config files for common use cases and we can revisit this.

fjuniorr commented 11 months ago

I agree that this can cause confusion. I was very focused on the fact that

  {
    "name": "1-Propanol, i-3,3,3-trifluoro-2,2-bis(trifluoromethyl)-, i-HFE-7100",
    "categories": [
      "Air",
      "(unspecified)"
    ],
    "unit": "kg",
    "CAS": ""
  }

or

  {
    "name": "1-Propanol, i-3,3,3-trifluoro-2,2-bis(trifluoromethyl)-, i-HFE-7100",
    "categories": [
      "Air",
      "(unspecified)"
    ],
    "unit": "kg"
  }

in a given flowlist should be handled in the same way.
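
To make the point concrete, here is a minimal sketch of treating both cases identically. The helper `normalized_cas` is hypothetical and only illustrates the idea; it is not flowmapper's actual lookup code, and the key name "CAS" is taken from the examples above.

```python
from typing import Optional


def normalized_cas(flow: dict, key: str = "CAS") -> Optional[str]:
    """Return the stripped CAS value, or None when the key is absent or blank.

    Hypothetical helper for illustration; not part of flowmapper.
    """
    value = flow.get(key)  # None when the key is missing
    if value is None:
        return None
    value = str(value).strip()
    return value or None  # an empty string is treated like a missing key


# The two flows from the comment above: one with an empty CAS, one without the key.
with_empty_cas = {
    "name": "1-Propanol, i-3,3,3-trifluoro-2,2-bis(trifluoromethyl)-, i-HFE-7100",
    "categories": ["Air", "(unspecified)"],
    "unit": "kg",
    "CAS": "",
}
without_cas = {
    "name": "1-Propanol, i-3,3,3-trifluoro-2,2-bis(trifluoromethyl)-, i-HFE-7100",
    "categories": ["Air", "(unspecified)"],
    "unit": "kg",
}

# Both flows end up with the same "no CAS available" value.
print(normalized_cas(with_empty_cas) == normalized_cas(without_cas))
```

With something like this, a config that names a CAS field would behave the same whether the flowlist omits the column or leaves it blank.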

> But ok, it works for now. At some point I assume we will want to add different config files for common use cases and we can revisit this.

👍