brightway-lca / brightway2-data

Tools for the management of inventory databases and impact assessment methods. Part of the Brightway LCA framework.
https://docs.brightway.dev/
BSD 3-Clause "New" or "Revised" License

location filter uses lowercase only and ignores locations with dashes #35

Closed aleksandra-kim closed 6 years ago

aleksandra-kim commented 8 years ago

Original report by Pascal Lesage (Bitbucket: MPa).


Two cases were found:

1) Filtering on location using uppercase (as locations are found in the activity) returns nothing (i.e. an empty list). Changing to lowercase works, even though in reality e.g. 'DE' != 'de'.

2) Filtering on locations with dashes (as they are found in the activities) returns nothing (i.e. an empty list). Changing to lowercase in this case doesn't change anything, e.g. 'ca-qc', 'CA-QC', 'caqc', 'CAQC' all return nothing.

aleksandra-kim commented 8 years ago

Original comment by Chris Mutel (Bitbucket: cmutel, GitHub: cmutel).


CA-QC is a separate issue, as - is a special character in the search engine. Fixing this will be complicated, and is a bit too much off the path for Brightway2 development, at least for me. You can always do the following:

#!python

[x for x in Database("foo").search("bar", filter={'location': "ca"}) if x['location'] == 'CA-QC']

I will make a post to the Whoosh mailing list about the first case; note that the default analyzer is case-insensitive, but this is maybe applied inconsistently?

aleksandra-kim commented 8 years ago

Original comment by Chris Mutel (Bitbucket: cmutel, GitHub: cmutel).


Post to Whoosh list

aleksandra-kim commented 8 years ago

Original comment by Chris Mutel (Bitbucket: cmutel, GitHub: cmutel).


Reply from Matt Chaput (whoosh author):

When you use the query parser, it runs the analyzer on the text and builds the query objects from the output.
If you construct query objects directly, you are giving them the *exact* text to search for.

Our search filter uses a query object:

And([Term(k, v) for k, v in filter.items()])

However, it is not yet clear to me how to work with the Whoosh machinery to make this case-insensitive.
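One possible way around the exact-match behaviour of `Term`, sketched here under the assumption that the index stores its terms in lowercase, is to lowercase the filter values before the query objects are built. The helper name `normalize_filter` is hypothetical and not part of bw2data or Whoosh.

```python
def normalize_filter(filter_dict):
    """Return a copy of filter_dict with every string value lowercased.

    Assumption: the search index holds lowercased terms, so an exact
    Term lookup can only match lowercase input.
    """
    return {
        field: value.lower() if isinstance(value, str) else value
        for field, value in filter_dict.items()
    }


# Hypothetical use with the query object quoted above:
#   And([Term(k, v) for k, v in normalize_filter(filter).items()])
normalized = normalize_filter({"location": "DE", "name": "Steel"})
```

This only addresses the case mismatch; a value containing a dash would still be passed through as a single exact term.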

aleksandra-kim commented 6 years ago

Original comment by Bernhard Steubing (Bitbucket: bsteubing, GitHub: bsteubing).


Also, the "mask" option only works with lowercase. Knowing this limitation, couldn't you simply convert all user input for location, mask, etc. to lowercase? Just an idea.

aleksandra-kim commented 6 years ago

Original comment by Chris Mutel (Bitbucket: cmutel, GitHub: cmutel).


All search data stored as lowercase. Fixes #35. To get this behaviour in an existing database, use 'db.make_searchable(reset=True)'.

aleksandra-kim commented 6 years ago

Original comment by Chris Mutel (Bitbucket: cmutel, GitHub: cmutel).


Note that the fix doesn't help terms that have a dash; these have to be filtered outside the Whoosh search index.
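The list-comprehension workaround from earlier in the thread could be wrapped in a small helper, roughly as below. This is a sketch, not bw2data code: `search_exact_location` is a hypothetical name, `search` stands for any callable shaped like `Database("foo").search`, and it assumes (as the workaround implies) that the piece before the first dash, lowercased, is what the index can match. `fake_search` and `ACTIVITIES` exist only to exercise the helper.

```python
def search_exact_location(search, query, location):
    """Search on the first dash-separated piece, then compare exactly."""
    # Assumption: the index can match only the piece before the dash,
    # lowercased, so the full location is checked in plain Python.
    token = location.split("-")[0].lower()
    hits = search(query, filter={"location": token})
    return [hit for hit in hits if hit["location"] == location]


# Made-up data standing in for activities in a database.
ACTIVITIES = [
    {"name": "bar, Quebec", "location": "CA-QC"},
    {"name": "bar, Ontario", "location": "CA-ON"},
    {"name": "bar, Canada", "location": "CA"},
]


def fake_search(query, filter=None):
    """Stand-in for Database.search: matches one lowercase location piece."""
    wanted = (filter or {}).get("location")
    return [
        act for act in ACTIVITIES
        if query in act["name"]
        and (wanted is None or wanted in act["location"].lower().split("-"))
    ]


matches = search_exact_location(fake_search, "bar", "CA-QC")
```

With a real database, the first argument would be something like `Database("foo").search`, and the exact comparison stays case-sensitive because it runs on the activity data rather than on the index.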