MaayanLab / FAIRshake

https://fairshake.cloud

Improve fuzzy url matching throughout application #92

Open u8sand opened 6 years ago

u8sand commented 6 years ago

It might make sense to create a dedicated module, with tests, for this. Some ideal features of good URL matching:

u8sand commented 6 years ago

More generally, searches like this one (and others) could be made more efficient if we migrate our database to PostgreSQL and take advantage of its full-text search features, such as SearchRank (https://docs.djangoproject.com/en/2.1/ref/contrib/postgres/search/#searchrank).

Another thing that could help the search along is to explicitly split URLs into their components:

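# Import the submodule explicitly; a bare `import urllib` does not guarantee urllib.parse is loaded
from urllib.parse import urlparse

# Example URL; in the application this would be the URL being matched
url = 'http://me:mine@google.com/hello/world?q=hi&b=bye#c=f'
parsed_url = urlparse(url)
# ParseResult(scheme='http', netloc='me:mine@google.com', path='/hello/world', params='', query='q=hi&b=bye', fragment='c=f')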
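
Once the URL is split up like this, matching could compare components rather than raw strings. Here is a minimal sketch of that idea using only the standard library. The helper names `url_components` and `urls_roughly_match` are hypothetical, and the choice to ignore scheme, credentials, fragment, trailing slashes, and query parameter order is an assumption about what "fuzzy" should mean here, not something settled in this issue.

```python
from urllib.parse import urlparse, parse_qsl


def url_components(url):
    """Split a URL into normalized parts for comparison (hypothetical helper)."""
    parsed = urlparse(url)
    return {
        'scheme': parsed.scheme.lower(),
        # .hostname is lowercased and excludes credentials and port
        'host': parsed.hostname or '',
        # Dropping empty segments ignores leading, trailing, and doubled slashes
        'path': [segment for segment in parsed.path.split('/') if segment],
        # Sorting the pairs makes the comparison independent of parameter order
        'query': sorted(parse_qsl(parsed.query)),
        'fragment': parsed.fragment,
    }


def urls_roughly_match(a, b):
    """Assumed rule: two URLs match when host, path, and query all agree."""
    ca, cb = url_components(a), url_components(b)
    return all(ca[key] == cb[key] for key in ('host', 'path', 'query'))
```

With this rule, `http://me:mine@google.com/hello/world?q=hi&b=bye#c=f` and `https://GOOGLE.com/hello/world/?b=bye&q=hi` would be treated as the same resource. One limitation to keep in mind: `urlparse` only fills in the host when the URL has a scheme or starts with `//`, so bare inputs like `google.com/hello` would need to be handled before calling these helpers.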