mediachain / mediachain-indexer

search, dedupe, and media ingestion for mediachain

Quality ranking #11

Closed: parkan closed this issue 7 years ago

parkan commented 8 years ago

Catch-all issue for thoughts on quality and relevance across different datasets:

- Incorporating extrinsic, source-specific signals (likes, comments, position in the original results for a query, etc.): how do we normalize these?
- Intrinsic signals (?)
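
One possible way to make source-specific signals comparable, sketched here only as an illustration, is to replace each raw value with its percentile rank within its own source, so that 1000 likes on one site and 3 likes on another can map to the same score if both are at the top of their source's distribution. The record layout (`source`, `likes`) and the function name `percentile_normalize` are assumptions made for this example, not part of the Indexer.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def percentile_normalize(records, signal):
    """Map a raw signal to (0, 1) by its percentile rank within its own source.

    `records` is a list of dicts, each with a "source" key and a numeric
    `signal` key (hypothetical layout, chosen for this sketch).
    Returns one normalized score per record, in input order.
    """
    # Collect and sort the raw values separately for every source.
    by_source = defaultdict(list)
    for record in records:
        by_source[record["source"]].append(record[signal])
    for values in by_source.values():
        values.sort()

    scores = []
    for record in records:
        values = by_source[record["source"]]
        lo = bisect_left(values, record[signal])
        hi = bisect_right(values, record[signal])
        # Midpoint rank, so tied values within a source share one score.
        scores.append(((lo + hi) / 2.0) / len(values))
    return scores


records = [
    {"source": "site_a", "likes": 10},
    {"source": "site_a", "likes": 100},
    {"source": "site_a", "likes": 1000},
    {"source": "site_b", "likes": 1},
    {"source": "site_b", "likes": 2},
    {"source": "site_b", "likes": 3},
]
scores = percentile_normalize(records, "likes")
```

Rank-based normalization ignores the scale of each source entirely, which is useful when like counts differ by orders of magnitude, but it also discards how far apart two items are within a source. A z-score or log transform would be an alternative if that distance matters.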

autoencoder commented 8 years ago

Normalizing: the quickest, easiest way to start is to reuse the basic hyper-parameter optimizer we already have set up in the Indexer for dedupe tasks. If we can find or create a query relevance ratings dataset (e.g. #1?), we can hook it up to the hyper-parameter optimizer for search too.
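
As a rough sketch of what hooking a ratings dataset up to an optimizer could look like: treat the per-signal weights as hyper-parameters, rank each query's candidates by the weighted score, and keep the weights that give the best mean NDCG against the human relevance grades. This uses plain random search as a stand-in and does not reflect the Indexer's actual optimizer. The dataset layout, the signal names (`text_match`, `likes`), and all function names are hypothetical.

```python
import math
import random


def dcg(gains):
    """Discounted cumulative gain of relevance grades in ranked order."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg(ranked_gains):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(ranked_gains, reverse=True))
    return dcg(ranked_gains) / ideal if ideal > 0 else 0.0


def score(features, weights):
    """Weighted sum of already-normalized signals for one candidate."""
    return sum(weights[name] * value for name, value in features.items())


def mean_ndcg(ratings, weights):
    """Average NDCG over all rated queries.

    `ratings` maps query -> list of (features_dict, relevance_grade),
    a hypothetical layout for a relevance ratings dataset.
    """
    total = 0.0
    for candidates in ratings.values():
        ranked = sorted(candidates, key=lambda c: score(c[0], weights), reverse=True)
        total += ndcg([grade for _, grade in ranked])
    return total / len(ratings)


def random_search(ratings, names, trials=200, seed=0):
    """Sample weights uniformly in [0, 1) and keep the best-scoring set."""
    rng = random.Random(seed)
    best_weights, best_value = None, -1.0
    for _ in range(trials):
        weights = {name: rng.random() for name in names}
        value = mean_ndcg(ratings, weights)
        if value > best_value:
            best_weights, best_value = weights, value
    return best_weights, best_value


# Toy ratings: relevance follows text_match, not likes.
ratings = {
    "sunset": [
        ({"text_match": 0.9, "likes": 0.1}, 3),
        ({"text_match": 0.5, "likes": 0.5}, 1),
        ({"text_match": 0.1, "likes": 0.9}, 0),
    ],
}
best_weights, best_value = random_search(ratings, ["text_match", "likes"])
```

On this toy data any weighting that favors `text_match` over `likes` ranks the candidates perfectly. A real dataset would need enough rated queries, and a held-out split, to avoid fitting the weights to noise.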

parkan commented 8 years ago

Sounds like there's some fairly discrete dataset work here. Feel free to plan that out and assign as you see fit. I am down to help.

parkan commented 7 years ago

I think this is at least partially covered by the quality models + Unsplash classifier, closing.