VIDA-NYU / domain_discovery_tool

This repository contains the Domain Discovery Tool (DDT) project. DDT is an interactive system that helps users explore and better understand a domain (or topic) as it is represented on the Web.
http://domain-discovery-tool.readthedocs.io/en/latest/index.html
GNU General Public License v3.0

refreshing of snippets during search query is confusing and annoying #42

Closed julianafreire closed 7 years ago

julianafreire commented 7 years ago

It is confusing (and annoying) when the query results are being retrieved and their order keeps changing on the screen.

This is a problem especially when we start to label pages, and their snippets keep moving.

Can we "fix" the order at least for the first page? (E.g., once we have the first 12 URLs, show them on the first search result page and leave them there.)

yamsgithub commented 7 years ago

Hi Juliana:

I think I figured out the problem and it is mostly solved. The problem was that we were returning the results from Elasticsearch (ES) as a dictionary with the id of the page as the key. Dictionaries, of course, do not maintain the order of the pages, which is why we saw so many out-of-sync snippet updates. So ideally the client should order them by some criterion, such as timestamp or rank.
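To illustrate the point about ordering, here is a minimal Python sketch that sorts an id-keyed result mapping by an explicit field rather than relying on whatever order the mapping happens to iterate in. The function name `pages_in_display_order`, the sample ids and URLs, and the use of `timestamp` as the sort field are assumptions made for illustration; this is not the project's actual code.

```python
from typing import Any, Dict, List

Page = Dict[str, Any]


def pages_in_display_order(pages_by_id: Dict[str, Page], key: str = "timestamp") -> List[Page]:
    """Return pages sorted by an explicit field instead of mapping iteration order."""
    # Copy each page and carry its id along, since the mapping key is dropped.
    pages = [dict(page, id=page_id) for page_id, page in pages_by_id.items()]
    # Break ties by id so the order stays the same across refreshes.
    return sorted(pages, key=lambda p: (p[key], p["id"]))


# Hypothetical results keyed by page id, deliberately not in timestamp order.
results = {
    "c3": {"url": "http://example.com/c", "timestamp": 30},
    "a1": {"url": "http://example.com/a", "timestamp": 10},
    "b2": {"url": "http://example.com/b", "timestamp": 20},
}
ordered_ids = [p["id"] for p in pages_in_display_order(results)]
```

Sorting on a tuple of the chosen field and the id keeps the display stable even when two pages share the same value for that field.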

Added to this is the fact that ES pagination sorts the index across shards and sends the results corresponding to the requested page number. Unfortunately, the timestamp that we store seems to be out of sync with the criteria that ES uses to sort the pages. So I have now added an "order" field to the pages sent to the client. I am using this field to order the pages on the client, and it seems to be working.
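The "order" field idea can be sketched roughly as follows. This assumes the hits arrive as a list in the order ES returned them, each with the usual `_id` and `_source` keys, and that page numbers are zero-based. The helper names `attach_order` and `sort_for_display` are hypothetical and are not taken from the actual commit.

```python
from typing import Any, Dict, List


def attach_order(hits: List[Dict[str, Any]], page_number: int, page_size: int) -> Dict[str, Dict[str, Any]]:
    """Key hits by id, recording each hit's overall position in an "order" field."""
    offset = page_number * page_size  # assumes zero-based page numbers
    pages: Dict[str, Dict[str, Any]] = {}
    for position, hit in enumerate(hits):
        page = dict(hit["_source"])
        # The position in the ES response is the order the client should show.
        page["order"] = offset + position
        pages[hit["_id"]] = page
    return pages


def sort_for_display(pages_by_id: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Client-side step: rebuild the list using only the "order" field."""
    pages = [dict(page, id=page_id) for page_id, page in pages_by_id.items()]
    return sorted(pages, key=lambda p: p["order"])


# Hypothetical second page (page_number=1) with 12 results per page.
hits = [
    {"_id": "p7", "_source": {"url": "http://example.com/7"}},
    {"_id": "p2", "_source": {"url": "http://example.com/2"}},
]
payload = attach_order(hits, page_number=1, page_size=12)
display_ids = [p["id"] for p in sort_for_display(payload)]
```

Because the position is stored in the page itself, the client can rebuild the same order no matter how the id-keyed payload is serialized or iterated.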

I have checked in the fixes and am in the process of pushing the new Docker image. It would be great if you could test it.

Thanks,
Yamuna

(Sent by email in reply to Juliana Freire's original comment of 13 June 2017, 20:41, quoted in full at the top of this thread.)


yamsgithub commented 7 years ago

This was fixed by a commit to the DD API: ViDA-NYU/domain_discovery_API@be1e399