Automattic / bugomattic

Bugomattic is a tool that guides bug reporters to the right actions within large, complex organizations
GNU General Public License v2.0

not all results showing #150

Closed Nic-Sevic closed 11 months ago

Nic-Sevic commented 12 months ago

Quick summary

When I try to use this to search it only shows me about 20 results even though there are more

possibly related, in the filters it seems like not all the repos are there. For example, what about jpop-issues?

Steps to reproduce

  1. search "tiled gallery photon"
  2. look at results
  3. where is "Tiled Gallery removes photon link when gallery links to media file"? https://github.com/Automattic/jetpack/issues/9202

Browser

No response

Other notes

No response

dpasque commented 12 months ago

Hi @Nic-Sevic! 👋

Thanks so much for opening an issue! 😄

> When I try to use this to search it only shows me about 20 results even though there are more. where is Tiled Gallery removes photon link when gallery links to media file ?

So, I did some digging into this, and the answer is a bit complicated... 😅 TL;DR -- the issue can be found with Bugomattic search, but it's being a bit buried (intentionally) by the design of that search.

Bugomattic issue search is really opinionated in its design, with the goal of searching broadly for the things that are most likely to be duplicate issues. This opinionated design definitely comes with some tradeoffs, though! But our hope is that this positions Bugomattic well among other search tools. It's not as good at finding older issues as MGS, and it's not as good at diving really deep into a single repo as GitHub search, but our goal is to have it beat both of those for broadly searching for current potential duplicates!

That issue you mentioned is in fact in the search data; it was just being beaten out by a lot of other issues! If you add one more term from the title and search for "tiled gallery removes photon", you'll see it at around position 5 or so.

In the Bugomattic search logic, there is a pretty healthy de-emphasis applied to older issues, because they are less likely to be true duplicates of what is being reported. I'm pretty sure that is what is hurting this specific issue's score so much -- it's over 5 years old, so other issues that still have those keywords in them but are newer are edging it out!
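To make the age de-emphasis concrete, here is a minimal sketch of how an older issue with a strong keyword match can still lose to a newer, weaker match. This is not Bugomattic's actual scoring code: the exponential half-life formula, the 365-day half-life, the example scores, and the names `Issue`, `age_adjusted_score`, and `rank` are all assumptions made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass

# Assumed value for illustration only; not Bugomattic's real setting.
HALF_LIFE_DAYS = 365.0


@dataclass
class Issue:
    title: str
    text_score: float  # keyword relevance before any age adjustment
    age_days: float


def age_adjusted_score(issue: Issue, half_life_days: float = HALF_LIFE_DAYS) -> float:
    """Halve the text score for every `half_life_days` of issue age."""
    return issue.text_score * 0.5 ** (issue.age_days / half_life_days)


def rank(issues: list[Issue]) -> list[Issue]:
    """Sort issues from highest to lowest age-adjusted score."""
    return sorted(issues, key=age_adjusted_score, reverse=True)


# Hypothetical data: a five-year-old issue that matches the keywords well,
# and a month-old issue that matches them less well.
old_strong_match = Issue("Tiled Gallery removes photon link", text_score=10.0, age_days=5 * 365)
new_weak_match = Issue("Tiled gallery layout broken", text_score=4.0, age_days=30)

ranked = rank([old_strong_match, new_weak_match])
for issue in ranked:
    print(f"{age_adjusted_score(issue):.3f}  {issue.title}")
```

Under these assumed numbers, the five-year-old issue keeps only 1/32 of its text score, so the newer issue ranks first even though it matches the keywords less well.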

We obviously want to make sure though that the search logic is working well for everyone! Can you tell me more about this issue? Was it in fact a duplicate of the issue you wanted to report?

> it only shows me about 20 results even though there are more

This is actually another intentional part of the design! From our initial interviews, we found that most users preferred to fire off a lot of different searches, and often didn't read past the first 10-15 results. So we designed Bugomattic search to match and encourage that approach.

But we also don't want to leave people out! When you're doing your duplicate searching, do you often find yourself going back several pages into the results?

> possibly related, in the filters it seems like not all the repos are there. For example, what about jpop-issues?

We're actually working on redesigning the filter options right now to make them more expansive and flexible! The issue is #141, and a lot of the discussion is happening on pciE2j-2fC-p2. We'd love your two cents there if you have time! 😄

Nic-Sevic commented 11 months ago

hey @dpasque to address your questions:

> Can you tell me more about this issue? Was it in fact a duplicate of the issue you wanted to report?

It wasn't an exact duplicate but it was the closest match. Also, half the results returned seem to only be matching on tiled gallery (or something similar to that) and don't include photon when I search tiled gallery photon. This is true even when I put tiled gallery in quotes (which I would expect to indicate that it's a single term and shouldn't be split). Maybe it appears later in the issue description but it doesn't in the title or short description?

> This is actually another intentional part of the design! From our initial interviews, we found that most users preferred to fire off a lot of different searches, and often didn't read past the first 10-15 results. So we designed Bugomattic search to match and encourage that approach.
>
> When you're doing your duplicate searching, do you often find yourself going back several pages into the results?

I don't get an option for multiple pages when I search, only the one page, which then forces me to use GitHub search because I know there are more results and I can't access them. I can see how it's useful to only look at the top 15 but not being able to see any more than that is really hindering. I've found bugomattic to be pretty ineffective for bug searching and have basically had to resort to direct github searches

I'll add further comments on the p2 👍

dpasque commented 11 months ago

@Nic-Sevic thank you so much for the clarification! 🙂 I was able to dig in more, and there are actually a few different things coming together in this case! As an FYI, I've broken them all out into their own issues, and am going to close this one in favor of handling those separately.

So, some of the things going on...

> I can see how it's useful to only look at the top 15 but not being able to see any more than that is really hindering.

For sure -- this is helpful to know. We took that initial design approach on the assumption that if it ever started to feel like a nuisance or blocker, we'd roll it back and add pagination. That issue is tracked in #154. 👍

> don't include photon when I search tiled gallery photon

This threw me for a second, but I've figured out what this is! We added a little bit of fuzziness to the search. This is over-simplified, but for most words, you're allowed a one-character typo.

Unfortunately, photon is one character off of photos, and photos seems to be a much more common word in our indices, which is how some of those results were sneaking in. 😕 Based on a lot of our initial research, I think it's overall more helpful to allow that little bit of fuzziness, but I think we should find ways to soften that or to be able to skip that when you have cases like this! I've spun up issue #156 to track that.
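The photon/photos collision can be reproduced with a small sketch of one-edit fuzzy matching. The thread doesn't say how the fuzziness is implemented, so this assumes plain Levenshtein edit distance with a threshold of one; `edit_distance` and `fuzzy_match` are hypothetical helper names, not part of Bugomattic.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions needed to turn `a` into `b`."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def fuzzy_match(query_term: str, indexed_term: str, max_edits: int = 1) -> bool:
    """Treat two terms as matching if they are within `max_edits` edits."""
    return edit_distance(query_term.lower(), indexed_term.lower()) <= max_edits


print(edit_distance("photon", "photos"))  # 1
print(fuzzy_match("photon", "photos"))    # True
print(fuzzy_match("photon", "gallery"))   # False
```

Passing `max_edits=0` in this sketch makes the match exact, which is one way to picture the "skip the fuzziness" option mentioned above.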

> I put tiled gallery in quotes (which I would expect to indicate that it's a single term and shouldn't be split)

Ha, it's supposed to do that for sure! 😅 I actually thought this was handled by default in ES, but something clearly seems to be off. I've written this up as a bug report in #155. I'll investigate that and see if we can get that working! And nice catch! 😄
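For reference, the expected behavior can be sketched as follows: text inside double quotes is kept together as a phrase, and everything else is split into individual terms. This is only an illustration of the expectation, not the fix tracked in #155; `parse_query` and `matches` are hypothetical names, and the substring-based phrase check is a deliberate simplification.

```python
from __future__ import annotations

import re


def parse_query(query: str) -> tuple[list[str], list[str]]:
    """Split a query into quoted phrases and the remaining loose terms."""
    phrases = re.findall(r'"([^"]+)"', query)
    remainder = re.sub(r'"[^"]*"', " ", query)  # drop the quoted parts
    return phrases, remainder.split()


def matches(text: str, query: str) -> bool:
    """Require every phrase to appear contiguously and every loose term
    to appear as a whole word, ignoring case."""
    phrases, terms = parse_query(query.lower())
    lowered = text.lower()
    words = set(re.findall(r"\w+", lowered))
    return all(phrase in lowered for phrase in phrases) and all(
        term in words for term in terms
    )


query = '"tiled gallery" photon'
print(parse_query(query))
print(matches("Tiled Gallery removes photon link", query))
print(matches("Gallery with tiled layout breaks photon", query))
```

With the quoted query, the second title should be rejected because "tiled" and "gallery" are not adjacent, while the unquoted query `tiled gallery photon` would accept it.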

> I've found bugomattic to be pretty ineffective for bug searching and have basically had to resort to direct github searches

Obviously, I'm really sorry to hear this is the case! 😞 And I really appreciate you taking the time to share all this feedback, because I really want to get it to the place where this isn't true!

Is there anything other than the things you've mentioned here that is holding it back for you? And of all of these pieces, is there any single piece that is the biggest contributor to the negative experience?

dpasque commented 11 months ago

(Closing in favor of all the different issues we've broken it off into. Feel free to keep continuing the discussion here though!)

Nic-Sevic commented 11 months ago

@dpasque thank you for separating those out

> Is there anything other than the things you've mentioned here that is holding it back for you? And of all of these pieces, is there any single piece that is the biggest contributor to the negative experience?

There's nothing else I can think of for now but if I think of more I'll be sure to follow up.

As for the biggest piece, I think it's not getting all the results. In general, when I look for an issue I'm prepared to look through a few pages of results, because I know that the way I'm describing the issue likely doesn't match how others have described it, or I don't know what actual feature the issue is tied to. I'm looking for matching symptoms across many products/features. An opinionated search is great for potentially decreasing the results I need to look through, but if it results in me having to do multiple searches, it ends up taking more time IMO.