atopal commented 5 months ago

It should be possible to have an overview of features by filtering for them with something like "avif, popover, ..."

foolip commented 5 months ago

I think all that's missing is searching by identifier. There's already OR to string them together.

jcscottiii commented 5 months ago

The name filter (with or without the name: prefix) does a partial string search on both the name and the id. (We call the feature identifier featurekey at the database level of our system due to the database tables already having columns called ID)

But we don't make it explicitly clear that the name filter can also search by ID. We can modify the description of that filter.

@foolip is correct that the OR operator can help with this.

avif OR popover

foolip commented 5 months ago

The substring matching means that the matched features could change, for example "avif" could start matching "avif2" if there's ever a future format named as such.

A way to enumerate a list of exact identifiers would solve this, but there's still a problem. How do you maintain such a query and add/remove features? Any links previously shared will always point to that snapshot, after all.

Some kind of named saved queries? Or should we have tags and the ability to filter by them?

foolip commented 5 months ago

https://webstatus.dev/?q=cap is a concrete case of the substring problem. I only want to match https://webstatus.dev/features/cap but 3 other features are included too.

foolip commented 5 months ago

Today I tried to create a view of 100+ "rendering" features and dealing with the problem created by substring matches proved quite hard. Lots of things are substring matches for other things, and I had to edit the query quite a bit and count the number of matches to build confidence in it.

A new id search atom that is always matched exactly would help here I think. Ideally the name atom then matches only on the name, not the identifier.

Supporting something like tag:my-favorite-features would be a nice addition, but we'd need to maintain that data somewhere, and the question of ownership of a tag would arise. It seems best to first ensure that an exact set of features can be pulled out using very long query strings, and thinking about ways to shorten it later.

atopal commented 5 months ago

Ah, the OR operator is case sensitive, that tripped me up.

Let's make this issue about the problem that Philip mentioned instead. Name doing a substring match is great, but yes, let's add another option that gives exact matches.

Re saved searches: That would be possible for logged in users and then you could namespace it "kadirtopal/top-issues" or similar.

jcscottiii commented 1 month ago

Steps:

Modify the FeatureSearch.g4 to add a new term id.
Run make gen -B to regenerate all the generated code
Add a new Identifier, called IdentifierID.
- Background: When we build our search tree from the query, each node has an Identifier
Open the generated file lib/gen/featuresearch/parser/antlr/featuresearch_visitor.go to find the signature of your new method for your new term. Should be something like VisitID_term
Copy this method and attach it to our implementation of the Visitor in features_search_visitor.go. We will come back to writing code for it in step 7.
In the same file, create a reusable helper called createIDNode. It should simply call the existing createSimpleNode. Look at other createXxx methods.
Back to the VisitID_term method, call your new helper now.
- Retrospect: Since this particular Visit method is so short, you probably could just call createSimpleNode here.
Now it's time for some tests. Both tests that 1) assert that the query can be parsed (parse tests) and build a search tree (database tests) , and 2) that the database can use the search tree and find the data. Luckily, our database tests already have features inserted. So we can pull IDs from there. Then we can use those IDs in the parse tests to build the expected tree. Then we can use that exact tree in the new database tests. Examine the database tests below. In the Spanner SQL database, we actually call ID, FeatureKey because ID is reserved for the primary key for the row (which is an ID). Remember the FeatureKeys you have available: https://github.com/GoogleChrome/webstatus.dev/blob/684a2857d5acd7c15911510c991d5281414f80b5/lib/gcpspanner/feature_search_test.go#L45-L66
Go to the parse test in features_search_parse_test.go. Copy an existing simple test case. Like this one: https://github.com/GoogleChrome/webstatus.dev/blob/684a2857d5acd7c15911510c991d5281414f80b5/lib/gcpspanner/searchtypes/features_search_parse_test.go#L672-L688. Just change the InputQuery to id:FeatureKeyYouSelected. run the tests by scrolling to the top of the method in vscode and select Run Test. The test should fail. But show you the difference. Fix the difference then re-run the tests to succeed. You will use this fixed tree later on in step X
Now, we need to have the database use the query feature_search_query.go.
- Create a new filter method attached to FeatureSearchFilterBuilder called idFilter. Take a look at snapshotFilter for inspiration: https://github.com/GoogleChrome/webstatus.dev/blob/684a2857d5acd7c15911510c991d5281414f80b5/lib/gcpspanner/feature_search_query.go#L311-L331
- Normalize the input to lower case with something like snapshotFilter
- Create the paramName that will be used in the query and point to the normalized value.
- Use opStr := searchOperatorToSpannerBinaryOperator(op) to get the operator (since we only do = or !=)
- build a query string. We can go more into this later on but you can assume that you have access to WebFeatures table via an alias called wf https://github.com/GoogleChrome/webstatus.dev/blob/684a2857d5acd7c15911510c991d5281414f80b5/infra/storage/spanner/migrations/000001.sql#L16-L23. Something along the lines: `fmt.Sprintf("( wf.FeatureKey_Lowercase %s @%s)", opStr, paramName). Use the computed Lowercase.
Now we need to add a database test. Follow this example from the snapshot database test. (you do not need to insert features. And you are already using a search tree using IDs/FeatureKeys from the inserted features.

For reference, take a look at how we added support for the snapshot term in this PR: https://github.com/GoogleChrome/webstatus.dev/pull/601

jcscottiii commented 4 weeks ago

Two more things:

Thing 1

we should add a playwright test that always checks that we can support 50 IDs in one query.

The test should input the query and assert that the dashboard returns 50 features

Thing 2

We need to add the new term to the frontend's vocabulary:

https://github.com/GoogleChrome/webstatus.dev/blob/e2ee4b34698c9882813f59f0f040d0ba727dcd02/frontend/src/static/js/components/webstatus-overview-filters.ts#L61-L134

jcscottiii commented 3 weeks ago

The work for this has been split up into #786 #787 #788

GoogleChrome / webstatus.dev

Create new `id` term to do exact matching #297

Thing 1

Thing 2