chris-mosley / AmazonBrandFilter

Filters Amazon search results to only be "known" brands.
MIT License

Expand this concept into an "Amazon power search"? #8

Closed by allquixotic 8 months ago

allquixotic commented 9 months ago

AmazonBrandFilter is a very useful tool! But it is just the start of what I think could be an even more useful tool for Amazon shoppers.

This is a long, sort of "meta-issue" that I hope will start a discussion about how to possibly expand upon this tool, and weigh some design decisions at the top level before this project becomes too big. I think it's likely to gain a lot of traction soon, and it will be good if the community of users and developers around this project have discussed where we are, and where we're going, before this hits Slashdot or Ars Technica.

Thesis

I observe that:

As a result, shopping for seemingly simple household items can become a real chore. Finding a product that meets your needs, can physically (or digitally, in the case of software) perform exactly as needed, for as long as needed, at a price you can afford, that can be delivered in the time you need it, is an extremely hard problem to solve.

The fact is, Amazon has come closer than anyone in human history to solving that problem for most people in the industrialized world.

But, just like the Internet in general, we're suffering from information overload. There's just too much stuff, and it takes too much of our time to sift through it all to find the "signal" -- the brands that stand behind their products, manufacture in countries we want to support with our business, have the right interests in sustainability and fair labor, pay attention to quality and reliability, and don't gouge the consumer with excessive prices.

Considerations in Developing Solutions

When thinking about how to solve these hard problems, we in the software engineering community can envision many technical approaches. But a few common themes that come to mind include:

The 20th century offers us little in the way of solutions. Organizations such as the Better Business Bureau and Consumer Reports might purport to be helpful in this area, but they are seriously flawed and mismatched with the current marketplace:

One Possible Solution

Given all this background, here is one possible approach to a solution that makes some compromises while trying to provide maximum "return on investment" for the community's time and effort:

Filter Maintainer Trust

The question we should be asking is, what is the means by which we can maximize confidence that someone is submitting data in good faith (without the intent to trick us into whitelisting junk brands), while minimizing the barrier to entry? These two factors seem to be directly related; meaning, an increase in confidence comes with an increase in the barrier to entry, while reducing the barrier to entry will often (but perhaps not always) chop down our ability to detect and combat bad actors.

As a first attempt, perhaps we should require submitters to provide evidence of the reputability of a brand? This doesn't require them to have purchased anything; they just have to link to something like a website, Wikipedia page, etc. that helps to solidify the brand's identity.

Putting the burden of proof on submitters will save time for the "core team" (the PR reviewers / those with committer access), so they don't have to do the research on their own.
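To make the evidence requirement concrete, here is a minimal sketch of a pre-review check that could run on a submission before a human looks at it. Everything in it is an assumption for illustration: the `BrandSubmission` shape, the `review_ready` helper, and the rule that at least one well-formed http(s) link is required. It only checks form; whether the linked page actually establishes the brand's identity is still a reviewer's call.

```python
from dataclasses import dataclass, field
from typing import List, Tuple
from urllib.parse import urlparse


@dataclass
class BrandSubmission:
    """A proposed whitelist entry plus its evidence links (hypothetical shape)."""
    brand: str
    evidence_urls: List[str] = field(default_factory=list)


def is_well_formed_url(url: str) -> bool:
    """Accept only absolute http(s) URLs that include a host name."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def review_ready(submission: BrandSubmission) -> Tuple[bool, str]:
    """Return (ok, reason). Checks form only; a human still judges the evidence."""
    if not submission.brand.strip():
        return False, "brand name is empty"
    valid = [u for u in submission.evidence_urls if is_well_formed_url(u)]
    if not valid:
        return False, "no well-formed evidence link"
    return True, f"{len(valid)} evidence link(s) to review"
```

A check like this could run automatically on each PR, so reviewers only spend time on submissions that arrive with something to look at.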

Data Quality / Data Maintenance

We also have the concern of data quality independent of trust. Someone can, in perfectly good faith, submit a brand that turns out to be junk. Maybe they're excited because they just received a cool product, and immediately PR the company's name, only to find out that their product broke on day 31 of ownership, right after the return period, and the company is nowhere to be found when trying to RMA it... oops.

Over time, as the list expands, it will become increasingly difficult to "prune" the list of companies that either (a) used to be fine, but have become junk brands (or otherwise no longer fit our criteria for whitelisting); or (b) no longer exist -- bankruptcy, renaming, re-branding, mergers and acquisitions, etc.

In the case of both establishing submitter trust and continual data maintenance, assuming we're keeping this fully "human in the loop" (no AI/ML), we will need a growing core team to review both new submissions and the existing dataset.

Oh, and, if we decide to expand this tool beyond just keeping a whitelist of brands, that vastly increases the volume of data we have to curate, which is a huge ask for even a moderately sized team of dedicated humans. Yikes.

To help with this burden, we could perhaps employ some technology. It's easy enough to determine if a brand is still relevant: if you search Amazon for that brand and don't get any (good) results, or all their products are indefinitely out of stock, that's a sign that you can count them out. Then we can have manual reviewers look at the shortlist of companies that came up "empty" and figure out which ones to nuke from the list to shorten it.
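The pruning idea above can be sketched roughly as follows. This is a sketch under stated assumptions, not a working Amazon client: the `Listing` shape is made up, and `search` is a caller-supplied function standing in for however search results would actually be obtained. The output is only a shortlist for manual review, matching the "human in the loop" approach described here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Listing:
    """One search result for a brand (hypothetical shape)."""
    title: str
    in_stock: bool


def stale_brand_shortlist(
    brands: Iterable[str],
    search: Callable[[str], List[Listing]],
) -> List[str]:
    """Return brands whose search came back empty or entirely out of stock.

    `search` is injected by the caller; this sketch never contacts Amazon.
    The result is a shortlist for human reviewers, not a list of removals.
    """
    shortlist = []
    for brand in brands:
        listings = search(brand)
        # any() is False for an empty list, so "no results" and
        # "everything out of stock" are both caught by the same test.
        if not any(item.in_stock for item in listings):
            shortlist.append(brand)
    return shortlist
```

Passing `search` in as a parameter keeps the decision logic testable with canned data and leaves the question of how to fetch results open.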

Multi-faceted Data Warehouse

This one will require a lot of coding, data collection and maintenance, but I think the minimum important data to collect (to allow users to filter on using the browser extension) are:

I'm sure folks might also be interested in things like company ESG initiatives (Environmental, Social and Governance), sustainable packaging, etc. but these are much harder to determine/collect outside of a few big brands that make a big deal out of this, like Apple and Patagonia.

Once we collect this data, we can implement UI elements to let folks narrow down their searches to exactly what they're looking for. But we might not be able to justify expanding the data like this until we have a much larger, more vibrant product/community, where we'd have the development, review and data submission resources to make it happen.

Conclusion / My Help

For now, I'm mainly going to contribute by occasionally sending in PRs to the filter list, until it's clearer what the overall direction of the project will be.

But, if this gains the attention that I think it legitimately deserves, it might go a lot further than that. I've got decent JS client-side development experience, and can probably help with designing the data collection process for a multi-faceted data warehouse that isn't just a text file of brand names. (Not that this is a bad start; it isn't! This is more than we had before, which was nothing!)

Anyway, thanks for working on this project.

chris-mosley commented 8 months ago

First off, it's making me SO happy that people are as excited about this as they are. I honestly expected that this would be a tiny, obscure project that I worked on for a while and no one else used, before I personally got bored maintaining the list and abandoned it.

I'm going to attempt to address most of your points but the short version is that generally I'm hoping to keep this addon relatively focused for the time being.

At the moment this list is not meant to be a stamp of quality. It is simply meant to filter out a brand that might not be there tomorrow. If a brand makes crap products but sticks around, then it is on them to sink or swim on whatever reputation they develop. I think something like Fakespot is a better arbiter of who is "good" and who is "bad." I'm here to tell you who is "real."

AI/ML is outside my wheelhouse right now and, I think, quite out of scope in the short term. I would be interested in trying something, but that will be well after v1.0.

Much of the rest of these are things that would be addressed by the features I want to implement but currently only have in mind. I will probably publish a roadmap sometime soon.

Some of the things I have in mind:

But this is all pretty far in the future. Right now I have an addon that can't find brands that have spaces in their names. Once I get to v1.0 we can start thinking about the moonshots.
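For what it's worth, one possible way to handle brands with spaces in their names is to try the longest run of leading words first and shrink until something matches. The thread does not say how the add-on currently matches brands, so this is not its actual code: `find_brand`, the assumption that the brand leads the listing title, and the lowercase brand set are all hypothetical.

```python
from typing import Optional, Set


def find_brand(title: str, known_brands: Set[str], max_words: int = 4) -> Optional[str]:
    """Return the longest known brand that starts the title, or None.

    Assumes the brand leads the listing title and that `known_brands`
    holds lowercase names; both are illustrative assumptions.
    """
    words = title.lower().split()
    # Try the longest prefix first so "the north face" wins over "the".
    for n in range(min(max_words, len(words)), 0, -1):
        candidate = " ".join(words[:n])
        if candidate in known_brands:
            return candidate
    return None
```

Checking longer prefixes before shorter ones means a multi-word brand is not shadowed by a single-word entry that happens to share its first word.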